Closed johnnyporkchops closed 9 months ago
This main sitemap launch issue still has a few outstanding checklist items, so I am leaving it open so we don't miss them. I will be the first to ceremoniously close this issue once they are complete.
Closing now that the outstanding checklist item for Best-Bets has its own issue, with a Google Sheet for upload to search.gov (as CSV). See: https://github.com/fecgov/fec-cms/issues/5912
Summary
Launch new global sitemap configuration:
Changes to policy-guidance sitemaps:
Related: #5522
This was tested extensively on dev space. See: https://github.com/fecgov/fec-cms/pull/5735
Sitemap:
/sitemap-wagtail.xml does not cover a few pages such as /guides or /browse-data, etc. We will put these few pages in Best-Bets and/or ask search.gov to manually add them to our index. (This does not include data or legal pages handled by the API/Elasticsearch.)
sitemap_pdf.xml has a new path for all documents: /resources/cms-content/documents/policy-guidance/...
sitemap_html.xml has a new path in Wagtail for webpages: /updates/guidance-search/...
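As a rough way to sanity-check the new paths listed above, the sketch below parses a sitemap and reports any entry whose host or path prefix is not the expected one. It is only an illustration: the function name `misplaced_locs` and the sample entries (`example.pdf`, `old-example.pdf`) are made up, and the real sitemaps may be structured differently.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def misplaced_locs(sitemap_xml, expected_host, expected_prefix):
    """Return every <loc> URL whose host or path prefix is not the expected one."""
    root = ET.fromstring(sitemap_xml)
    bad = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        parts = urlparse(url)
        if parts.netloc != expected_host or not parts.path.startswith(expected_prefix):
            bad.append(url)
    return bad


# Hypothetical sample data, not taken from the real sitemap_pdf.xml.
SAMPLE_PDF_SITEMAP = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://www.fec.gov/resources/cms-content/documents/"
    "policy-guidance/example.pdf</loc></url>"
    "<url><loc>https://www.fec.gov/resources/cms-content/documents/"
    "old-example.pdf</loc></url>"
    "</urlset>"
)

PDF_PREFIX = "/resources/cms-content/documents/policy-guidance/"
stale = misplaced_locs(SAMPLE_PDF_SITEMAP, "www.fec.gov", PDF_PREFIX)
print(stale)  # lists the entries that still use a path outside the new prefix
```

The same helper could be pointed at sitemap_html.xml with the /updates/guidance-search/ prefix.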
Robots.txt:
robots_prod.txt is served only in production.
robots.txt, served to non-prod environments (stage, dev, feature), tells search engines to Disallow crawling those domains.
urls.py is where we control which robots file is served to which environment.
The sitemaps themselves do not need to be conditionally hidden in certain environments because they always reference www.fec.gov, so they would never prompt a search engine to crawl the stage, dev, etc. subdomains. They are also not named sitemap.xml, which is what the search engine looks for.
This PR removes the logic in fec/home/wagtail_hooks.py and search/utils/search_indexing.py that makes any PUT/POST/DELETE calls to the i14y endpoint. (This was called when a Wagtail user updated, created, or deleted a page.)
Quote from search.gov support, Arantxa wrote:
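Separately, as an illustration of the robots behavior summarized above, here is a minimal sketch of picking a robots file per environment and checking what a crawler would be allowed to fetch. The environment names, the file bodies, and the helpers `robots_file_for` and `can_crawl` are assumptions for the example; the actual logic lives in urls.py and may look quite different.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical file bodies; the real robots_prod.txt and robots.txt may differ.
ROBOTS_FILES = {
    "robots_prod.txt": "User-agent: *\nAllow: /\n",
    "robots.txt": "User-agent: *\nDisallow: /\n",
}


def robots_file_for(environment):
    """Pick the robots file name for an environment (environment names are assumed)."""
    return "robots_prod.txt" if environment == "prod" else "robots.txt"


def can_crawl(environment, url):
    """Report whether a generic crawler may fetch url under that environment's file."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_FILES[robots_file_for(environment)].splitlines())
    return parser.can_fetch("*", url)


for env in ("prod", "stage", "dev", "feature"):
    print(env, robots_file_for(env))
```

A Pytest for the real view could make similar assertions against the response body served in each environment.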
Tech steps or considerations:
[x] In Wagtail admin, in production, make sure there is a section parent /updates/guidance-search with all the policy-guidance HTML pages aliased there. Keep this page in draft (unpublished) so it does not appear in search results. UPDATE: This page does NOT seem to be removed from the index even though it is removed from the sitemap when unpublished. Need to contact search.gov for an explanation.
Redirect /updates/guidance-search/ to /legal-resources/policy-and-other-guidance/guidance-documents/. This way, if someone attempts to navigate to it, they will be taken to the right place. Make sure this is a redirect only for that exact path and not for children of that path. Use ^ and $ in regex?
[x] In production content S3, make sure there is a folder /cms-content/documents/policy-guidance/ with all the PDFs in it.
[x] In Best-Bets in the fec_content_s3_prod affiliate, change any policy PDFs to have the new /policy-guidance/ directory in their paths.
[x] Add these to Best-Bets for the global search affiliate, if not already there. Are there any missing? https://github.com/fecgov/fec-cms/issues/5912 Best-Bets Google Sheet to upload as CSV to Best-Bets in the dashboard: https://docs.google.com/spreadsheets/d/10cQij4Um_UJYJbBbkju1lYT8oB57T2mvT-FYiaksbY8/edit?usp=sharing
[x] May have to adjust Pytests for robots.txt for new prod and other envs.
[x] #5870
[x] Do we want to remove search_indexing.py, since we are removing the Wagtail hooks that use that utility to make CRUD requests to the i14y drawer (last bullet in summary above)? Or should we keep it as a reference for using Python Requests, Beautiful Soup, etc.?
The steps below need to be coordinated to happen at the same time as the deployment of the release. They may also require coordination with the search.gov team.
... robots.txt instead?
... robots.txt. When we originally set up policy-guidance, we had to tell them to start indexing these sitemaps regularly because there was no reference to them in robots.txt (or anywhere else).
... base.py and search/views.py.
... host: fec.gov, port: 443
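On the checklist question about using ^ and $ so the redirect applies only to the exact /updates/guidance-search/ path and not to its children, here is a small sketch of how an anchored pattern behaves. The pattern and the `redirect_for` helper are assumptions for illustration; the redirect may well be configured elsewhere (for example in Wagtail or in urls.py) rather than with code like this.

```python
import re

# Target path taken from the checklist; the pattern itself is an assumption.
REDIRECT_TARGET = "/legal-resources/policy-and-other-guidance/guidance-documents/"

# ^ and $ anchor the match to the whole path, so child pages do not match.
EXACT_PARENT = re.compile(r"^/updates/guidance-search/?$")


def redirect_for(path):
    """Return the redirect target for the exact parent path, else None."""
    return REDIRECT_TARGET if EXACT_PARENT.match(path) else None


# some-child is a hypothetical child page used only to show the non-match.
for candidate in (
    "/updates/guidance-search/",
    "/updates/guidance-search",
    "/updates/guidance-search/some-child/",
):
    print(candidate, "->", redirect_for(candidate))
```

Without the anchors, a pattern like this would also match child URLs, which is what the checklist item is trying to avoid.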
Related issues
Research sitemap generation - https://github.com/fecgov/fec-cms/issues/5499
Document search.gov sitemap - requirements https://github.com/fecgov/fec-cms/issues/5501
Changes to Policy and Guidance search before transitioning to sitemaps for global search - https://github.com/fecgov/fec-cms/issues/5595
QA/tests for new Global and Policy-guidance search sitemaps: https://github.com/fecgov/fec-cms/issues/5770
Update docs https://github.com/fecgov/fec-cms/issues/5842
Related PR
[WIP] Test sitemap on dev https://github.com/fecgov/fec-cms/pull/5735
Completion criteria
Future work
Other future work that may be needed following this issue's completion.