Describe the bug
The default robots.txt includes the rule `Disallow: /*?`.
If you use expansion to improve the performance of your Volto theme, your content URLs then look similar to `/++api++/content/@@expand/actions/breadcrumbs/navigation?expand.navigation.depth=2`.
Googlebot then can't crawl these URLs, which results in a "soft 404" (as seen in Google Search Console), and Google won't include any of your site's pages in its index.
The soft 404 is caused by another bug: if the content API call hits a problem it doesn't understand, it defaults to rendering "404 Not Found" but with a 200 status code. (This in itself causes further issues, since what should be a 500 error doesn't appear as such in GA or server logs.)
In addition, another default rule prevents /preview images from being loaded by Googlebot, which could cause indexing issues.
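To see why the rule blocks these URLs, here is a minimal sketch of Google-style robots.txt path matching (where `*` matches any character sequence and `$` anchors the end, per the Robots Exclusion Protocol). The function name is hypothetical; this is an illustration, not Google's actual parser:

```python
import re


def google_rule_matches(rule_path: str, url_path_and_query: str) -> bool:
    """Check a robots.txt Disallow rule roughly the way Google's parser does:
    '*' matches any character sequence, '$' anchors the end, and the rule is
    otherwise a prefix match against the URL path plus query string."""
    pattern = re.escape(rule_path).replace(r"\*", ".*")
    if pattern.endswith(r"\$"):
        pattern = pattern[:-2] + "$"
    return re.match(pattern, url_path_and_query) is not None


# Volto's default rule, and an expansion URL like the one in this report:
rule = "/*?"
url = ("/++api++/content/@@expand/actions/breadcrumbs/navigation"
       "?expand.navigation.depth=2")

print(google_rule_matches(rule, url))      # True -> Googlebot is blocked
print(google_rule_matches(rule, "/news"))  # False -> plain URLs are allowed
```

Any URL carrying a query string matches `/*?`, so every expansion URL is disallowed for Googlebot.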
To Reproduce
TODO: there is perhaps a more direct way to test this, e.g. by using something to simulate blocking `/*?` URLs in the browser?
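Short of waiting for Search Console, one rough way to spot the soft 404 is to compare the HTTP status with the rendered body. The helper below is a hypothetical sketch: the heuristic (looking for "404"/"not found" in the body) is an assumption for illustration, not how Google actually classifies pages, and fetching is left out so the function only classifies a (status, body) pair:

```python
def classify_response(status_code: int, body: str) -> str:
    """Roughly classify a response the way a crawler might.

    A "soft 404" is a page whose body renders an error message while the
    server reports 200 OK, so crawlers, GA, and server logs all see success.
    """
    body_lower = body.lower()
    looks_like_error = "404" in body_lower or "not found" in body_lower
    if status_code == 200 and looks_like_error:
        return "soft-404"
    if status_code == 200:
        return "ok"
    return f"hard-{status_code}"


# Simulated responses illustrating the bug described above:
print(classify_response(200, "<h1>Page not found</h1>"))  # soft-404
print(classify_response(200, "<h1>Welcome</h1>"))         # ok
print(classify_response(404, "<h1>Page not found</h1>"))  # hard-404
```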
Expected behavior
Google indexes the page fine.
Proposed solution
Remove the `Disallow: /*?` rule from the default robots.txt.
Other solutions considered
Not really clear what the best way forward is.
Software (please complete the following information):