mozilla / addons

☂ Umbrella repository for Mozilla Addons ✨

Some add-on pages in the sitemap return 404s #1841

Open bobsilverberg opened 3 years ago

bobsilverberg commented 3 years ago

Looking at the coverage report at search.google.com/search-console/index/drilldown?resource_id=https%3A%2F%2Faddons.mozilla.org%2F&item_key=CAMYHyAE&hl=en, the section about "Submitted URL not found (404)" lists a number of add-on pages that currently return a 404:

https://addons.mozilla.org/pl/firefox/addon/clipshare/versions/
https://addons.mozilla.org/es/firefox/addon/yudu-app/
https://addons.mozilla.org/ru/firefox/addon/lunate-dark-mode/reviews/
https://addons.mozilla.org/de/firefox/addon/shirtless-marilyn-manson/
https://addons.mozilla.org/fr/firefox/addon/unofficial-zeldathon-companion/
https://addons.mozilla.org/ja/firefox/addon/hbtc-wallet/versions/
https://addons.mozilla.org/de/firefox/addon/ruhr-uni/reviews/
https://addons.mozilla.org/fr/firefox/addon/youtube-playlist/

Perhaps these are all add-ons that were valid when the sitemap was built and have since been removed/hidden, but we should probably investigate to make sure that is the case.

┆Issue is synchronized with this Jira Task

eviljeff commented 3 years ago

I don't have site permissions to investigate these add-ons' statuses via devhub/reviewer tools, so I'm just guessing based on what I can glean from Redash.

https://addons.mozilla.org/es/firefox/addon/yudu-app/

This looks like an add-on that has a public status but the file isn't approved. Would need further investigation as to why it happened, but it's not a valid state.

https://addons.mozilla.org/ru/firefox/addon/lunate-dark-mode/reviews/
https://addons.mozilla.org/de/firefox/addon/shirtless-marilyn-manson/
https://addons.mozilla.org/fr/firefox/addon/unofficial-zeldathon-companion/
https://addons.mozilla.org/ja/firefox/addon/hbtc-wallet/versions/

The slugs no longer exist in the database (the add-on was renamed or deleted), so I can't say.

https://addons.mozilla.org/de/firefox/addon/ruhr-uni/reviews/
https://addons.mozilla.org/fr/firefox/addon/youtube-playlist/
https://addons.mozilla.org/pl/firefox/addon/clipshare/versions/

The add-ons are inactive now, so I guess they won't be in the current sitemap files. (The query uses Addon.objects.public(), which filters on Q(_current_version__isnull=False, disabled_by_user=False, status__in=(STATUS_APPROVED,)).) If anyone knows of any web-based tools to check whether a URL is in a set of sitemap XML files, that'd be very useful to confirm they really aren't in the files; manually loading and searching through thousands of XML files isn't practical.
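
In the absence of a web-based tool, a small local script could do the lookup. The following is only a rough sketch, assuming the sitemap XML files have already been downloaded and loaded into a dict of name to XML text; the function names, the file name "sitemap-1.xml", and the sample slugs are made up for illustration and are not part of addons-server.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org protocol for both urlset and sitemapindex files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_locs(xml_text):
    """Return the set of <loc> values found in one sitemap (or sitemap index) document."""
    root = ET.fromstring(xml_text)
    return {el.text.strip() for el in root.iter(SITEMAP_NS + "loc") if el.text}


def find_in_sitemaps(candidates, sitemap_documents):
    """Map each candidate URL to the names of the sitemap documents that list it."""
    hits = {url: [] for url in candidates}
    for name, xml_text in sitemap_documents.items():
        locs = extract_locs(xml_text)
        for url in candidates:
            if url in locs:
                hits[url].append(name)
    return hits


# Hypothetical sample data: one tiny sitemap document with a made-up slug.
SAMPLE_DOCUMENTS = {
    "sitemap-1.xml": (
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
        "<url><loc>https://addons.mozilla.org/fr/firefox/addon/example-addon/</loc></url>"
        "</urlset>"
    ),
}

SAMPLE_CANDIDATES = [
    "https://addons.mozilla.org/fr/firefox/addon/example-addon/",
    "https://addons.mozilla.org/fr/firefox/addon/missing-addon/",
]

if __name__ == "__main__":
    for url, files in find_in_sitemaps(SAMPLE_CANDIDATES, SAMPLE_DOCUMENTS).items():
        print(url, "->", files if files else "not found")
```

A URL that maps to an empty list is absent from every loaded file, which is the confirmation being asked for here. Downloading the files is left out on purpose; the matching is exact string comparison, so trailing slashes and locale prefixes have to match.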

I took a look at the current coverage report in Google Search Console and found similar issues. However, I don't quite understand how URLs such as https://addons.mozilla.org/en-GB/firefox/themes/category/nature/?page=1981 are there when we limited the search pagination URLs to 1000 in https://github.com/mozilla/addons/issues/8380 (and I manually checked that in https://addons.mozilla.org/sitemap.xml?section=categories&app_name=firefox&p=59).
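
To spot paginated URLs beyond the limit in an export from the coverage report or in a list of sitemap entries, something like the following could help. It is a minimal sketch, assuming the pagination parameter is named "page" (as in the URL above) and that the limit is 1000 pages; the helper names are hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def page_number(url):
    """Return the integer value of the "page" query parameter, or None if absent or invalid."""
    values = parse_qs(urlparse(url).query).get("page")
    if not values:
        return None
    try:
        return int(values[0])
    except ValueError:
        return None


def urls_over_page_limit(urls, limit=1000):
    """Return the URLs whose "page" parameter is greater than the limit."""
    return [url for url in urls if (page_number(url) or 0) > limit]


if __name__ == "__main__":
    reported = [
        "https://addons.mozilla.org/en-GB/firefox/themes/category/nature/?page=1981",
        "https://addons.mozilla.org/en-GB/firefox/themes/category/nature/?page=12",
    ]
    for url in urls_over_page_limit(reported):
        print(url)
```

Running this over the full set of sitemap entries would show whether any over-limit page is actually being submitted, or whether Google found those URLs some other way.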

Not sure if there are any next steps here.

KevinMind commented 6 months ago

Old Jira Ticket: https://mozilla-hub.atlassian.net/browse/ADDSRV-53