DOI-ONRR / onrr.gov-site

We will use this repo to manage our work on the onrr.gov website

Investigate search results showing up old urls #2372

Closed Maroyafaied closed 1 year ago

eelzi-ONRR commented 1 year ago

I've dug into the Bing Webmaster Tools, and it shows that all of our new URLs are discovered by Bing, but "The inspected URL is known to Bing but has some issues which are preventing indexation."

The Webmaster dashboards don't provide any details about what the issues are. There is a Webmaster tool that lets you submit URLs one by one to request that Bing index them. I've submitted a handful to see if that works and whether Bing can give us details we can use to fix the rest. I've also opened a support ticket with Bing.

Maroyafaied commented 1 year ago

@eelzi-ONRR I took the same approach and submitted one URL to be indexed/scanned. It was the main URL for the website, and this is the report with errors that I got. Just FYI in case it's helpful. image

Maroyafaied commented 1 year ago

@eelzi-ONRR I also submitted this URL and it seems to be showing up in search results now. So maybe we have to compile a list of all the URLs and submit them. We probably have most of the new URLs listed in the redirect Excel sheets. image image

eelzi-ONRR commented 1 year ago

@Maroyafaied Ah great! That was my thought - to see if the ones I submitted today get indexed and, if so, submit the rest. The sitemap lists the URLs, so I can take them from there.

I've come across a lot of discussion boards that indicate this is pretty common: Bing essentially blocking sites from being indexed without explanation. Once people get Bing support to respond, the block is lifted, but people have said that can take months. So it looks like submitting the URLs one by one is a quicker method! We only have 150 or so URLs, so it won't be too bad to do them all.

Maroyafaied commented 1 year ago

@eelzi-ONRR I just tested it out and you can submit more than one URL at a time. Each one has to be on its own line, but copying and pasting from a spreadsheet does that automatically. The 159 URLs in the sitemap don't include links to documents and such, so we'll probably have to submit those as well, which will be quite a lot. Or we can submit the 159 main page URLs and the rest can be done when Bing lifts the block. I'd suggest looking at the redirect spreadsheet and deciding if it's easier to copy URLs from there, since I believe it has the main page URLs as well. I'll leave the decision up to you, but at the very least we should submit the 159 main page URLs in the sitemap.

image
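For anyone repeating the bulk submission described above, here is a minimal sketch of pulling the page URLs out of a sitemap and joining them one per line, ready to paste into the submission box. It assumes the sitemap follows the standard sitemaps.org 0.9 schema. The `example.gov` URLs and the `urls_from_sitemap` helper name are made up for illustration and are not from the actual site.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org 0.9 schema.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def urls_from_sitemap(xml_text):
    """Return every <loc> value in a sitemap, in document order."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sample; the real sitemap would be read from a file instead.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.gov/</loc></url>
  <url><loc>https://example.gov/about</loc></url>
</urlset>"""

# One URL per line, which is the layout the submission box expects.
batch = "\n".join(urls_from_sitemap(SAMPLE))
print(batch)
```

The same list could be merged with the URLs from the redirect spreadsheet before pasting, so document links that are missing from the sitemap are covered too.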

eelzi-ONRR commented 1 year ago

I heard back from Bing Support. They said that even though pages show as having been crawled and indexed, it takes 3-4 weeks after the site map submission for the results to actually be indexed and show in results. They also said that the method we proposed is a good one and submitting the individual URLs will get pages showing up in results faster.

I used the sitemap and redirects list to make a list of 259 URLs that have been submitted to Bing.

Looking further at Google Search, for many of the 181 indexed URLs, Google has selected the old URL as the canonical one.

proof of canonical issues.png

Their suggested way of fixing this is to submit a sitemap that lists all the URLs for the new site, not just the 150ish that are in the current sitemap, so that the variations in URLs for tabs and documents are included. I've made a new sitemap that we can discuss and submit next week. I also think we should consider blocking some of the URLs that are being indexed, such as the beta URLs.
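As a rough illustration of the expanded sitemap idea, the sketch below builds a sitemaps.org-style `<urlset>` from a flat list of URLs and drops duplicates along the way. This is not how the actual sitemap was produced. The `build_sitemap` helper and the `example.gov` URLs (including the tab and document variations) are assumptions for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_XMLNS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Build a sitemap <urlset> string, dropping duplicate URLs but keeping order."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_XMLNS)
    # dict.fromkeys keeps the first occurrence of each URL and preserves order.
    for url in dict.fromkeys(urls):
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical input: main pages plus tab and document variations.
all_urls = [
    "https://example.gov/reporting",
    "https://example.gov/reporting?tabs=forms",
    "https://example.gov/document/handbook.pdf",
    "https://example.gov/reporting",  # duplicate, will be dropped
]

sitemap_xml = build_sitemap(all_urls)
print(sitemap_xml)
```

ElementTree escapes characters such as `&` in the `<loc>` text when serializing, so URLs with query strings stay valid XML.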

Maroyafaied commented 1 year ago

We updated the sitemap and will resubmit it to Google and Bing. @eelzi-ONRR also researched and suggested creating a robots.txt file to optimize site crawling and indexing. I created an issue for that for next sprint.
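For the follow-up issue, one way to sanity-check a draft robots.txt before deploying it is Python's standard `urllib.robotparser`. The rules below are only a guess at what blocking the beta URLs might look like. The `/beta/` path, the `example.gov` domain, and the `is_crawlable` helper are assumptions, not the real site layout.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft rules: block a beta section and point crawlers to the sitemap.
ROBOTS_TXT = """\
User-agent: *
Disallow: /beta/
Sitemap: https://example.gov/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, agent="bingbot"):
    """Return True if the draft rules allow the given crawler to fetch the URL."""
    return parser.can_fetch(agent, url)


for url in ("https://example.gov/about", "https://example.gov/beta/about"):
    print(url, "allowed" if is_crawlable(url) else "blocked")
```

Note that robots.txt controls crawling rather than indexing, so pages that are already indexed may need to be handled separately.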