SSWConsulting / SSW.Website

Generator for ssw.com.au
https://www.ssw.com.au
Apache License 2.0

πŸ” SEO - Pages not being indexed #2772

Open jeoffreyfischer opened 6 days ago

jeoffreyfischer commented 6 days ago

Based on the email chain:

From: @wicksipedia
Sent: Monday, June 24, 2024 11:20 AM
To: @andrewwaltosssw
Cc: @sethdaily; SSW Website v3 SSWWebsiteV3@ssw.com.au; @camillars; Jeoffrey Fischer [SSW] JeoffreyFischer@ssw.com.au
Subject: Re: New reasons prevent pages in a sitemap from being indexed on site https://www.ssw.com.au/

Description

Search Console has identified that some pages on our website are not being indexed.

URL

https://search.google.com/u/1/search-console/index?resource_id=https://www.ssw.com.au/&utm_source=wnc_20237597&utm_medium=gamma&utm_campaign=wnc_20237597&utm_content=msg_110624660&hl=en

Solution

Fix the following issues.

Source: Website

Source: Google systems

Acceptance Criteria

Screenshots

Figure: 76k pages are not indexed

Figure: List of reasons why pages aren't indexed. Source: Website

Figure: List of reasons why pages aren't indexed. Source: Google systems

Calinator444 commented 3 days ago

All of the new pages listed under "Alternate page with proper canonical tag" were browse pages.

Essentially, these browse pages list the pages at the base of each route. They don't have any unique content and they'll be removed when the VM gets switched off. I'm not sure why they're being de-indexed, but they'll be removed soon anyway, so it's not worth trying to re-index them.

I didn't find any other pages in the "Alternate page with proper canonical tag" category that shouldn't be there.


Other pages in this category on our site are:

https://www.ssw.com.au/rules/add-context-reasoning-to-emails (🔍 user canonical already points to this route, requested re-indexing)
https://www.ssw.com.au/ssw/MenuMap.aspx (🔍 Google selected the correct canonical URL: https://www.ssw.com.au/ssw/menumap.aspx)
https://www.ssw.com.au/SSW/Database/LinksSoftwareUpdates.aspx (🔍 user-declared canonical points to http://www.ssw.com.au/ssw/Database/LinksSoftwareUpdates.aspx, the same link but with the "s" missing from "https")

For the following URLs "ssw.com.au" was selected as the canonical URL rather than "www.ssw.com.au"

Profiles

https://www.ssw.com.au/people/zach-keeping/ (🔍 Google selected canonical URL https://ssw.com.au/people/zach-keeping/)
https://www.ssw.com.au/people/adam-cogan/ (🔍 user canonical points to https://ssw.com.au/people/adam-cogan/, which is bad)
https://ssw.com.au/people/anastasia-cogan/ (🔍 user canonical points to https://www.ssw.com.au/people/anastasia-cogan/)
https://www.ssw.com.au/people/bob-northwind/ (🔍 user canonical points to https://ssw.com.au/people/bob-northwind/)
https://www.ssw.com.au/people/chris-schultz/ (🔍 user canonical points to https://ssw.com.au/people/chris-schultz/)
https://www.ssw.com.au/people/manu-gulati/ (🔍 user canonical points to https://ssw.com.au/people/manu-gulati/)
https://www.ssw.com.au/people/sam-wagner/ (🔍 user canonical points to https://ssw.com.au/people/sam-wagner/gulati/)
https://www.ssw.com.au/people/tino-liu/ (🔍 user canonical points to https://ssw.com.au/people/tino-liu/)
https://www.ssw.com.au/people/zach-keeping/ (🔍 user canonical points to https://ssw.com.au/people/zach-keeping/)

Other

https://www.ssw.com.au/people/ (🔍 user canonical points to https://ssw.com.au/people/)
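The mismatches listed above (www vs. non-www host, http vs. https, differing path case) can be spotted mechanically by comparing each page URL with the canonical it declares. The following is a minimal sketch using only the Python standard library; the helper names `find_canonical` and `canonical_differences` are made up for illustration and are not part of the SSW.Website codebase.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(html):
    """Return the declared canonical URL, or None if the page declares none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def canonical_differences(page_url, canonical_url):
    """Return labels describing how canonical_url differs from page_url."""
    page, canon = urlsplit(page_url), urlsplit(canonical_url)
    diffs = []
    if page.scheme != canon.scheme:
        diffs.append("scheme")
    if page.hostname != canon.hostname:
        diffs.append("host")
    if page.path != canon.path:
        # Distinguish a pure casing difference from a genuinely different path.
        same_ignoring_case = page.path.lower() == canon.path.lower()
        diffs.append("path-case" if same_ignoring_case else "path")
    return diffs


# Hypothetical page markup mirroring the adam-cogan entry above.
html = '<head><link rel="canonical" href="https://ssw.com.au/people/adam-cogan/"></head>'
canonical = find_canonical(html)
print(canonical_differences("https://www.ssw.com.au/people/adam-cogan/", canonical))
# ['host']
```

An empty list means the declared canonical matches the page URL exactly; any other result flags a page worth checking by hand in Search Console.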

Calinator444 commented 3 days ago

Alternate page with proper canonical tag - 26,094 pages

See comment above

Excluded by ‘noindex’ tag - 1,666 pages

Most of these are on the v1 site. However, I've created a PR to strip the noindex tags from the archived pages that had them.
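To confirm that a page no longer carries a noindex directive after a change like this, its markup and response headers can be checked directly. Below is a rough sketch, assuming the directive arrives either in a `<meta name="robots">` (or `googlebot`) tag or in a plain `X-Robots-Tag` header; the function name `is_noindexed` is hypothetical, and user-agent-prefixed header values such as `googlebot: noindex` are not handled.

```python
from html.parser import HTMLParser


def _split_directives(value):
    """Split a comma-separated robots value into lowercase directives."""
    return {part.strip().lower() for part in value.split(",") if part.strip()}


class RobotsMetaFinder(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() in ("robots", "googlebot"):
            self.directives |= _split_directives(attr.get("content") or "")


def is_noindexed(html, headers=None):
    """Return True if the markup or the X-Robots-Tag header asks not to be indexed."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    directives = set(finder.directives)
    for key, value in (headers or {}).items():
        if key.lower() == "x-robots-tag":
            directives |= _split_directives(value)
    # "none" is shorthand for "noindex, nofollow".
    return "noindex" in directives or "none" in directives


print(is_noindexed('<meta name="robots" content="noindex, follow">'))
# True
```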

Blocked by robots.txt - 257 pages

Most of these pages are on the v1 site, which is being switched off anyway. Only 4 pages on the live site appear here, and they're all 500/404 pages, so I need to discuss with @wicksipedia whether they should be indexed. I'm assuming they shouldn't be.
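A quick way to see which URLs a given robots.txt blocks is Python's built-in `urllib.robotparser`. The sketch below is illustrative only: the robots.txt rules and URLs are hypothetical stand-ins, not the actual ssw.com.au configuration, and the standard-library parser uses simple prefix matching that may differ from Googlebot's behaviour in edge cases.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the subset of urls that robots_txt disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical rules that keep error pages out of the index.
robots_txt = """\
User-agent: *
Disallow: /404
Disallow: /500
"""

urls = [
    "https://www.ssw.com.au/404",
    "https://www.ssw.com.au/people/",
]
print(blocked_urls(robots_txt, urls))
# ['https://www.ssw.com.au/404']
```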

Relevant pages excluded include:

Server error (5xx) - 89 pages

Again, this was mostly noise coming from the v1 site. There are a few links to Tina template pages in the static content of some pages on the site. When Google tries to crawl these links, the server, of course, returns a 500. Will need to investigate this on Monday.

Blocked due to other 4xx issue - 1 page

It was trying to index a broken link within a rule. I've fixed the link here: https://github.com/SSWConsulting/SSW.Rules.Content/pull/8813

Blocked due to access forbidden (403) - 1 page

The broken link points to http://ssw.com.au/ssw/TeamCalendar/Installation/, but there's no default page at that route. Google attempted to index the page because other pages on the site point to that URL.

This page was also in the 403 list: https://www.ssw.com.au/ssw/Redirect/Access/AccessTrial.htm. However, it's a redirect that will be gone when the v1 server goes offline. Adam already agreed to scrap the /Redirect route.