Open handymenny opened 2 years ago
(2) seems like a reasonable solution. The button already exists, so we'd just need to add a userAgent check as to whether a given user is a bot or not before triggering autoload.
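A minimal sketch of what such a user-agent check might look like, in TypeScript. The names `isLikelyBot` and `shouldAutoload` are hypothetical and not part of Flarum, and the crawler tokens are a small illustrative sample rather than an exhaustive list.

```typescript
// Hypothetical helpers for illustration; not part of Flarum's API.
// A handful of well-known crawler tokens. A real list would be longer.
const BOT_PATTERN = /googlebot|bingbot|duckduckbot|yandex|baiduspider|slurp/i;

// Returns true when the user-agent string contains a known crawler token.
function isLikelyBot(userAgent: string): boolean {
  return BOT_PATTERN.test(userAgent);
}

// Autoload more posts on scroll only for regular visitors, so that
// crawlers see just the posts belonging to the page they requested.
function shouldAutoload(userAgent: string): boolean {
  return !isLikelyBot(userAgent);
}
```

In the browser this would be called as `shouldAutoload(navigator.userAgent)` before triggering autoload. User-agent sniffing is only a heuristic, so the existing button would remain the fallback for anything misclassified.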
Another solution could be updating the canonical url as the page scrolls, but:
I'll also note that there's a bit more duplication than I'd like, as it seems like search engines have both the old post number-based links and the new page-based links stored. But that should get resolved naturally over time.
I also notice that in the second result in the screenshot the number of posts shown is wrong. It says 20 but that's not the total number of posts in the discussion.
This one has me a bit stumped. I'm not sure where exactly Google is getting this information; it doesn't seem to show up on search results for NodeBB or Discourse communities, and from a few quick Google searches I'm not seeing whether there's a meta tag we could use to provide accurate information on this. @Hari-Bonda @jaspervriends or anyone else knowledgeable in SEO, any ideas as to where this is coming from?
Not sure if the SEO extension embeds JSON-LD, but that might be it?
The extension has an option for listing the posts in JSON-LD but I haven't enabled it for performance reasons (I'm the owner of the forum of the example above). Also, I'm pretty sure that the posts count and the other metadata have been there even before the extension existed, so I guess it's just Google magic.
I have been aware of this issue since Aug 2021. We had to disable indexing of subpages (posts) until it gets resolved, so we went with the Discussion Canonical URL extension as a temporary solution: https://github.com/SychO9/flarum-discussion-canonical-url
As for the solution:
When Flarum breaks a discussion into pages, for example
page=?near1 page=?near2 page=?near3
you should change the page title too, e.g. "test discussion page 1 of 2" and "test discussion page 2 of 2".
This title format was implemented in Flarum v1.2, and using Linguist I have made a few tweaks to get the page title output I need.
The second most important thing: when a set of posts is treated as a page (page=?near2), Flarum should use the content of the first post on that page (the 21st post, in the case of the second page) as the page description.
For example, if the second page displays the 21st post as its first post, that post's content should be used as the page's meta description. If you fail to do this, you will end up with duplicate titles and descriptions across all pages, which is a serious SEO mistake: Google may conclude that you are trying to fool the spider with duplicate content, and it gets confused about how to rank the main discussion page.
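A rough TypeScript sketch of the per-page title and description idea. Everything here is hypothetical: the `Post` shape, the function names, the page size of 20, and the 160-character description limit are assumptions for illustration, not Flarum's actual API or settings.

```typescript
// Hypothetical shapes and helpers for illustration; not Flarum's API.
interface Post {
  number: number;  // 1-based position in the discussion
  content: string; // plain-text content of the post
}

const POSTS_PER_PAGE = 20; // assumed page size

// Number of pages needed for a discussion (at least one).
function totalPages(postCount: number, perPage: number = POSTS_PER_PAGE): number {
  return Math.max(1, Math.ceil(postCount / perPage));
}

// Builds a title such as "test discussion page 2 of 2".
function pageTitle(
  title: string,
  page: number,
  postCount: number,
  perPage: number = POSTS_PER_PAGE
): string {
  return `${title} page ${page} of ${totalPages(postCount, perPage)}`;
}

// Uses the first post shown on the given page as the meta description,
// collapsing whitespace and truncating to maxLength characters.
function pageDescription(
  posts: Post[],
  page: number,
  perPage: number = POSTS_PER_PAGE,
  maxLength: number = 160
): string {
  const first = posts[(page - 1) * perPage];
  if (first === undefined) return ""; // page is out of range
  const text = first.content.replace(/\s+/g, " ").trim();
  return text.length > maxLength
    ? text.slice(0, maxLength - 3).trimEnd() + "..."
    : text;
}
```

With the assumed page size of 20, page 2 takes its description from the 21st post, so each page gets its own title and description instead of repeating those of the first page.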
I don't know how the JSON or JS works, but maintain a separate DB or something similar to get the data as I have described.
I am not an expert, but if you look at WordPress pages you can easily see this.
Bug Report
Current Behavior
Google's crawler indexes page X of a discussion (discussion/?page=X) but also uses text/images from other pages. This means that what Google's crawler sees doesn't always match what users see when they open that page.
Example
Google "ok, ma io ho fatto questi test a scopo informativo/illustrativo site:forum.fibra.click"
Second result is page 12, but that post belongs to page 10
Expected Behavior
Google's crawler should only look at posts on a specific page, i.e. in the above example Google should link to page 10
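To make the expected mapping concrete, here is a small TypeScript sketch of how the page containing a given post could be computed. The function name `pageForPost` and the default page size of 20 are assumptions for illustration; the forum's actual page size is not stated in this report.

```typescript
// Hypothetical helper; assumes 1-based post numbers and a fixed page size.
function pageForPost(postNumber: number, perPage: number = 20): number {
  if (!Number.isInteger(postNumber) || postNumber < 1) {
    throw new RangeError("postNumber must be a positive integer");
  }
  // Posts 1..perPage are on page 1, the next perPage posts on page 2, etc.
  return Math.ceil(postNumber / perPage);
}
```

A search result for a post would then be expected to link to `pageForPost(n)` for that post's number `n`, rather than to whichever page the crawler happened to be on when the post was loaded.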
Environment
Possible Solutions