Closed: hostep closed this issue 3 years ago
Hi @hostep. Thank you for your report. To help us process this issue please make sure that you provided the following information:
Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:
@magento give me 2.3-develop instance - upcoming 2.3.x release
For more details, please, review the Magento Contributor Assistant documentation.
@hostep do you confirm that you were able to reproduce the issue on vanilla Magento instance following steps to reproduce?
Hi @shikhamis11. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:
[ ] 1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).
If the issue has a valid description, the label Issue: Format is valid will be added to the issue automatically. Please, edit issue description if needed, until label Issue: Format is valid appears.
[ ] 2. Verify that issue has a meaningful description and provides enough information to reproduce the issue. If the report is valid, add Issue: Clear Description label to the issue by yourself.
[ ] 3. Add Component: XXXXX label(s) to the ticket, indicating the components it may be related to.
[ ] 4. Verify that the issue is reproducible on 2.3-develop branch.
- Add the comment @magento give me 2.3-develop instance to deploy test instance on Magento infrastructure.
- If the issue is reproducible on 2.3-develop branch, please, add the label Reproduced on 2.3.x.
- If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here!
[ ] 5. Verify that the issue is reproducible on 2.2-develop branch.
- Add the comment @magento give me 2.2-develop instance to deploy test instance on Magento infrastructure.
- If the issue is reproducible on 2.2-develop branch, please add the label Reproduced on 2.2.x.
[ ] 6. Add label Issue: Confirmed once verification is complete.
[ ] 7. Make sure that automatic system confirms that report has been added to the backlog.
hey @hostep do you have any supporting docs or more people that have the same point of view on this one? I'm a little bit confused what is the right behavior in this case and we are somewhat reluctant to fiddle with SEO related stuff.
Hi @piotrekkaminski
The following is only for the First problem mentioned above.
As usual with SEO-related stuff, if you start searching the web you'll find all kinds of opinions: one says it should be this way, another says the complete opposite, and yet another says it doesn't really matter at all. On top of that, you always have to deal with outdated information, since search engines change their algorithms from time to time.
As for some links which might plead for the case I'm defending:
[W]e use information from URLs included in Sitemaps files for these main purposes:
- Recognizing preferred URLs for canonicalization
Determine your preferred URLs. Before fixing duplicate content issues, you'll have to determine your preferred URL structure. Which URL would you prefer to use for that piece of content? Be consistent within your website. Once you've chosen your preferred URLs, make sure to use them in all possible locations within your website (including in your Sitemap file).
And if the sitemap file says one URL and it redirects to a different URL then you're giving us kind of conflicting information. You're saying well this is the one I want to have shown in the second file and that URL itself says actually I want you to choose this other URL instead. So we're in the situation like do we trust a sitemap file do we trust a redirect target. Is there maybe a rel canonical that's even different? What is the internal linking like? Does that match any one of these or maybe even a third. And all of that makes it a little bit trickier for us to take the canonical to show in the search results.
Use a sitemap Pick a canonical URL for each of your pages and submit them in a sitemap. All pages listed in a sitemap are suggested as canonicals; Googlebot will decide which pages (if any) pages are duplicates, based on similarity of content.
We don't guarantee that we'll consider the sitemap URLs to be canonical, but it is a simple way of defining canonicals for a large site, and sitemaps are a useful way to tell Google which pages you consider most important on your site.
Don't include non-canonical pages in a sitemap. If using a sitemap, specify only canonical URLs in the sitemap.
But extra input from people who know a lot more about up-to-date SEO than I do is very welcome here 🙂
As for:
I'm a little bit confused what is the right behavior in this case and we are somewhat reluctant to fiddle with SEO related stuff.
Well, PR https://github.com/magento/magento2/pull/23129 got accepted without anyone but me complaining about possible SEO problems being introduced 😉
Hi @engcom-Delta. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:
[ ] 1. Verify that issue has all the required information. (Preconditions, Steps to reproduce, Expected result, Actual result).
If the issue has a valid description, the label Issue: Format is valid will be added to the issue automatically. Please, edit issue description if needed, until label Issue: Format is valid appears.
[ ] 2. Verify that issue has a meaningful description and provides enough information to reproduce the issue. If the report is valid, add Issue: Clear Description label to the issue by yourself.
[ ] 3. Add Component: XXXXX label(s) to the ticket, indicating the components it may be related to.
[ ] 4. Verify that the issue is reproducible on 2.3-develop branch.
- Add the comment @magento give me 2.3-develop instance to deploy test instance on Magento infrastructure.
- If the issue is reproducible on 2.3-develop branch, please, add the label Reproduced on 2.3.x.
- If the issue is not reproducible, add your comment that issue is not reproducible and close the issue and stop verification process here!
[ ] 5. Add label Issue: Confirmed once verification is complete.
[ ] 6. Make sure that automatic system confirms that report has been added to the backlog.
Confirmation of the issue is based on the devdocs Canonical Meta Tag article and relates to the First problem from the main description.
Testing scenario:
- Create a category mycategory
- Create a product myproduct and assign it to the category you created
- Run indexer:reindex
- Run cache:flush
:heavy_check_mark: Expected result: A product link: https://{domain}/myproduct.html
If you also include the category path in product URLs, the canonical URL remains domain-name/product-url-key. However, the product can also be accessed using its full URL, which includes the category. For example, if the product URL key is driven-backpack, and is assigned to the Gear > Bags category, the product can be accessed using either URL.
:x: Actual result: A product link: https://{domain}/mycategory/myproduct.html
:white_check_mark: Confirmed by @engcom-Delta
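The mismatch confirmed above can also be checked mechanically. The following is only a minimal sketch, not part of Magento: it assumes you already have the sitemap XML and the HTML of each listed page as strings, and that the canonical URL is published as a `<link rel="canonical">` element. The function names, the `example.test` domain, and the sample data are all hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_locs(sitemap_xml):
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS) if loc.text]


class CanonicalParser(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        if (attr.get("rel") or "").lower() == "canonical":
            self.canonical = attr.get("href")


def canonical_of(page_html):
    """Return the canonical URL declared in a page, or None if there is none."""
    parser = CanonicalParser()
    parser.feed(page_html)
    return parser.canonical


def non_canonical_entries(sitemap_xml, pages):
    """pages maps URL -> HTML. Return (sitemap URL, canonical URL) pairs that differ."""
    mismatches = []
    for url in sitemap_locs(sitemap_xml):
        canonical = canonical_of(pages.get(url, ""))
        if canonical and canonical != url:
            mismatches.append((url, canonical))
    return mismatches


# Hypothetical sample data mirroring the testing scenario above.
SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.test/mycategory/myproduct.html</loc></url>
</urlset>"""
PAGES = {
    "https://example.test/mycategory/myproduct.html":
        '<html><head><link rel="canonical" '
        'href="https://example.test/myproduct.html"/></head></html>',
}

if __name__ == "__main__":
    for listed, canonical in non_canonical_entries(SITEMAP, PAGES):
        print(f"sitemap lists {listed} but the page declares {canonical}")
```

An empty result would mean every sitemap entry already matches the canonical URL its page declares, which is the expected result of the First problem.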
Thank you for verifying the issue. Based on the provided information internal tickets MC-22204 were created
Issue Available: @engcom-Delta, You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself.
In addition:
Not sure about the Second problem from the main description, as setting Generate "category/product" URL Rewrites to No removes the category/product URL rewrites and leaves separate URL keys for categories and products:
Turning off automatic generation of category/products URL rewrites results in permanent removal of all existing category/product type URL rewrites, which cannot be restored. This could potentially cause unresolved category/product type URL conflicts that will require a manual update of the URL key to resolve.
https://docs.magento.com/m2/ce/user_guide/marketing/url-redirect-product-automatic.html
https://docs.magento.com/m2/ce/user_guide/catalog/catalog-urls.html
For this reason the Help Wanted label is added
@engcom-Delta, thanks for testing!
As for the second problem, have you tested that as well? Those URLs with category paths in them still work with that setting disabled. They just aren't stored in the url_rewrite table; they are calculated on the fly on the frontend.
Hi. I'm facing the same issues with 2.2. Any idea when this could be fixed?
Hi @hostep. Thank you for your report. The issue has been fixed in magento/magento2#29184 by @AntonEvers in 2.4-develop branch Related commit(s):
The fix will be available with the upcoming 2.4.3 release.
@gabrieldagama: only the first problem outlined in the description above is fixed by #29184, not the second problem
Admittedly, I shouldn't have created an issue describing 2 different problems, but I did because they were very closely related, sorry for that.
What's the best way forward? Opening a new issue describing problem 2, or re-opening this one and updating my opening post that problem 1 is already fixed and we only need a fix for problem 2?
Preconditions (*)
2.3-develop branch after https://github.com/magento/magento2/pull/23129 got merged, I've used commit 51773443eae to test this again
Steps to reproduce (*)
First problem:
Second problem:
Expected result (*)
First problem:
https://{domain}/product.html
Second problem:
https://{domain}/category/product.html
Actual result (*)
First problem:
https://{domain}/category/product.html
Second problem:
https://{domain}/product.html
Discussion
This is a follow up ticket from my comments in https://github.com/magento/magento2/pull/23129 after it got (incorrectly?) approved and merged
Please first verify if my claims are correct before starting to work on this. I'm not an SEO expert, I just base this on how I assume search engines work when indexing a shop.
First problem
As far as I know, the only purpose of the sitemap.xml files is to help search engines more easily find all important links when indexing a webshop. In this case, where Use Canonical Link Meta Tag For Products is set to Yes, we already indicate on product detail pages, using a <link rel="canonical"> tag, that search engines have to use the canonical URL (which doesn't contain the category path). But after https://github.com/magento/magento2/pull/23129 got merged, the sitemap files contain product URLs with the category path and not the canonical URLs. So search engines will first find the non-canonical URL, then look at the source code of that page, find the canonical URL, visit that URL, and remove the first page they found in the sitemap file from their index. This means search engines make at least 2 requests for every product in your shop, which might cause more traffic/load than needed.
Second problem
Magento added a new option Generate "category/product" URL Rewrites (via MC-4244) which is enabled by default. When you disable it, Magento no longer stores category paths for product URLs in the url_rewrite table in the database, but still allows you to see URLs with the category path on the frontend, as those URLs are then built dynamically. Here it is expected that the sitemap files include the links seen on the frontend even if they don't exist in the url_rewrite table, but the sitemap now uses the canonical links instead of the full URL including the category path. This was what https://github.com/magento/magento2/pull/23129 was supposed to fix, but this new option was not taken into consideration.
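To make the two expectations concrete, here is a rough sketch of the decision this issue argues for. It is not Magento code: the function, its parameter names, and the `example.test` domain are hypothetical. In particular, `use_category_path` is an assumed stand-in for whatever setting makes the storefront show category paths in product URLs.

```python
def expected_sitemap_url(base_url, product_key, category_path, *,
                         use_canonical_tag, use_category_path,
                         generate_category_product_rewrites, suffix=".html"):
    """Return the product URL this issue expects to find in the sitemap.

    generate_category_product_rewrites is accepted only to make a point:
    it controls whether category/product paths are stored in url_rewrite,
    and it is deliberately ignored here, because the storefront still
    serves those URLs when they are built dynamically.
    """
    base = base_url.rstrip("/")
    # First problem: when pages declare a canonical URL, list the canonical one.
    if use_canonical_tag or not use_category_path or not category_path:
        return f"{base}/{product_key}{suffix}"
    # Second problem: otherwise list the URL visitors actually see on the frontend.
    return f"{base}/{category_path}/{product_key}{suffix}"


if __name__ == "__main__":
    # First problem: canonical tag enabled.
    print(expected_sitemap_url(
        "https://example.test", "product", "category",
        use_canonical_tag=True, use_category_path=True,
        generate_category_product_rewrites=True))
    # Second problem: canonical tag disabled, rewrites not stored.
    print(expected_sitemap_url(
        "https://example.test", "product", "category",
        use_canonical_tag=False, use_category_path=True,
        generate_category_product_rewrites=False))
```

The point of the sketch is that the rewrite-storage option should not influence which URL ends up in the sitemap; only the canonical setting and the category-path setting should.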