magento / magento2

Prior to making any Submission(s), you must sign an Adobe Contributor License Agreement, available here at: https://opensource.adobe.com/cla.html. All Submissions you make to Adobe Inc. and its affiliates, assigns and subsidiaries (collectively “Adobe”) are subject to the terms of the Adobe Contributor License Agreement.
http://www.magento.com
Open Software License 3.0
11.56k stars 9.32k forks source link

New issues with product url's in sitemap.xml generation after PR 23129 got merged. #23363

Closed hostep closed 3 years ago

hostep commented 5 years ago

Preconditions (*)

  1. Magento 2.3-develop branch after https://github.com/magento/magento2/pull/23129 got merged, I've used commit 51773443eae to test this again

Steps to reproduce (*)

  1. Create a category with url key 'category'
  2. Create a product with url key 'product' and assign it to the category you created in step 1
  3. Go to Stores > Configuration > Catalog > Catalog > Search Engine Optimization

First problem:

  1. Make sure the following settings are setup:
    • Use Categories Path for Product URLs: Yes
    • Generate "category/product" URL Rewrites: Yes
    • Use Canonical Link Meta Tag For Products: Yes
  2. Flush caches
  3. In Marketing > SEO & Search > Site Map, create a sitemap and look at the generated file.

Second problem:

  1. Make sure the following settings are setup:
    • Use Categories Path for Product URLs: Yes
    • Generate "category/product" URL Rewrites: No
    • Use Canonical Link Meta Tag For Products: No
  2. Flush caches
  3. In Marketing > SEO & Search > Site Map, create a sitemap and look at the generated file.

Expected result (*)

First problem:

  1. A product link: https://{domain}/product.html

Second problem:

  1. A product link: https://{domain}/category/product.html

Actual result (*)

First problem:

  1. A product link: https://{domain}/category/product.html

Second problem:

  1. A product link: https://{domain}/product.html

Discussion

This is a follow up ticket from my comments in https://github.com/magento/magento2/pull/23129 after it got (incorrectly?) approved and merged

Please first verify if my claims are correct before starting to work on this. I'm not an SEO expert, I just base this on how I assume search engines work when indexing a shop.

First problem

As far as I know, the sitemap.xml files' only purpose is for search engines to use as they can then more easily find all important links to index a webshop. In this case, where Use Canonical Link Meta Tag For Products is set to Yes, we already indicate on product detail pages that search engines have to use the canonical url (which doesn't contain the category path) using a <meta> tag. But after https://github.com/magento/magento2/pull/23129 got merged, the sitemap files contain product url's with the category path and not the canonical url's. So search engines will first find the non-canonical url, then look at the source code of that page, then find the canonical url, and then visit that url and remove the first page they found in the sitemap file from their index. This means at least 2 requests are made for every product on your shop from search engines which might cause more traffic/load then needed.

Second problem

Magento added a new option Generate "category/product" URL Rewrites (via MC-4244) which is enabled by default, but when you disable it, Magento will no longer store category paths for product url's in the url_rewrite table in the database, but still allows you to see url's with the category path on the frontend as those url's are then build dynamically. Here it is expected that the sitemap files include the links being seen on the frontend even if they don't exist in the url_rewrite table, but the sitemap now uses the canonical links instead of the full url including the category path. This was what https://github.com/magento/magento2/pull/23129 was suppose to fix, but this new option was not taken into consideration.

m2-assistant[bot] commented 5 years ago

Hi @hostep. Thank you for your report. To help us process this issue please make sure that you provided the following information:

Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:

@magento give me 2.3-develop instance - upcoming 2.3.x release

For more details, please, review the Magento Contributor Assistant documentation.

@hostep do you confirm that you were able to reproduce the issue on vanilla Magento instance following steps to reproduce?


m2-assistant[bot] commented 5 years ago

Hi @shikhamis11. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:

piotrekkaminski commented 5 years ago

hey @hostep do you have any supporting docs or more people that have the same point of view on this one? I'm a little bit confused what is the right behavior in this case and we are somewhat reluctant to fiddle with SEO related stuff.

hostep commented 5 years ago

Hi @piotrekkaminski

The following is only for the First problem mentioned above.

As usual with SEO related stuff, if you start searching the web, you'll find all kind of opinions, one says it should be this way, then another says the complete opposite and then yet another one says it doesn't really matter at all. And then you always have the problem of dealing with outdated information, since search engines can change their algorithms from time to time.

As for some links which might plead for the case I'm defending:

But some extra input from other people who know a lot more about actual up-to-date SEO things then me are very welcome to give extra input here 🙂

As for:

I'm a little bit confused what is the right behavior in this case and we are somewhat reluctant to fiddle with SEO related stuff.

Well, PR https://github.com/magento/magento2/pull/23129 got accepted without anyone but me complaining about possible SEO problems being introduced 😉

m2-assistant[bot] commented 5 years ago

Hi @engcom-Delta. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:

engcom-Delta commented 5 years ago

Confirmation issue is based on devdocs Canonical Meta Tag article and related to First problem from main description Testing scenario:

#23363issue_config

:heavy_check_mark: Expected result: A product link: https://{domain}/myproduct.html

If you also include the category path in product URLs, the canonical URL remains domain-name/product-url-key. However, the product can also be accessed using its full URL, which includes the category. For example, if the product URL key is driven-backpack, and is assigned to the Gear > Bags category, the product can be accessed using either URL.

:x: Actual result: A product link: https://{domain}/mycategory/myproduct.html

#23363issue _result

magento-engcom-team commented 5 years ago

:white_check_mark: Confirmed by @engcom-Delta Thank you for verifying the issue. Based on the provided information internal tickets MC-22204 were created

Issue Available: @engcom-Delta, You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself.

engcom-Delta commented 5 years ago

In addition: Not sure about Second problem from main description as Generate "category/product" URL Rewrites: No remove category/product URL rewrites and left separate url keys for categories and products

Turning off automatic generation of category/products URL rewrites results in permanent removal of all existing category/product type URL rewrites, which cannot be restored. This could potentially cause unresolved category/product type URL conflicts that will require a manual update of the URL key to resolve. https://docs.magento.com/m2/ce/user_guide/marketing/url-redirect-product-automatic.html https://docs.magento.com/m2/ce/user_guide/catalog/catalog-urls.html

For this reason Help Wanted label is added

hostep commented 5 years ago

@engcom-Delta, thanks for testing!

As for the second problem, have you tested that as well? Those url's with category paths in them are still working with that setting disabled. They just aren't getting stored in the url_rewrite table, they are calculated on the fly on the frontend.

m-vd-meer commented 4 years ago

Hi. I'm facing same issues with 2.2. Any idea when this could be fixed?

gabrieldagama commented 3 years ago

Hi @hostep. Thank you for your report. The issue has been fixed in magento/magento2#29184 by @AntonEvers in 2.4-develop branch Related commit(s):

The fix will be available with the upcoming 2.4.3 release.

hostep commented 3 years ago

@gabrieldagama: only the first problem outlined in the description above is fixed by #29184, not the second problem

Admittedly, I shouldn't have created an issue describing 2 different problems, but I did because they were very closely related, sorry for that.

What's the best way forward? Opening a new issue describing problem 2, or re-opening this one and updating my opening post that problem 1 is already fixed and we only need a fix for problem 2?