magento / magento2

Prior to making any Submission(s), you must sign an Adobe Contributor License Agreement, available here at: https://opensource.adobe.com/cla.html. All Submissions you make to Adobe Inc. and its affiliates, assigns and subsidiaries (collectively “Adobe”) are subject to the terms of the Adobe Contributor License Agreement.
http://www.magento.com
Open Software License 3.0
11.54k stars 9.32k forks source link

getRewriteByProductStore does not return product URL rewrite #23357

Open daniel-ifrim opened 5 years ago

daniel-ifrim commented 5 years ago

Preconditions (*)

  1. Magento 2.2.7 but I think it's on all version Magento 2.3-develop branch too
  2. MySQL 5.7.26

Steps to reproduce (*)

  1. Not much to test unless you make a code construct like this:
    /* @var \Magento\Catalog\Model\ResourceModel\Url $this->catalogUrl */
    $this->catalogUrl->getRewriteByProductStore([$productId => $storeId])

Expected result (*)

  1. It should return an array with product slug like this:
    [4256] => Array
        (
            [store_id] => 9999
            [visibility] => 4
            [url_rewrite] => lorem-ipsum.html
        )

    Actual result (*)

  2. It returns product category URL in some cases:
    [4256] => Array
        (
            [store_id] => 9999
            [visibility] => 4
            [url_rewrite] => test-category/lorem-ipsum.html
        )

It happens because the SQL query from \Magento\Catalog\Model\ResourceModel\Url::getRewriteByProductStore doesn't look good.

This is wrong:

)->joinLeft(
    ['r' => $this->getTable('catalog_url_rewrite_product_category')],
    'u.url_rewrite_id = r.url_rewrite_id AND r.category_id is NULL',
    []
);

To get rows from table url_rewrite not in left joined table catalog_url_rewrite_product_category we should remove AND r.category_id is NULL from left join and add it in where as:

AND r.url_rewrite_id IS NULL

where r is the alias of table url_rewrite.

Leaving AND r.category_id is NULL' in join condition will create a result with more than one row. I get all rows for that specific product including rows with product category request_path.

Further down in function getRewriteByProductStore it adds to return variable the last row for a product:

$rowSet = $connection->fetchAll($select, $bind);
foreach ($rowSet as $row) {
    $result[$row['product_id']] = [
        'store_id' => $row['store_id'],
        'visibility' => $row['visibility'],
        'url_rewrite' => $row['request_path'],
    ];
}

In my case the last row is product category request_path and is not product request_path. The product request_path is in the first row from the result set ($rowSet).

Query done in getRewriteByProductStore

SELECT `i`.`product_id`, 
       `i`.`store_id`, 
       `i`.`visibility`, 
       `u`.`request_path` 
FROM   `catalog_category_product_index_store22` AS `i` 
       LEFT JOIN `url_rewrite` AS `u` 
              ON i.product_id = u.entity_id 
                 AND i.store_id = u.store_id 
                 AND u.entity_type = "product" 
       LEFT JOIN `catalog_url_rewrite_product_category` AS `r` 
              ON u.url_rewrite_id = r.url_rewrite_id 
                 AND r.category_id IS NULL 
WHERE  (( i.product_id = 4256 
          AND i.store_id = 22 
          AND i.category_id = 2 ));

It returns 3 rows.

Query modied:

SELECT `i`.`product_id`, 
       `i`.`store_id`, 
       `i`.`visibility`, 
       `u`.`request_path` 
FROM   `catalog_category_product_index_store22` AS `i` 
       LEFT JOIN `url_rewrite` AS `u` 
              ON i.product_id = u.entity_id 
                 AND i.store_id = u.store_id 
                 AND u.entity_type = "product" 
       LEFT JOIN `catalog_url_rewrite_product_category` AS `r` 
              ON u.url_rewrite_id = r.url_rewrite_id
WHERE  (( i.product_id = 4256 
          AND i.store_id = 22 
          AND i.category_id = 2 )
          AND r.url_rewrite_id IS NULL);

It returns 1 row with product request_path.

UPDATE The SQL query should include u.redirect_type=0 also because otherwise it can take row with redirect_type=301 (permanent redirects).

This code

            $select = $connection->select()->from(
                ['i' => $this->tableMaintainer->getMainTable($storeId)],
                ['product_id', 'store_id', 'visibility']
            )->joinLeft(
                ['u' => $this->getMainTable()],
                'i.product_id = u.entity_id AND i.store_id = u.store_id'
                . ' AND u.entity_type = "' . ProductUrlRewriteGenerator::ENTITY_TYPE . '"',
                ['request_path']
            )->joinLeft(
                ['r' => $this->getTable('catalog_url_rewrite_product_category')],
                'u.url_rewrite_id = r.url_rewrite_id AND r.category_id is NULL',
                []
            );

            $bind = [];
            foreach ($productIds as $productId) {
                $catId = $this->_storeManager->getStore($storeId)->getRootCategoryId();
                $productBind = 'product_id' . $productId;
                $storeBind = 'store_id' . $storeId;
                $catBind = 'category_id' . $catId;
                $bindArray = [
                    'i.product_id = :' . $productBind,
                    'i.store_id = :' . $storeBind,
                    'i.category_id = :' . $catBind
                ];
                $cond = '(' . implode(' AND ', $bindArray) . ')';
                $bind[$productBind] = $productId;
                $bind[$storeBind] = $storeId;
                $bind[$catBind] = $catId;
                $select->orWhere($cond);
            }

            $rowSet = $connection->fetchAll($select, $bind);
            foreach ($rowSet as $row) {
                $result[$row['product_id']] = [
                    'store_id' => $row['store_id'],
                    'visibility' => $row['visibility'],
                    'url_rewrite' => $row['request_path'],
                ];
            }

should be something like:

            $select = $connection->select()->from(
                ['i' => $this->tableMaintainer->getMainTable($storeId)],
                ['product_id', 'store_id', 'visibility']
            )->joinLeft(
                ['u' => $this->getMainTable()],
                'i.product_id = u.entity_id AND i.store_id = u.store_id'
                . ' AND u.entity_type = "' . ProductUrlRewriteGenerator::ENTITY_TYPE . '"',
                ['request_path']
            )->joinLeft(
                ['r' => $this->getTable('catalog_url_rewrite_product_category')],
                'u.url_rewrite_id = r.url_rewrite_id',
                []
            );

            $bind = [];
            foreach ($productIds as $productId) {
                $catId = $this->_storeManager->getStore($storeId)->getRootCategoryId();
                $productBind = 'product_id' . $productId;
                $storeBind = 'store_id' . $storeId;
                $catBind = 'category_id' . $catId;
                $bindArray = [
                    'i.product_id = :' . $productBind,
                    'i.store_id = :' . $storeBind,
                    'i.category_id = :' . $catBind
                ];
                $cond = '(' . implode(' AND ', $bindArray) . ' AND (r.url_rewrite_id IS NULL)) AND u.redirect_type = 0';
                $bind[$productBind] = $productId;
                $bind[$storeBind] = $storeId;
                $bind[$catBind] = $catId;
                $select->orWhere($cond);
            }

            $rowSet = $connection->fetchAll($select, $bind);
            foreach ($rowSet as $row) {
                $result[$row['product_id']] = [
                    'store_id' => $row['store_id'],
                    'visibility' => $row['visibility'],
                    'url_rewrite' => $row['request_path'],
                ];
            }

Diff (don't copy paste, white-spaces will mess the diff):

Index: ../vendor/magento/module-catalog/Model/ResourceModel/Url.php
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
--- ../vendor/magento/module-catalog/Model/ResourceModel/Url.php    (date 1567697855000)
+++ ../vendor/magento/module-catalog/Model/ResourceModel/Url.php    (date 1567697855000)
@@ -680,7 +680,7 @@
                 ['request_path']
             )->joinLeft(
                 ['r' => $this->getTable('catalog_url_rewrite_product_category')],
-                'u.url_rewrite_id = r.url_rewrite_id AND r.category_id is NULL',
+                'u.url_rewrite_id = r.url_rewrite_id',
                 []
             );

@@ -695,7 +695,7 @@
                     'i.store_id = :' . $storeBind,
                     'i.category_id = :' . $catBind
                 ];
-                $cond = '(' . implode(' AND ', $bindArray) . ')';
+                $cond = '(' . implode(' AND ', $bindArray) . ' AND (r.url_rewrite_id IS NULL)) AND u.redirect_type = 0';
                 $bind[$productBind] = $productId;
                 $bind[$storeBind] = $storeId;
                 $bind[$catBind] = $catId;
m2-assistant[bot] commented 5 years ago

Hi @daniel-ifrim. Thank you for your report. To help us process this issue please make sure that you provided the following information:

Please make sure that the issue is reproducible on the vanilla Magento instance following Steps to reproduce. To deploy vanilla Magento instance on our environment, please, add a comment to the issue:

@magento give me 2.3-develop instance - upcoming 2.3.x release

For more details, please, review the Magento Contributor Assistant documentation.

@daniel-ifrim do you confirm that you were able to reproduce the issue on vanilla Magento instance following steps to reproduce?


daniel-ifrim commented 5 years ago

@magento-engcom-team This is a API bug related to product URLs. Can someone take a look on it ?

hostep commented 5 years ago

@daniel-ifrim: could you edit your initial post and change the actual results, because they are the same as the expected results. Thanks!

Other than that, if I test your queries over here on some random shop I picked, the first query returns 16 rows, and the second query 3 rows, of which 2 rows still contain category parts in the url. That's probably because they are of redirect_type '301' over here:

Screenshot 2019-07-22 at 20 00 02

The way to only get a single row where no category's are in the url is by adding AND u.metadata IS NULL. Does this make sense?

daniel-ifrim commented 5 years ago

@hostep

Thank you

hostep commented 5 years ago

As you can tell by the screenshot I posted above, your solution still returns more than one row. I do agree that it should only return a single row. So, would adding a WHERE statement on the metadata column be the solution here you think?

m2-assistant[bot] commented 5 years ago

Hi @engcom-Delta. Thank you for working on this issue. In order to make sure that issue has enough information and ready for development, please read and check the following instruction: :point_down:

magento-engcom-team commented 5 years ago

:white_check_mark: Confirmed by @engcom-Delta Thank you for verifying the issue. Based on the provided information internal tickets MC-18469 were created

Issue Available: @engcom-Delta, You will be automatically unassigned. Contributors/Maintainers can claim this issue to continue. To reclaim and continue work, reassign the ticket to yourself.

daniel-ifrim commented 5 years ago

@hostep Adding WHERE metadata IS NULL is a light fix. Someone can add data in the future in metadata column. There will be a bug. I just run the 2 SQL queries on a store, MySQL server 5.7.26. I can run same type of SQL queries for test on Mariadb also if you need me to.

These are the results on MySQL. I don't want to enforce this solution. Anyone that will craft a patch for this issue will have to check he's or her's fix.

On left side there is original SQL query. On right side there is the modified SQL query. compare-left-original-right-changed

The original SQL query executed: SQL-original

The modified SQL query executed: SQL-changed

@hostep Are you sure you didn't make a typo when you were testing ?

Thank you

hostep commented 5 years ago

I just have different data in my url_rewrite table, so we would need to find a solution which works with all kinds of data which can exists in that table. I'll see if I can find some steps to reproduce so I can show how that particular data got in that table. But that might take me a couple of days (bit busy at the moment).

daniel-ifrim commented 5 years ago

A basic example of join on table to get the rows in the first table that are not in the second table. This will demonstrate the bug.

mysql> select * from person;
+----+-------+
| id | name  |
+----+-------+
|  1 | John  |
|  2 | Maria |
+----+-------+

mysql> select * from address;
+----+-----------+--------------------+
| id | person_id | street             |
+----+-----------+--------------------+
|  1 |         2 | London Main Street |
+----+-----------+--------------------+

mysql> # FOLLOWING QUERY IS BUGGED LIKE IN vendor/magento/module-catalog/Model/ResourceModel/Url.php
mysql> SELECT p.id as 'person_id', p.name as 'name', a.id as 'address_id', a.street as 'street' FROM person as p
    -> LEFT JOIN address as a
    ->     ON p.id = a.person_id AND a.street IS NULL;
+-----------+-------+------------+--------+
| person_id | name  | address_id | street |
+-----------+-------+------------+--------+
|         1 | John  |       NULL | NULL   |
|         2 | Maria |       NULL | NULL   |
+-----------+-------+------------+--------+

mysql> # This is good. Add the primary key field in WHERE with IS NULL
mysql> # Only John does not have an address
mysql> SELECT p.id as 'person_id', p.name as 'name', a.id as 'address_id', a.street as 'street' FROM person as p
    -> LEFT JOIN address as a
    ->     ON p.id = a.person_id
    -> WHERE a.person_id IS NULL;
+-----------+------+------------+--------+
| person_id | name | address_id | street |
+-----------+------+------------+--------+
|         1 | John |       NULL | NULL   |
+-----------+------+------------+--------+

MySQL dump of the 2 example tables.

/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8 */;
/*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
/*!40103 SET TIME_ZONE='+00:00' */;
/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
/*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
/*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
/*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;

--
-- Table structure for table `person`
--

DROP TABLE IF EXISTS `person`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `person` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `person`
--

LOCK TABLES `person` WRITE;
/*!40000 ALTER TABLE `person` DISABLE KEYS */;
INSERT INTO `person` VALUES (1,'John'),(2,'Maria');
/*!40000 ALTER TABLE `person` ENABLE KEYS */;
UNLOCK TABLES;

--
-- Table structure for table `address`
--

DROP TABLE IF EXISTS `address`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `address` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `person_id` int(11) NOT NULL,
  `street` varchar(255) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8;
/*!40101 SET character_set_client = @saved_cs_client */;

--
-- Dumping data for table `address`
--

LOCK TABLES `address` WRITE;
/*!40000 ALTER TABLE `address` DISABLE KEYS */;
INSERT INTO `address` VALUES (1,2,'London Main Street');
/*!40000 ALTER TABLE `address` ENABLE KEYS */;
UNLOCK TABLES;
/*!40103 SET TIME_ZONE=@OLD_TIME_ZONE */;

/*!40101 SET SQL_MODE=@OLD_SQL_MODE */;
/*!40014 SET FOREIGN_KEY_CHECKS=@OLD_FOREIGN_KEY_CHECKS */;
/*!40014 SET UNIQUE_CHECKS=@OLD_UNIQUE_CHECKS */;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
/*!40111 SET SQL_NOTES=@OLD_SQL_NOTES */;

Thank you

CC @hostep

daniel-ifrim commented 5 years ago

@hostep We should add AND u.redirect_type=0 too as a WHERE condition. I updated at the end the description of this task.

More than this maybe SQL query performance should be rechecked.

Thank you