reverbdotcom / reverb-magento

Magento 1.x plugin for syncing with Reverb
Other
7 stars 10 forks source link

Migration to backfill category UUIDs #223

Closed skwp closed 8 years ago

skwp commented 8 years ago

Reverb is changing its category system to be UUID based. This will enable us to change category names and slugs without affecting customer category mappings. The API will update as follows:

Before: product_type_slug:"foo" categories:["something","something-else"]

After: category_uuids: ["1234-asdf-1234-1324"]

In order to support this change, we'll need to do a migration to attach a UUID to every category that's in the database. A CSV file will be supplied that will look like this:

original_product_type_slug, original_category_slug, uuid electric-guitars, 12-string, 1234-5678-1234-5678

This file should become the source of truth for magento categories. Therefore:

  1. Change code that ties magento categories to reverb categories in the database to do it by UUID to tie them to the text file.
  2. Provide a migration so that the mappings already in place move to point to the text file by looking up each mapping via product_type_slug and category_slug in the file and adding the UUID to the database.
dunagan5887 commented 8 years ago

Currently the following database schemas exist for the categories:

Table reverb_categories:
'reverb_category_id', 'int(10) unsigned', 'NO', 'PRI', NULL, 'auto_increment'
'name', 'text', 'NO', '', NULL, ''
'description', 'text', 'NO', '', NULL, ''
'reverb_product_type_slug', 'varchar(50)', 'YES', 'MUL', NULL, ''
'reverb_category_slug', 'varchar(50)', 'YES', 'MUL', NULL, ''
Table reverb_magento_categories:
'xref_id', 'int(10) unsigned', 'NO', 'PRI', NULL, 'auto_increment'
'magento_category_id', 'int(10) unsigned', 'NO', 'UNI', NULL, ''
'reverb_category_id', 'int(10) unsigned', 'NO', 'MUL', NULL, ''  References reverb_categories.reverb_category_id

It seems to me that all we would need to do is add a uuid column to table reverb_categories. When making the listing API call to Reverb, we would just send the category_uuids field to Reverb instead of the slug fields. Let me know if this sounds right to you

skwp commented 8 years ago

Yes that sounds correct. The backfill would have to simply match on the category and product type slug and assign a UUID. If there are any categories in the backfill that are already not in the db, we should create records for them.

skwp commented 8 years ago

I will attach the file here as soon as we have it

skwp commented 8 years ago

Actually there is one further thing: instead of reading from the database at all, would it be possible to have the front end with the mapping read directly from the text file? I want to ship the text file with the codebase and that be the source of truth, instead of the database table. This would make it easier to ship new category changes without dealing with database migrations in the future.

dunagan5887 commented 8 years ago

Technically that would be possible, but the performance would be worse. It would also look questionable when developers are doing code reviews of the extension.

zztimur commented 8 years ago

Yan,

When exactly is the switch? will the legacy API be available for some time?

We can add a setting in the reverb config for legacy API.

Thanks!

On Mar 8, 2016, at 1:08 PM, Sean Dunagan notifications@github.com wrote:

Technically that would be possible, but the performance would be worse. It would also look questionable when developers are doing code reviews of the extension.

— Reply to this email directly or view it on GitHub https://github.com/reverbdotcom/reverb-magento/issues/223#issuecomment-193920523.

skwp commented 8 years ago

Category UUID migration (integration side)

skwp commented 8 years ago

We are allowing UUIDs immediately but won't deprecate the legacy API for a long time. The only thing is we will be renaming some categories which might cause some breakage for those particular subcategories; or we will provide some legacy mapping on our end.

We don't need a setting, we should just migrate the code going forward in the next version

Here is the file with the mappings: https://s3.amazonaws.com/uploads.hipchat.com/28943/2968040/h2NTWVw2nc7LbBR/flatten.csv

skwp commented 8 years ago

you can ignore the last "relationship" column there, that's for the hierarchical structure

skwp commented 8 years ago

Here's the new file with all the correct info. Had to zip it so that github would take it

categories.csv.zip

cc @dunagan5887

dunagan5887 commented 8 years ago

Yan,

Could you provide a version of the file with field enclosures? There are some rows which aren't being csv parsed correctly because there are commas in the description. If not I can go through and set them by hand.

@skwp

skwp commented 8 years ago

yep i'll fix that

skwp commented 8 years ago

categories.csv.zip

dunagan5887 commented 8 years ago

@skwp Currently if a product is mapped to multiple Reverb categories, the system looks for a subcategory; it determines this by whether the category has a reverb_product_type_slug. Moving forward, are Reverb categories still going to have reverb_product_type_slug values assigned and returned from the API?

skwp commented 8 years ago

Good question. I think we would want to label subcategories in that file. Probably the best way would be to give the id of the parent. That way anything without a parent is the root and a subcategory should always be preferred. Does that make sense?

Sent from my iPhone

On Mar 21, 2016, at 6:20 PM, Sean Dunagan notifications@github.com wrote:

@skwp Currently if a product is mapped to multiple Reverb categories, the system looks for a subcategory; it determines this by whether the category has a reverb_product_type_slug. Moving forward, are Reverb categories still going to have reverb_product_type_slug values assigned and returned from the API?

— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub

skwp commented 8 years ago

Here's the new file with a parent_uuid field. This way if something doesn't have a parent_uuid we know it's a root category categories_with_parents.csv.zip

dunagan5887 commented 8 years ago

That imported cleanly. In terms of finding the "deepest" Reverb category, should I traverse parent uuid's to determine if a Reverb category is 2nd or 3rd level? Or should I keep it simple and just choose an arbitrary subcategory?

dunagan5887 commented 8 years ago

@skwp Just to confirm, if a category's slugs are not listed in the incoming uuid file, then that category should be assumed to no longer be valid and should be deleted from the database, correct?

dunagan5887 commented 8 years ago

I'm also going to drop the legacy category xref table

skwp commented 8 years ago

Yep correct

Sent from my iPhone

On Mar 22, 2016, at 12:46 PM, Sean Dunagan notifications@github.com wrote:

I'm also going to drop the legacy category xref table

— You are receiving this because you were mentioned. Reply to this email directly or view it on GitHub

dunagan5887 commented 8 years ago

As of commit fe044dfd7f3267b98ed5c9a52465772fcdf72823 the category mapping is tested and ready to go in Magento CE 1.9. Going to test now in 1.7

dunagan5887 commented 8 years ago

Just tested on Magento CE 1.7

dunagan5887 commented 8 years ago

Currentl waiting for the Reverb API

skwp commented 8 years ago

@dunagan5887 is it simpler if you have a json api for the same data? i'm having a little trouble with the csv version right now but I'm sure we can figure that out

dunagan5887 commented 8 years ago

@skwp I can make either work, json will take some additional work since the CSV parsing functionality is already built out. Shouldn't be much more than a few hours to code and test

skwp commented 8 years ago

@dunagan5887 we decided we're not going to do the CSV endpont; sorry we'll have throw away the csv parsing stuff. The rest of the logic remains the same. Here's the flat endpoint with the same data as the csv, but in json: https://reverb.com/api/categories/flat