Closed michaeljdietz closed 4 years ago
Hi, @michaeljdietz thanks for creating this issue.
Some time ago, when we were starting MSI development, we discussed this issue and the consequences it could bring. We also asked the community which business cases could lead to an update of the natural identity (Product SKU), and no case came up that would occur on a day-to-day basis. We were given several examples, such as a company with its own SKU pattern being acquired by another company with a different pattern, so that existing SKUs have to be adjusted. But this is a one-time operation, and there is no sense in introducing a service layer for it; a low-level tool that could be run via CLI as part of a migration process would fit better.
Thus, in the desired state of Magento it should not be possible to modify a SKU after the product has been created. This decision is dictated by the business standpoint, not the development one. So we want to disable SKU modification both in the admin panel UI and via the ProductRepository API, which allows product entity modification.
Moreover, even now without MSI, modifying a SKU when there is an ERP integration would lead to data inconsistency, because the Product ID is meaningless to the external system (it's a purely internal identity which should not be exposed outside of Magento).
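To make the idea of a low-level migration tool concrete, here is a minimal Python sketch. It is not Magento code: it uses SQLite in place of MySQL, the `SKU_COLUMNS` list and the `rename_sku` and `build_demo_db` helpers are hypothetical, and the table columns are simplified assumptions. It only shows the shape such a tool might take, which is updating every table that stores the SKU inside one transaction so a failed rename leaves nothing half-changed.

```python
import sqlite3

# Assumption: tables and columns that store a SKU. This list is illustrative,
# not a complete inventory of the Magento schema.
SKU_COLUMNS = [
    ("inventory_source_item", "sku"),
    ("inventory_reservation", "sku"),
    ("catalog_product_entity", "sku"),
]


def rename_sku(conn, old_sku, new_sku):
    """Rename a SKU in every listed table inside a single transaction.

    Returns a dict of table name -> number of rows updated.
    """
    changed = {}
    # The connection context manager commits if the block succeeds and
    # rolls back if any statement raises, so a partial rename is never kept.
    with conn:
        for table, column in SKU_COLUMNS:
            cursor = conn.execute(
                f"UPDATE {table} SET {column} = ? WHERE {column} = ?",
                (new_sku, old_sku),
            )
            changed[table] = cursor.rowcount
    return changed


def build_demo_db():
    """Create a tiny in-memory database with simplified, hypothetical columns."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE catalog_product_entity (
            entity_id INTEGER PRIMARY KEY, sku TEXT NOT NULL UNIQUE);
        CREATE TABLE inventory_source_item (
            source_item_id INTEGER PRIMARY KEY, sku TEXT NOT NULL, quantity REAL);
        CREATE TABLE inventory_reservation (
            reservation_id INTEGER PRIMARY KEY, sku TEXT NOT NULL, quantity REAL);
        INSERT INTO catalog_product_entity (sku) VALUES ('WIDGET'), ('GADGET');
        INSERT INTO inventory_source_item (sku, quantity) VALUES ('WIDGET', 10);
        INSERT INTO inventory_reservation (sku, quantity) VALUES ('WIDGET', -2);
    """)
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print(rename_sku(demo, "WIDGET", "WIDGET-X"))
```

In this sketch the catalog table is updated last, so a collision with an existing SKU raises `sqlite3.IntegrityError` and the earlier inventory updates are rolled back with it.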
Igor,
While I appreciate your detailed response, I have to disagree wholeheartedly, and from personal experience no less. I cannot derive any benefit of this decision from your explanation. Moreover, I have worked at great length in systems that similarly used business-layer data for relation and identity purposes, and I can confirm that the decision to do so was ALWAYS fraught with data integrity issues. Almost always these issues arose from user activities the developer did not or could not anticipate, which is why much of the database schema design community considers it such a bad decision in the first place. At a bare minimum it increases the administrative costs of a business, as a DBA becomes an absolute necessity regardless of the size of the database in order to maintain control over how data is manipulated and prevent an integrity nightmare.
I must say, as a dedicated Enterprise/Commerce developer, development decisions like these make me hesitant to continue to dedicate personal and professional resources to the platform and make me hesitant to continue to evangelize its use for business clients and employers.
I have yet to hear a good reason for such a design decision, and would appreciate some kind of explanation beyond your initial response. Why was SKU used instead of product ID given that the rest of the framework joins on product ID and given that surrogate keys are a well established best practice except in cases of extreme storage / performance / record count requirements (of which this is not the case as SKU is significantly more storage / memory hungry than a 4-byte integer)...
Thank you again for your time. The Magento platform is very important to me. It is distressing for me. I don't enjoy feeling that something I have put considerable time into may be headed down a path I consider unsustainable and doomed to fail, and I am not intending to attack you or the rest of the development team in any way.
Sincerely,
A fellow but concerned developer,
Michael Dietz michael.dietz@gmail.com
Hi Michael,
Thanks for your response. It's a pleasure to read feedback from people who care about Magento and are trying to prevent it from heading down a destructive path.
So, let's postpone the discussion of which is more efficient as a MySQL primary key, an integer or a string, and consider what makes sense for the business: does a surrogate, database-generated key such as Product ID mean anything to a merchant? Domain Driven Design (DDD) teaches us that we have to shape the Ubiquitous Language, which all stakeholders should understand and share. First of all, this language will impact the public APIs (Service Contracts) and how the entities from different domains (bounded contexts) communicate with each other.
For a long time, Magento has been a monolithic system with high coupling between internal components. Such an architecture has both pros and cons. On the one hand, it is much easier to achieve good performance, because any module can integrate by means of DB queries and direct MySQL joins, which are fast compared with doing the same integration at the application code level. On the other hand, this limits the scalability and substitutability of components.
Magento 2 aims to be a scalable system, and many steps have been taken in this direction. The latest announcement regarding Components Isolation (https://github.com/antonkril/architecture/blob/fdcfafe1d6187f9abaf18af9eec7c10e8067a637/design-documents/service-isolation.md) is a good example of that.
The more isolated components are, the fewer dependencies they have on each other's DB schema. Ideally, the DB schema should not be considered an API at all, but a private, encapsulated implementation of a particular component.
MSI is a pioneering project that is loosely coupled to other parts of the Magento core. All its dependencies are based on application service contracts; MSI does not extend existing MySQL queries by joining its own tables to add customization logic.
And if you consider Magento not as a monolith but as a system that consists of different domains, you will see the difference between Catalog and Inventory, and that the Product entity can be represented differently in each of these domains. It can be fine that these systems don't have strong integrity between each other: for example, a product may not exist in Catalog while Inventory still holds a record for it because it is stored in a warehouse.
Also, natural identities, if they are real identities and not just attributes, are non-modifiable. Thus it's hard to provide valid business cases in which a merchant may need to modify a SKU.
I recommend watching a presentation by Riccardo Tempesta about DDD in MSI and naming things - https://www.youtube.com/watch?v=_g8PWNcfNug&list=PLfKMcVfy6Vko8LG2OJu5gCJTjP86OiGgD
Also read the discussion we had some time ago about Entity Identities and which scheme is better to use - https://community.magento.com/t5/Magento-DevBlog/Entity-ID-Allocation-Schemes/ba-p/68316
Igor,
Thank you for responding again at great length and taking my concern seriously. After reading your response, watching the YouTube video, and reading the discussion, I still have to disagree. I apologize for not jumping into the conversation earlier. I wasn't aware it was taking place as to be honest I haven't been very engaged in the online Magento community, simply engaged in development using the platform itself. In the future, should I continue to keep Magento as my primary development platform, I may try and become more engaged.
Allow me first to briefly qualify my experience with the non-engineering side of business. I happen to be one of the "lucky" individuals who is both an engineer and a business person. I have led the IT, accounting, payroll, inventory control, service, marketing, and sales departments of a small retail business with around 50 employees (small business = many hats), multiple brick and mortar locations, hundreds of suppliers, tens of thousands of SKUs, and an internet retail business. I have also worked as a software engineer for one of the largest internet retailers in my former industry. Currently I am a software engineer for a large manufacturing conglomerate who sells both direct to consumer and to the largest brick and mortar retailers in the United States. Although I have been a professional software engineer for 15 years, I consider myself an engineer in all aspects of my life, with any manner of system, not just when engineering software systems.
I believe that labeling SKU as the ubiquitous language identifier in this context makes a false assumption. In order to decide upon an identifier, one must first decide upon a context, as the MageConf video you linked also states. That context is dependent upon identification of potential consumers. In this case, the MAIN consumers of the inventory information are not business people or people online buying stuff, but other computer systems (the catalog and checkout modules as well as any frontend modules). These other systems are the KNOWN consumers. We also have potential UNKNOWN consumers that include any outside business computer system, for example an ERP or POS. Why would we decide to use an identifier "potentially" used by some UNKNOWN outside consumers rather than the identifier we know is used and understood by KNOWN consumers if it is to the serious detriment of the system as a whole? Why would we decide to use an identifier that is less performant, more prone to data integrity issues, and less storage friendly? I agree with designing systems so that they can be understood by stakeholders and other non-programmers as best as possible. However, I would counter that this takes the concept too far, and at the expense of some things that are REALLY important to the Magento framework's primary consumers, the businesses that use the Magento platform.
If I may make a brief example. Domain Driven Design definitely can help us create complex systems with large teams. But what if Honda, who makes the Honda Civic, a popularly consumer-modified vehicle, changed how they made their vehicles so that the PRIMARY focus was "making them easy to modify" despite the fact that the largest portion of their consumer market doesn't modify their vehicles? Imagine that Honda were to modify the Civic design in such a way that to the average consumer the vehicles now cost too much, break down too frequently, and perform significantly worse than before the design change. Although it is nice when they can label parts more neatly, lay them out more accessibly, improve the efficacy of their internal engineering processes, and improve the ability for consumers to modify their vehicles, what is MOST important is that the vehicle still functions, and functions WELL, for the largest market share.
Magento of course strives to be concerned with flexibility to all consumers. It is obvious that the Magento platform cares about the ability of developers to modify and interact with it. It is what draws so many developers to the framework in the first place. But most importantly, it has to be concerned with functionality for its primary consumers. In this specific system, the primary consumer is the other Magento systems at play, and indirectly, the human beings using Magento as either customer or retailer.
I know you stated that you don't see a frequent business use case for changing SKUs. I would argue that statement misses some really common day-to-day events. I have personally watched MULTIPLE companies overhaul their SKU systems year after year. Business strategy can be very fluid. A company one day sells only new products, and then starts selling open box and used products and decides to redo their SKU system to reflect this. A company used to only sell one color of all their products, and then starts selling multiple colors of many, forcing them to overhaul their SKU systems. A company used to only sell WIDGET, but now they sell WIDGET-X, WIDGET-Y, and WIDGET-Z, and WIDGET doesn't accurately and uniquely describe the product anymore (or WIDGET morphs to WIDGETmk1, WIDGETmk2, WIDGETmk3 for version iterations). Sure, businesses typically try and make as few changes to their existing SKUs as possible, but it is inevitable that they will change. Furthermore, businesses often do not have control over their SKUs, as a retailer can be at the mercy of thousands of suppliers if they try and use the same SKUs as their MPNs. For a business like this, if one out of a hundred suppliers changed only one SKU once per year, the retailer would still change hundreds of SKUs per year. That seems pretty day-to-day to me.
Moreover, human beings make mistakes. People create a SKU incorrectly when they don't follow a procedure properly or format the SKU incorrectly. They make a mistake and assign the wrong SKU to the wrong product. Computer systems may have bugs that create or sync SKUs incorrectly. These kinds of occurrences are daily in any business of any type. As an example, the business I work for had two completely unrelated batches of SKU changes in just the past month alone: one from a merger, another from a human error.
I am sure there are thousands more real world, every day examples of how and why SKUs change. To require a database admin in order to change a SKU makes Magento UNUSABLE for these businesses.
UNUSABLE. That is a big statement. Unfortunately, that may also be a true statement; especially if this is a new paradigm and not just an isolated mistake. If I were a business owner, I would not select a software system that had the potential for data integrity issues when changing my SKUs, my customer account numbers, my order #'s, my vendor #s, etc... and I wouldn't select a software system that forced me to lose all of my product history and recreate the product from scratch whenever I needed to change my SKUs or required me to hire a developer to handle these issues one-off as they arise. Both as an engineer and as a business person I have worked with systems like this at great length and depth. I will NEVER intentionally work with a system like that again as they are a complete data integrity nightmare. To design new observers to monitor SKU changes and update relevant tables ends up moving Magento back to a strongly coupled system requiring knowledge of which tables depend upon SKU for table relationship. Moreover to require an engineer or DBA to make these kinds of daily changes is a bit like requiring a Honda engineer present in order to drive your car. Imagine a DBA having to update 50 tables that all relate on SKU each time a SKU changes (and 500 tables that with the newly suggested paradigm may all relate on a variety of other business fields - attribute codes, groups, categories, products, customers, customer groups, orders, order lines, the list could get pretty crazy). Sure, I guess specialized tools can be written that search all tables for specific columns containing specific data and replace that data - but that now requires a new level of expertise from the world's DBAs - not to mention that kind of automation can be dangerous if it misidentifies data as being related to the original replacement task when it is not.
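As a rough illustration of the "specialized tool" concern, here is a minimal Python sketch of column-name-based discovery. It assumes SQLite rather than MySQL, and both the `tables_with_column` helper and the demo schema are hypothetical (`supplier_price_list` and the trimmed-down `sales_order` are invented for the example). It shows why matching on a column name alone is only a heuristic: a table can have a `sku` column that means something different.

```python
import sqlite3


def tables_with_column(conn, column_name):
    """Return the names of all tables that have a column called column_name."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    matches = []
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        columns = [info[1] for info in conn.execute(f'PRAGMA table_info("{table}")')]
        if column_name in columns:
            matches.append(table)
    return matches


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE catalog_product_entity (entity_id INTEGER PRIMARY KEY, sku TEXT);
    CREATE TABLE inventory_reservation (
        reservation_id INTEGER PRIMARY KEY, sku TEXT, quantity REAL);
    -- Hypothetical table whose sku column holds the supplier's own SKU.
    CREATE TABLE supplier_price_list (id INTEGER PRIMARY KEY, sku TEXT);
    CREATE TABLE sales_order (entity_id INTEGER PRIMARY KEY, increment_id TEXT);
""")

found = tables_with_column(conn, "sku")
print(found)
```

The search reports `supplier_price_list` alongside the two product tables, so a blind search-and-replace over every match would also rewrite data that was never related to the product being renamed.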
Magento was originally founded on the belief in FLEXIBILITY for the end user. To create whatever attributes you needed. To customize the frontend however you needed. It is what draws so many of us to the platform. But the moment this flexibility breaks the product is indeed a moment that requires some serious reflection.
In response to the undesirable side effects listed in the discussion post you linked, I would simply say that, similar to design patterns / development team strategies /architecture complications, all of these are development and engineering challenges, and should not be corrected at the expense of the system's functionality.
I of course hear you on the benefits of domain driven design, and the importance of CQS, encapsulation, ease of creating unit tests, etc... I agree whole heartedly that all of these concepts should drive our development strategies. But the moment our development strategy breaks major user functionality (and most importantly, the ability to trust that your data is accurate), that is the moment we need to stop and re-evaluate our development strategy. More important than making sure our customers understand how our products work, is making sure we still have customers who want to use our products to begin with.
We use SKUs that match our suppliers' SKUs so it is easy to pick products. We change SKUs on our live site for the following reasons. 1) Change of supplier: we alter SKUs to match the new supplier's SKUs. 2) Supplier changes part numbers: this has happened on a few occasions for a bulk of products from that supplier. 3) Typing mistakes when originally added.
Igor, I haven't heard from you since my last post. This appears to be a concern shared by others in the community. Is this something you guys are re-deliberating internally?
Thanks again.
Hi @michaeljdietz
Sorry for not answering you for a while. I was very busy with MSI 1.1.0 release preparation. Hopefully, it should happen during this week. As soon as the release is done - I'll get back to this discussion.
FYI, a week ago I brought this question to public discussion on Twitter - https://twitter.com/iminyaylo/status/1091736497273925632 - where you can see the results.
Hi @timpea
Great points! I just have several questions for you.
- Change of supplier, we alter SKU's to match the new suppliers SKU.
So it's not a typical day-to-day operation; you do it once, as part of the preparation when you take on a new supplier. Why would a tool that performs this transformation and guarantees data consistency not work for you?
- Supplier changes part numbers, this has happened on a few occasions for a bulk of products from that supplier.
How do you keep the data stored in Magento consistent with the data arriving from an external system in which the supplier has changed the SKU?
- Typing mistakes when originally added.
This can currently be worked around with the product cloning functionality, by fixing the SKU on the cloned product.
For me, the main question in the scope of this discussion is whether we should treat SKU exactly the same as other product attributes (and modify it accordingly), or give it some additional meaning and not let it be changed via the Service Layer, leaving the possibility of modification via dedicated tools.
Preconditions
Steps to reproduce
Expected result
Actual result
Discussion
Unfortunately, this is what happens when you relate on a business field rather than a surrogate key. The inventory_reservation and inventory_source_item tables use SKU to identify products instead of entity_id/product_id. This causes a MAJOR issue if SKU is changed in the admin interface, as the reservations are no longer associated with the product. Why are we using ANYTHING other than a surrogate key to relate data? It is a little bit alarming from a development standpoint to see this happen in the codebase. I have a hard time imagining a business case where it makes sense not to use the surrogate key, as we have been doing with the Magento framework since the early days of v1. Am I missing something? I am sure it could be fixed with an observer or plugin, but it seems risky to start introducing this kind of data integrity risk into the framework for no reward. If you would like the ability to have varying SKUs for different warehouses or something, it seems we should still be relating on a surrogate key, whether it is product_id, stock_item_id, or something else.
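The failure described above can be reproduced in miniature. The following Python sketch is an illustration under stated assumptions, not Magento code: it uses SQLite instead of MySQL, keeps only a few simplified columns of `catalog_product_entity` and `inventory_reservation`, and the `build_demo_db` and `reserved_quantity` helpers are hypothetical. It renames the SKU in the catalog table only, as an admin edit would, and then looks up the product's reservations by SKU.

```python
import sqlite3


def build_demo_db():
    """Create a simplified catalog table and a reservation table keyed by SKU."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE catalog_product_entity (
            entity_id INTEGER PRIMARY KEY, sku TEXT NOT NULL UNIQUE);
        CREATE TABLE inventory_reservation (
            reservation_id INTEGER PRIMARY KEY,
            sku TEXT NOT NULL,
            quantity REAL NOT NULL);
        INSERT INTO catalog_product_entity (entity_id, sku) VALUES (1, 'WIDGET');
        INSERT INTO inventory_reservation (sku, quantity) VALUES ('WIDGET', -2);
    """)
    return conn


def reserved_quantity(conn, entity_id):
    """Sum the reservations that match the product's current SKU."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(r.quantity), 0)
        FROM catalog_product_entity AS p
        LEFT JOIN inventory_reservation AS r ON r.sku = p.sku
        WHERE p.entity_id = ?
        """,
        (entity_id,),
    ).fetchone()
    return row[0]


conn = build_demo_db()
before = reserved_quantity(conn, 1)

# Change the SKU in the catalog only; the reservation row is left untouched.
conn.execute("UPDATE catalog_product_entity SET sku = 'WIDGET-X' WHERE entity_id = 1")
after = reserved_quantity(conn, 1)

print(before, after)
```

After the rename the reservation row still exists, but it no longer joins to the product, so the lookup finds nothing for it. Had the reservation table referenced `entity_id` instead, the join would be unaffected by the rename.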