absmach / magistrala

Industrial IoT Messaging and Device Management Platform
https://www.abstractmachines.fr/magistrala.html
Apache License 2.0
2.41k stars 664 forks source link

Use ORM to abstract SQL DB #1232

Open drasko opened 3 years ago

drasko commented 3 years ago

FEATURE REQUEST

  1. Is there an open issue addressing this request? If it does, please add a "+1" reaction to the existing issue, otherwise proceed to step 2.

  2. Describe the feature you are requesting, as well as the possible use case(s) for it. Mainflux Users, Things and other services use PostgreSQL to store users, channels and things entities and other metadata.

For the reasons of portability and bigger deployment choice, it would be good if we can use ORM to abstract DB operations - something like this: https://github.com/volatiletech/sqlboiler.

That would make interesting scenarios possible:

  1. Indicate the importance of this feature to you (must-have, should-have, nice-to-have). Nice-to-have
drasko commented 3 years ago

@chombium @nmarcetic any opinions?

dborovcanin commented 3 years ago

I'm not sure about this. We can always add multiple repository implementations. It's not as flexible as magic abstraction, but it provides better control and, since we have fairly simple database schemas, should be sufficient for our needs. We need to have in mind that multiple repository implementations will probably introduce new dependencies for corresponding database drivers.

nmarcetic commented 3 years ago

Tricky question :)

In general, I am OK with ORM's, where there is a real need to simplify queries and code (a lot of code can be simplified). I don't think we have a real need, our queries are simple, pure, consist SQL's that most people know. Adding layer of complexity with ORM brings a new learning layer with syntax and tag's magic etc...This is hell to debug. Also, this is huge work because it must be done in all services than in order to get a more generic deployment solution (pick DB engine, which I see as a bit of question should we do it. There is a good reason why we choose every piece of the stack). DB engine abstraction that you are mentioning is a really nice and good trade-off, but with Go sql/database already add's a nice abstraction layer, you just need to pass db driver. Performance issue can also be questionable, usually ORM's are slower but people says safer if you are not really sure what you are doing. I would say No, more cons then pros for now.

chombium commented 3 years ago

I agree with @dusanb94 and @nmarcetic. The ORM might bring some nice and fancy apis for working with the databases, but who knows what it has inside. On the other side we already have well defined interfaces for CRUD operations, so if someone wants to use some other database it's really easy to write a custom implementation. The only thing that we might accept as a contribution to the core project would be some kind of flexibility for the user to select which kind of db should be used, but for something like that I would wait till someone comes with a need for a different db and accept a PR. Than we can say that PostgreSQL is the default db, but there are other community maintained db implementations which can be used. I find the possibility to use different dbs nice, but that would include another external dependency hard to debug. I would also say no to that. If someone needs something like that we can give guidance about the implementation.

drasko commented 3 years ago

Agreed. Only thing that I would really like to see is a kind of abstraction around Things and Users microservices DB (so - not messaging, but metadata DBs) to replace them with SQLite for on gateway use-cases. This was my main motivation - while we provide different DBs for messaging, and it is easy to deploy them - I would really like to see on SQLite for all microservices for single-instance deployment on a modest gateway with 32MB flash. I know this is possible - just needs some work (although even PostgreSQL would work, it's not that big).

nmarcetic commented 3 years ago

@drasko Agree, PostgreSQL is not that heavy, I assume it will work nicely on GW. Again, with current impl you just pass SQLlite DB driver and everything will work. So if really have an edge case where you need SQL lite, changes are minimal, it will be very easy to replace it.

chombium commented 3 years ago

@drasko thanks for the clarification. I will take a look at the db interfaces, check if we use some PostgreSQL SQL queries and test if everything still works with another dbs (I'll start with SQLite).

bbokun commented 3 years ago

@drasko I wanted to share my experience about this. We needed to use Microsoft SQL Server as the database engine for authn, things and users, as the company has restrictions for the databases which can be used, and PostgreSQL is not supported. I managed to create the services to support this. As @nmarcetic wrote, the sql/database (or sqlx/database) has a abstraction layer which already gives the possibility to "just" use another driver, and this works nicely. Unfortunately, I had to copy/paste/change the contents in the appropriate postgres folders of the services because of the following facts, and duplicate the code in the cmd folder to create new services because of:

Beside that, the CRUD operations are the same - a "INSERT INTO something", "UPDATE something" or "DELETE FROM" is the same for all SQL Databases, and there we got code duplication which could maybe be avoided.

Maybe, usage of ORM is too heavy for the entities stored in SQL Databases as they are really tiny, but maybe some kind of abstraction could be implemented to make database connectors easier to implement.

@chombium, @dusanb94: FYI

chombium commented 3 years ago

@bbokun Thanks for sharing your experience.

I'm totally aware that using ORM will make some part of the database interactions much easier. The thing with the ORMs is that most of them support one type of databases read mostly relational databases. If we decide that we only support relational databases ORM would be the obvious way to go as most of them support multiple databases with same or similar SQL dialects or have a mechanisms to add support for new databases.

In most of the control systems there is also need for local control loops which are instructed from a central control system where a distributed database might come really handy, the new buzzword of the new kids is Edge Computing :). The data consistency and synchronization between the edge and the cloud is important in such cases and some of the databases can simply not handle that. There are also some distributed databases with really great data segmentation, synchronization and time-series capabilities which might be great things for a specific use cases.

If we decide that we want to support only relational databases, an ORM would be good, there will be some learning time to understand how it works, but than it will make the things easier.

I still think that it would be better if we have a nice defined interface(s) for communication with the database and let the users decide which database they want to use. We can publish few (2-3) implementations for different databases as part of Mainflux and let the users build theirs and publish them to some common db connectors repo (at list we should list them the db connectors somewhere in the docs).

bbokun commented 3 years ago

@chombium Thanks for Your words, and I see it the same way You see. I also think that introducing ORM would bring overhead into services which should be light.

My last sentence "but maybe some kind of abstraction could be implemented to make database connectors easier to implement." means exactly what You wrote in Your last paragraph - but You said it a much nicer way :D

Yes, definitely it would be the best way to implement it with nice defined interfaces, but to do so, we need to analyze and identify the differences between the most popular databases (in a way similar to what I reported for MS SQL Server).