loopbackio / loopback-next

LoopBack makes it easy to build modern API applications that require complex integrations.
https://loopback.io

Do we want a new Juggler? #537

Closed kjdelisle closed 6 years ago

kjdelisle commented 7 years ago

I've been thinking about the persistence story in LoopBack over the last few months and have been wondering if the juggler is the right approach for the framework.

Warning: Opinions Ahead -- Please don your 3D glasses now.

Background

In the current iteration of loopback and loopback-datasource-juggler, we define relations between Models to provide much of the interoperability that users have come to rely on for making their REST APIs. These relations are supported by a Domain-Specific Language (DSL) that leverages that metadata to generate queries against datasources.

The Everything ORM

In many cases, our mapping of the DSL has required us to make an ORM-like set of facilities that generate query strings (in the case of SQL) or translate our DSL into one understood by the target database/database drivers (MongoDB, as an example). Support for this idea falls short of what many in the development community would expect for each of the individual use cases; SQL queries are often limited to primitive, inefficient operations and some NoSQL query objects do not accurately represent their original intention when translated accordingly. In many cases, users must fall back on basic methods that do not take advantage of the metadata our relations and DSL represent.

In each of these cases, whether or not it was our intent, we have implied to the community that we would shoulder the burden of providing a reasonably-complete and efficient set of query-generation tools to work hand-in-hand with our DSL to provide fine-grained control to various datasources.

In my opinion, the very idea of having our own DSL gives this impression; what purpose does it serve otherwise?

The Extensibility Problem (Routing around the Damage)

In the case of loopback-next, our extreme extensibility is a massive advantage, but it comes with a significant "downside": members of the community will simply avoid the Juggler if it doesn't meet their needs, and those who find it sufficient will keep using it.

"Great! Why's that a problem?"

Our Choice Informs Our Design

The decision of which approach we consider to be the best practice will have a direct effect on the design of the other core components of the framework. As an example, one of the other things that loopback 3.x gives you with its relations is automatic generation of REST APIs based on your models. "Foo hasMany Bar"? Great! We'll generate all of the /foos/{id}/bars routes for you!
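To make the "Foo hasMany Bar" example concrete, here is a minimal TypeScript sketch of deriving nested route paths from relation metadata. The `RelationDefinition` shape, the `routesForRelation` helper and the naive pluralisation are assumptions made for illustration; they are not the LoopBack 3.x implementation, and the routes shown are only a subset of what a real generator would emit.

```typescript
// Hypothetical relation metadata; this is not the LoopBack 3.x model JSON format.
type RelationType = "hasMany" | "belongsTo";

interface RelationDefinition {
  source: string; // model that declares the relation, e.g. "Foo"
  target: string; // related model, e.g. "Bar"
  type: RelationType;
}

interface Route {
  verb: "get" | "post" | "delete";
  path: string;
}

// Naive pluralisation for the sketch: lower-case the name and append "s".
function collectionName(model: string): string {
  return model.toLowerCase() + "s";
}

// Derive an illustrative subset of nested routes from one relation.
function routesForRelation(rel: RelationDefinition): Route[] {
  const base = `/${collectionName(rel.source)}/{id}`;
  if (rel.type === "hasMany") {
    const path = `${base}/${collectionName(rel.target)}`;
    return [
      { verb: "get", path },
      { verb: "post", path },
      { verb: "delete", path },
    ];
  }
  // belongsTo: a single read-only route to the related instance.
  return [{ verb: "get", path: `${base}/${rel.target.toLowerCase()}` }];
}

const fooBarRoutes = routesForRelation({ source: "Foo", target: "Bar", type: "hasMany" });
for (const route of fooBarRoutes) {
  console.log(`${route.verb.toUpperCase()} ${route.path}`);
}
```

The point of the sketch is that the same metadata that shapes the REST API could also drive query generation, which is the combined value discussed in this comment.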

If we decide that there will be no Juggler for 4.x, then that probably means no relations, too. That doesn't mean we won't have a story for generating those API definitions in some other way, but it does give you two very different approaches.

"We can keep relations without using the DSL or the juggler for persistence!", you might say. But would you want to create all of the relationships between your Models in a declarative format just to generate the shape of your REST APIs? Typing /foos/{id}/bars into a Swagger editor isn't any more difficult than making those relations. The combined value of getting that for free as well as being able to leverage that metadata to generate queries against datasources is what makes the sales pitch for the idea of having those relations to begin with.

DSL? I Prefer Cable, Thanks

Convincing community developers to invest great effort into creating these middleware layers that translate our DSL into commands for various database drivers is a tough sell when ready-made ORMs can be liberally sprinkled into your code in loopback-next.

Many of them come with their own Model and relationship engines (sequelize being the example everyone is tired of me bringing up whenever we discuss this), and since they tend to specialize in their chosen domain, they're often very efficient, well-written ORMs that do one (or a handful) of things extremely well.

Competing with this would require us to once again take on the burden of building all of these connectors ourselves, only this time, we'd have to make damn sure that we're making solid-quality SQL statements, and properly translating our MongoDB queries because developers will simply swap to using the underlying drivers if they're not already in too deep. Combine this with the fact that we just don't have the resources to build all of this in a timely fashion and the value-for-money of this proposition suddenly seems thin at best.

What Other Circus Acts Are We Good At?

The juggler is definitely something that distinguishes us from other frameworks that are value-adds on Express.js, Koa.js and so on. I find myself wondering what other killer features we could provide that existing frameworks don't. It might be that what we need to differentiate ourselves in a meaningful way doesn't even have to be a particular feature; if our framework and tooling do a better, faster job of getting people from idea to API, and give you easy flexibility to add all sorts of awesome components, wouldn't that be our killer feature?

Trade Chainsaws for Bowling Pins

It might be that someone who's much more creative/intelligent than me knows of a way we can have our cake and eat it too; if a design for the new juggler provides a low-effort API to implement that bridges the gap between driver and DSL, then I'm definitely on board with the idea.

You may now remove your 3D glasses.

I definitely want to hear everyone's thoughts on this, both the validity/invalidity of my concerns, as well as approaches we could take to solve this problem. Thanks for reading!

raymondfeng commented 7 years ago

The current version of loopback-datasource-juggler is overloaded with many responsibilities, such as:

When we discuss what should or should not be supported out of the box in LB next, these different perspectives should be separated to avoid confusion.

We have planned for refactoring the juggler into separate modules. The @loopback/repository package for loopback-next is also a starting point for the effort.

bajtos commented 7 years ago

Let me start by admitting my mixed love/hate relationship with juggler.

On the brighter side

Having the standard ORM in LoopBack allowed us to implement great tools simplifying building of LoopBack apps:

On the darker side:

There is a reason why the SQL language cannot be used for NoSQL databases: each NoSQL database has a different approach to address the CAP theorem and therefore requires a different programming model and mindset. For example, MongoDB prefers partial updates using operators like $inc in order to achieve data consistency. OTOH, CouchDB/Cloudant does not support partial updates OOTB and maintains a revision (SHA hash) of every document to guarantee consistency.
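The contrast between the two programming models can be sketched with in-memory stand-ins. This is only an illustration under stated assumptions: `AtomicStore` and `RevisionStore` are hypothetical classes, no database driver is involved, and the integer `_rev` is a simplification of real CouchDB revision strings.

```typescript
// In-memory stand-ins for two consistency models. No real driver is used, and
// the integer revision is a simplification of CouchDB's revision strings.
interface Counter {
  _id: string;
  count: number;
}

interface RevisionedCounter extends Counter {
  _rev: number;
}

// MongoDB-style: the store applies the increment itself (as $inc does),
// so two writers never overwrite each other's result.
class AtomicStore {
  private docs = new Map<string, Counter>();

  insert(doc: Counter): void {
    this.docs.set(doc._id, { ...doc });
  }

  inc(id: string, by: number): number {
    const doc = this.docs.get(id);
    if (!doc) throw new Error(`not found: ${id}`);
    doc.count += by;
    return doc.count;
  }
}

// CouchDB-style: writers replace the whole document and must present the
// revision they read; a stale revision is rejected as a conflict.
class RevisionStore {
  private docs = new Map<string, RevisionedCounter>();

  insert(doc: Counter): RevisionedCounter {
    const stored = { ...doc, _rev: 1 };
    this.docs.set(doc._id, stored);
    return { ...stored };
  }

  get(id: string): RevisionedCounter {
    const doc = this.docs.get(id);
    if (!doc) throw new Error(`not found: ${id}`);
    return { ...doc };
  }

  put(doc: RevisionedCounter): RevisionedCounter {
    const current = this.get(doc._id);
    if (current._rev !== doc._rev) throw new Error("conflict: stale revision");
    const next = { ...doc, _rev: doc._rev + 1 };
    this.docs.set(doc._id, next);
    return { ...next };
  }
}

const atomic = new AtomicStore();
atomic.insert({ _id: "page", count: 0 });
atomic.inc("page", 1);
const atomicTotal = atomic.inc("page", 1); // both increments are kept

const revisioned = new RevisionStore();
revisioned.insert({ _id: "page", count: 0 });
const readA = revisioned.get("page");
const readB = revisioned.get("page"); // same revision as readA
revisioned.put({ ...readA, count: readA.count + 1 });

let conflictDetected = false;
try {
  revisioned.put({ ...readB, count: readB.count + 1 }); // stale revision
} catch {
  conflictDetected = true; // the second writer must re-read and retry
}
console.log(atomicTotal, conflictDetected);
```

A single abstraction such as "update this field" has to pick one of these behaviours, which is why one DSL maps unevenly onto different NoSQL backends.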

My takeaway

In LB Next/4.0, we are decoupling the REST API from the ORM API. This will take away benefits like a single definition for both the REST API and the ORM, while bringing other advantages like giving developers tighter control of their public REST API.

In that light, I think we are pretty much ready to abandon juggler, if we can find a way to preserve the following features:

While we are discussing alternative ORMs for SQL backends, I'd like to bring the following projects to your attention:

jannyHou commented 7 years ago

ORM

IMO the biggest challenge in building an ORM is that we have to either build a "perfect" one or not build one at all.

By "perfect" I mean:

Compared with other existing ORMs from the community, it's not hard for us to come up with a better design and implementation in a specific area, but given the resources we have, my concern is how much time we would need to build an OVERALL better ORM... And if we become more determined about closing off features that are not reasonable for us to support, would that benefit users more than telling them from the beginning to spend some time investigating the most appropriate libs/modules they need in their product?

A bottleneck of developing with the current juggler is that some standards (e.g. ad-hoc sort) are too strict to hold across 10+ connectors. If we still expect unified behaviours, I would suggest officially maintaining connectors only for IBM databases and the most popular ones: db2, cloudant, mysql, mongodb. Actually, considering the incoming requests from paying customers, this is still a growing list :(

Sugar functions

IIRC we have a story discussing simplifying the functions provided in dao.js. I understand that sugar functions to some extent save people time, but again... think of the effort needed to maintain them. Some similar functions leave people confused about how they differ, which becomes another documentation overhead and a compatibility debate * N (the connectors we support).

Remote method hook

Actually it's now implemented in loopback core, I love the hook system and I assume loopback-next already implements it.

Scope

People may still want to have a set of APIs organized under a certain name or, say, tag, that is also easy to reuse when extending a model.

Inclusion, Getter and Setter, 2nd level Cache

Inspired by this article and the "updateOnly" PR recently merged into juggler, I think what our current resources limit are the things that lead us to build SQL/NoSQL queries, but we still need a module that serves as a middleware between the modelDef and a db's driver functions.

raymondfeng commented 7 years ago

To echo @jannyHou's comments, I propose that we first build a list of features/responsibilities for the current loopback-datasource-juggler to better understand what it does today so that we can better decide what it should do/should not do for LoopBack next. We need to keep/improve the good parts and remove/fix the bad parts.

Having a big-bang/wholesale yes/no debate is NOT going to be very productive, IMO.

kjdelisle commented 7 years ago

I don't think it was ever about all of the juggler; many of the parts of juggler v3 have already been spoken for as separate modules within loopback-next, like authentication. I'm mostly using the term juggler for the persistence and relations since they don't have their own names.

I do agree that we should make that list anyway, though figuring out exactly what is affected by the greater whole is difficult to talk about, and easier to demonstrate, which is why we're working on a "real" app to start testing out these use cases: https://github.com/strongloop/chit-chat

ExTheSea commented 6 years ago

Sorry if this is out of scope for this issue but I just want to ask whether auto-discovery of models and relations is still part of the planned feature set of the new Juggler (which it seems to be heading towards)?

Discovering models and relations based on existing tables is a major part of our current workflow with Loopback as we have hundreds of "old" tables that need CRUD APIs. Hand writing each model and method would make this framework almost unusable here. I saw someone else asking this in a referenced issue but he didn't get an answer (https://github.com/strongloop/loopback-next/issues/419#issuecomment-314490175)

kjdelisle commented 6 years ago

@ExTheSea This is one of those design decisions that would be influenced by the way we choose to implement and support the persistence layer.

If we decide to continue providing our own ORM, it would mean that we would also be responsible for the discovery and migration stories that are a part of loopback@3.

My current proposal is to use mixins for popular ORMs, as well as templates to help auto-generate code for users based on their chosen protocol (REST, gRPC, MQTT, etc) and chosen mixin. We're currently hashing that out as a team and any feedback for either approach would be welcome.

If you have any questions about what my proposal would entail, just ask. :)

kjdelisle commented 6 years ago

So, as a team, we came to a decision yesterday regarding our approach here and this is what we've come up with:

Roadmap for Juggler

We will be keeping the Juggler as a part of LoopBack, but we will be constraining its scope for the next major release.

Planned Changes

Other ORMs

We will provide some tutorial materials on how to create your own mixins to make use of your own ORMs, though we will not provide templating support or other materials to ease in the use of those ORMs.
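As a rough idea of what a mixin that plugs in your own ORM could look like, here is a sketch using the general TypeScript mixin pattern. Everything named here (`OrmAdapter`, `InMemoryOrm`, `OrmMixin`, and the stand-in `Application` class) is hypothetical and chosen for illustration; it is not the API that LoopBack ships.

```typescript
// Constructor type used by the standard TypeScript mixin pattern.
type Constructor<T = {}> = new (...args: any[]) => T;

// Hypothetical row and adapter shapes; a real ORM would sit behind OrmAdapter.
type Row = { id: number; [field: string]: unknown };

interface OrmAdapter {
  find(model: string, id: number): Row | undefined;
  save(model: string, row: Row): void;
}

// In-memory adapter standing in for a third-party ORM.
class InMemoryOrm implements OrmAdapter {
  private tables = new Map<string, Map<number, Row>>();

  find(model: string, id: number): Row | undefined {
    return this.tables.get(model)?.get(id);
  }

  save(model: string, row: Row): void {
    let table = this.tables.get(model);
    if (!table) {
      table = new Map<number, Row>();
      this.tables.set(model, table);
    }
    table.set(row.id, row);
  }
}

// Mixin: returns a subclass of Base that carries the adapter and a helper.
function OrmMixin<TBase extends Constructor>(Base: TBase, adapter: OrmAdapter) {
  return class extends Base {
    orm: OrmAdapter = adapter;

    findById(model: string, id: number): Row | undefined {
      return this.orm.find(model, id);
    }
  };
}

// Hypothetical application class; not the @loopback/core Application.
class Application {
  name = "demo";
}

class OrmApplication extends OrmMixin(Application, new InMemoryOrm()) {}

const app = new OrmApplication();
app.orm.save("Foo", { id: 1, title: "hello" });
console.log(app.name, app.findById("Foo", 1));
```

Keeping the adapter behind a small interface means the application class does not depend on any particular ORM, which is the main appeal of the mixin approach.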

kjdelisle commented 6 years ago

Additional Questions

cc @strongloop/loopback-devs

kjdelisle commented 6 years ago

Another question: Will we constrain the number of relationships to something simpler than before?

virkt25 commented 6 years ago

jannyHou commented 6 years ago

bajtos commented 6 years ago

Thank you @kjdelisle for writing down the proposal, and @virkt25 and @jannyHou for your comments. I'd like to add a few more thoughts to consider.

First of all, I think we should make juggler a first-class package that can be used outside of LoopBack too. We have interesting features that are not available in other ORMs - see e.g. https://github.com/strongloop/loopback-next/issues/776#issuecomment-349976735 and the feature comparison between TypeORM and Juggler that @raymondfeng wrote but which I am not able to find now :( (@raymondfeng - could you please post a link to your table here?) We should be promoting our ORM more too, so that when people learn that LoopBack uses Juggler as the default ORM, they won't think "why are they using this ORM I never heard of instead of ", but will instead understand that Juggler is a well-known, fully-featured ORM and that we have to pick one anyway.

Convert juggler to TypeScript

I want us to work on the "new" juggler incrementally. I really want to avoid the situation we have here in loopback-next, where we spent 12 months building a new version from scratch and there is still nothing that our users could use in production.

Instead, I am proposing the following approach:

Will the new Juggler live in the monorepo? +1 for having the new juggler live in the monorepo.

I personally see a lot of value in having a monorepo that contains Juggler and all connectors we are maintaining. In my past experience, it was cumbersome to add new features to Juggler, because a PR to juggler would have to be accompanied by 10+ pull requests to our connectors to implement support for that new feature. Sharing the test suite between juggler and the connectors had its problems too: how often could we not land a pull request in one repository because the tests were failing until another pull request was landed somewhere else?

Having the ability to test all connectors together with any change made in juggler will simplify our life too, as we won't have to rely on cis-jenkins dependency-based-triggers anymore. (cis-jenkins has two issues: a) it can be slow to start downstream jobs b) test results are not visible to community (non-IBM) contributors ).

The downside is that running all connector tests will add significant time overhead to npm test and CI runs. However, I think this problem is solvable by CI tooling. For example, we could write a tool that checks the git patch of the changes we are testing, decides which packages are affected (either directly or by changes in their dependencies) and then runs the tests only for those affected packages.
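A minimal sketch of such a tool's core logic might look like the following. The `PackageInfo` shape, the `packages/<name>/` layout and the package names are assumptions for illustration; the list of changed files is assumed to come from something like `git diff --name-only`.

```typescript
// Hypothetical monorepo description: each package has a directory and a list
// of sibling packages it depends on.
interface PackageInfo {
  name: string;
  dir: string; // e.g. "packages/juggler"
  dependencies: string[]; // names of other packages in the same monorepo
}

// Returns the names of packages whose tests should run for the given patch.
function affectedPackages(packages: PackageInfo[], changedFiles: string[]): string[] {
  const affected = new Set<string>();

  // Step 1: packages touched directly by the patch.
  for (const pkg of packages) {
    if (changedFiles.some(file => file.startsWith(pkg.dir + "/"))) {
      affected.add(pkg.name);
    }
  }

  // Step 2: propagate to dependents until no new package is added.
  let grew = true;
  while (grew) {
    grew = false;
    for (const pkg of packages) {
      if (affected.has(pkg.name)) continue;
      if (pkg.dependencies.some(dep => affected.has(dep))) {
        affected.add(pkg.name);
        grew = true;
      }
    }
  }

  return Array.from(affected).sort();
}

const monorepo: PackageInfo[] = [
  { name: "loopback-filters", dir: "packages/loopback-filters", dependencies: [] },
  { name: "juggler", dir: "packages/juggler", dependencies: ["loopback-filters"] },
  { name: "connector-mysql", dir: "packages/connector-mysql", dependencies: ["juggler"] },
  { name: "connector-mongodb", dir: "packages/connector-mongodb", dependencies: ["juggler"] },
];

// A change in one connector only affects that connector.
console.log(affectedPackages(monorepo, ["packages/connector-mysql/lib/mysql.js"]));
// A change in a shared dependency affects everything built on top of it.
console.log(affectedPackages(monorepo, ["packages/loopback-filters/index.js"]));
```

The fixed-point loop is quadratic in the number of packages, which is fine for a monorepo of a few dozen packages; a reverse dependency index would be the obvious refinement for larger ones.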

What I think is a more important question is whether Juggler and connectors should live in loopback's main monorepo, or whether they should have their own monorepo? If we want to promote Juggler as standalone ORM, then it may make more sense to let it have its own monorepo, own issue tracker, etc. (Another benefit of a different monorepo is that we can defer implementation of the CI tooling I mentioned above for a while, because npm test in loopback4 monorepo will stay fast).

Last but not least, I think we should find a new name for our ORM, perhaps one that's not so coupled with LoopBack. How about "Juggler ORM"? (I am already imagining a cheerful logo of a circus artist juggling with balls 🤹‍♀️🤹‍♂️, where each ball can be a logo of a different SQL/NoSQL database.) A few more alternatives that come to my mind: "Strong ORM" to keep StrongLoop's theme of prefixing modules with "Strong", "LoopBack ORM" to keep the association with LoopBack, or perhaps @loopback/juggler.

I would like to see the number of relationships simplified to start with; more can be added depending on use cases and needs. This should help achieve consistency, simplicity and maintainability. For relations, one way to simplify them is probably to separate the constraint from the relation: we would only have 1:m relations (hasMany, embedsMany, referenceMany) and apply another constraint layer to realize 1:1. Just a thought, need more time to think of it.

+1 for simplifying things. I think there will be many more opportunities to simplify things. For example, embedded relations have always had a lot of shortcomings, they may be a good candidate for removal too.
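The idea quoted above, keeping only 1:m relations and layering a constraint on top to realize 1:1, could be sketched as follows. The `Relation` shape, the `maxTargets` field and the `canLink` helper are hypothetical names used only for this illustration; they do not exist in juggler.

```typescript
// Hypothetical metadata: every relation is declared as 1:m, and an optional
// constraint narrows it. These names are illustrative, not juggler's.
interface Relation {
  name: string;
  foreignKey: string; // field on the target row that points at the source
  maxTargets?: number; // constraint layer: 1 realizes a 1:1 relation
}

type TargetRow = Record<string, unknown>;

// Returns true when one more target row may be linked to the given source id.
function canLink(rel: Relation, existing: TargetRow[], sourceId: unknown): boolean {
  if (rel.maxTargets === undefined) return true; // plain 1:m, no limit
  const linked = existing.filter(row => row[rel.foreignKey] === sourceId).length;
  return linked < rel.maxTargets;
}

const orders: Relation = { name: "orders", foreignKey: "customerId" };
const account: Relation = { name: "account", foreignKey: "customerId", maxTargets: 1 };

const existingRows: TargetRow[] = [{ id: 10, customerId: 1 }];

console.log(canLink(orders, existingRows, 1)); // unconstrained 1:m
console.log(canLink(account, existingRows, 1)); // 1:1, customer 1 is already linked
console.log(canLink(account, existingRows, 2)); // 1:1, customer 2 is still free
```

With this split, the relation engine only has to understand one cardinality, and the constraint check is a separate, small piece of logic.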

bajtos commented 6 years ago

We discussed the next steps for this issue with @kjdelisle and came up with the following plan:

  1. Monorepo: https://github.com/strongloop/loopback-next/issues/890
    • connectors + juggler + dependencies like loopback-filters
    • a different repo than loopback-next to keep lerna bootstrap fast enough
    • preserve original repos - we will keep LB-3.x codebase there

  2. Migrate juggler to typescript: https://github.com/strongloop/loopback-next/issues/891

  3. Migrate individual connectors to typescript too: https://github.com/strongloop/loopback-next/issues/892

  4. Drop callback APIs, use Promises only: https://github.com/strongloop/loopback-next/issues/896

  5. Spike: Remove data-access APIs we don't want to support anymore, both from juggler and connectors, e.g. updateOrCreate, findOrCreate, etc. https://github.com/strongloop/loopback-next/issues/897

  6. Spike: what to do with EventEmitters (Observables?) https://github.com/strongloop/loopback-next/issues/898

  7. Semver-major release of everything (alpha pre-release or preferably a 0.1.0 release if we change names from loopback-* to @loopback/*)
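For step 4 in the plan above, the change in API shape can be sketched like this. The `findByIdCb` and `findById` functions are hypothetical stand-ins, not juggler's actual signatures; the sketch only shows a callback-style method being wrapped so that callers see a Promise-only API.

```typescript
type Callback<T> = (err: Error | null, result?: T) => void;

interface Item {
  id: number;
}

// Hypothetical callback-style data-access method.
function findByIdCb(id: number, cb: Callback<Item>): void {
  if (id < 0) {
    cb(new Error("invalid id"));
    return;
  }
  cb(null, { id });
}

// Promise-only replacement: same behaviour, no callback parameter.
function findById(id: number): Promise<Item> {
  return new Promise<Item>((resolve, reject) => {
    findByIdCb(id, (err, result) => {
      if (err) reject(err);
      else resolve(result as Item);
    });
  });
}

async function main(): Promise<void> {
  const found = await findById(1);
  console.log("found", found.id);
  try {
    await findById(-1);
  } catch (err) {
    console.log("failed:", (err as Error).message);
  }
}

main();
```

Callers then use `await` and ordinary `try`/`catch` instead of checking an `err` argument in every callback.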

We will create follow-up issues later. There should be a special Epic to group these issues together.

bajtos commented 6 years ago

We (@kjdelisle and I) have created follow-up issues for every step except step 7; we will handle publishing as part of our regular work.