mathesar-foundation / mathesar

Web application providing an intuitive user experience to databases.
https://mathesar.org/
GNU General Public License v3.0
2.41k stars 332 forks source link

Spec for a better way to pass data between service layer and data layer #823

Closed silentninja closed 12 months ago

silentninja commented 2 years ago

Problem

We need to pass data back and forth between the service layer(django, drf, and model methods) and the data layer(db module for now).

Right now the data is passed as function arguments, but it lacks a proper structure, is hard to pass polymorphic objects, lacks validation. This would enable us to have a valid contract(interface) with other modules and prevent leaks.

Proposed solution

Got few ideas for data structure

  1. Named tuples - Named tuples offer a simple immutable data structure and could also be inherited using @property decorator. Conversion can be done using adapter functions
  2. Dataclass - Quite flexible and has lots of features that could suit our needs like frozen dataclass, abstract methods, allows methods. This could provide serialization and validation methods.

Additional context

Proposed solutions can be mixed depending on the situation, doesn't have to be an either-or The data passed between layers should preferably be a primitive as this prevents unnecessary leaks

https://www.stavros.io/posts/fastapi-with-django/ - This article does a similar transformation in the Models section, which could be helpful

Related to:

Few ideas and libraries that could be helpful in extending DRF https://testdriven.io/blog/django-and-pydantic/ https://marshmallow-code.github.io/django-rest-marshmallow/ https://github.com/jordaneremieff/djantic

dmos62 commented 2 years ago

I'm happy you're bringing this up and I fully support it. I arrived at the same conclusion and I think we should get on this earlier rather than later.

I created a dataclass model for constraints when doing the interview task and it was a big productivity boost. Here's the definition. It also made deserialization easy.

kgodey commented 2 years ago

This issue is somewhat related:

mathemancer commented 2 years ago

We should agree on

For the first, I think the data layer should provide classes for the service layer to instantiate and use, and I think they should do their own validation so they can be used without the service eventually.

dmos62 commented 2 years ago

I agree that the data layer (db namespace) should provide these basic data structures. I think dataclasses are superior to namedtuples for the use case. I agree that their correctness/validation should be part of their definition.

What's the refactor strategy? I see these being implemented opportunistically. For example if I'm doing a PR that does anything more than the most trivial juggling of database objects, I'll write the abstractions for the objects in question (if they're not yet written). I'll refactor the code that I directly rely on for the PR, but might leave some code using the obsolete "bag of arguments" formalism. In other words, refactor when extending or maintaining.

kgodey commented 2 years ago

What's the refactor strategy?

I think that someone should assign this issue to themselves and write a spec for it, once we've all reviewed and agreed on the approach, we can either refactor opportunistically or create new issues for implementation. I've tagged this issue as restricted: maintainers and updated the issue title to reflect that the expected output is a spec.

I would like to finish backend work on the data types milestone before starting on this, though.

github-actions[bot] commented 1 year ago

This issue has not been updated in 90 days and is being marked as stale.

seancolsen commented 12 months ago

I'm closing because this is very old and seems outdated, given more recent discussions and work on this topic.