Closed: leovalais closed this issue 1 month ago.
The `Model` trait. When using macros, the `i64` will be used. Infra objects are identified by `(infra_id, obj_id)`: an `.obj_id` and an `infra_id`. The `InfraModel` derive could be reused to generate this code:

```rust
let track_schema: TrackSection = ...;
let infra_id = 42;
TrackSectionAsChangeSet::from_schema(track_schema, "my_track_id", infra_id).create(conn).await?;
TrackSectionAsChangeSet::from_schema(track_schema, "my_track_id", infra_id).update(conn).await?;
TrackSectionModel::delete(("my_track_id", infra_id)).await?;
TrackSectionModel::retrieve(("my_track_id", infra_id)).await?;
```
- Implement `Deref` for model objects to easily retrieve the schema.
- Move `persist_batch` out of `schema` objects so that these structs do not embed any DB-related behaviors.
- Migrate `LegacyModel` to `NewModel` one by one.
In order not to forget: we could also add an attribute macro `#[model(jsonb = true)]` to wrap/unwrap `diesel_json::Json<>` automatically. That would make utopia's `#[schema(value_type = T)]` for each `DieselJson<T>` disappear.
```rust
#[derive(Model)]
struct Foo {
    #[model(jsonb = true)]
    bar: Baz,
}

// roughly equivalent to
struct Foo {
    bar: diesel_json::Json<Baz>,
}
```
@laggron42 wdyt?
Definitely would be useful indeed, if it's an option!
Add a setting `#[model(additional_derives(row(...), changeset(...)))]`.

Scratch that. That's just additional configuration for rows and changesets; this is probably better:

```rust
#[model(row(type_name = "LOL", derive(...)))]
#[model(changeset(type_name = "LOLZ", derive(...)))]
```
TODO! Also auto-derive an `Exists<Pk>: Retrieve<Pk>` with a blanket impl: `Model::exists(pk).await?` <=> `Model::retrieve(pk).await?.is_some()`.
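As a sketch of what that blanket impl could look like, here is a simplified synchronous version (the traits and the toy in-memory model are hypothetical, standing in for the real async DB-backed ones):

```rust
// Hypothetical, synchronous versions of the traits, for illustration only.
trait Retrieve<Pk>: Sized {
    fn retrieve(pk: Pk) -> Option<Self>;
}

trait Exists<Pk> {
    fn exists(pk: Pk) -> bool;
}

// The blanket impl: Model::exists(pk) <=> Model::retrieve(pk).is_some()
impl<Pk, T: Retrieve<Pk>> Exists<Pk> for T {
    fn exists(pk: Pk) -> bool {
        Self::retrieve(pk).is_some()
    }
}

// Toy model whose "table" contains the ids 0..100.
#[allow(dead_code)]
struct Doc {
    id: i64,
}

impl Retrieve<i64> for Doc {
    fn retrieve(pk: i64) -> Option<Self> {
        (0..100).contains(&pk).then(|| Doc { id: pk })
    }
}
```

Every model implementing `Retrieve<Pk>` then gets `exists` for free, with no extra derive needed.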
Currently, Models are (almost) a 1:1 mapping to their respective table schema. Therefore foreign references are `i64`s. This is problematic for several reasons:

There are three common kinds of associations (cf. Django's documentation for comprehensive examples):

- rolling stock <-> livery, infra object <-> search table, ...
- scenario *-> study, study *-> project, livery *-> document, ...

One side of this association is provided by diesel through the `#[diesel(belongs_to(Parent))]` annotation. But that forces us to use diesel's DSL, to manually load a second request; it's impossible to use `belonging_to` with something different than the id, etc. Since the models only model the ends of the relations, we only have to focus on implementing what "One" and "Many" mean, on both sides of the association.

Additionally, depending on the context we might want to resolve these references either eagerly or lazily.
Study
```rust
#[derive(Clone, Debug, Serialize, Deserialize, Derivative, Model, ToSchema)]
#[derivative(Default)]
#[model(table = "study")]
pub struct Study {
    pub id: i64,
    #[derivative(Default(value = "Some(String::new())"))]
    pub name: String,
    #[derivative(Default(value = "Some(String::new())"))]
    pub description: String,
    #[derivative(Default(value = "Some(String::new())"))]
    pub business_code: String,
    #[derivative(Default(value = "Some(String::new())"))]
    pub service_code: String,
    pub creation_date: NaiveDateTime,
    #[derivative(Default(value = "Utc::now().naive_utc()"))]
    pub last_modification: NaiveDateTime,
    pub start_date: Option<NaiveDate>,
    pub expected_end_date: Option<NaiveDate>,
    pub actual_end_date: Option<NaiveDate>,
    #[derivative(Default(value = "Some(0)"))]
    pub budget: i32,
    #[derivative(Default(value = "Some(Vec::new())"))]
    pub tags: Vec<String>,
    #[derivative(Default(value = "Some(String::new())"))]
    pub state: String,
    #[derivative(Default(value = "Some(String::new())"))]
    pub study_type: String,
    // Allows race conditions; true by default. Otherwise LazyRef locks the row of self in the transaction, if it still exists.
    #[model(lazy_ref, lock_ref = false)] // , key = project_id)]
    pub project: LazyRef<Project>,
    // Since it's not declared lazy, they will ALL be loaded within the same transaction as the retrieved study.
    #[model(foreign_ref_many, inverse = Scenario::study_id)] // or inverse = (Obj::infra_id, Obj::obj_id)
    pub scenarios: Vec<Scenario>,
    pub project_id: i64, // we can still access the project id directly
}
```
```rust
let s = Study::retrieve(42, conn).await?;
let p: &Project = s.project.get(conn).await?; // the project is loaded once, at usage
assert_eq!(p.id, s.project_id);
println!("{}", s.scenarios.first().unwrap().name); // since scenarios are loaded eagerly, they're directly available
```
Lazy refs could be wrapped in a `LazyRef<>` type, which would be an enhanced `OnceLock<>` that performs additional operations before querying, such as locking rows. For the eager reference, I couldn't find a reason why we would need a wrapper.

For relations with "many" objects, I'd like to leave it to `.collect()` to infer the type of the collection, and to allow maps using an attribute `include_key`. (We can make that an issue on its own afterwards.)
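A minimal sketch of such a wrapper, built on `std::sync::OnceLock`. The `load` closure is a stand-in for the real connection and query, and this hypothetical `LazyRef` only demonstrates the load-at-most-once behavior, not the row locking:

```rust
use std::sync::OnceLock;

// Hypothetical LazyRef: an enhanced OnceLock that remembers the foreign key
// and loads the referenced row at most once.
struct LazyRef<T> {
    key: i64,
    cell: OnceLock<T>,
}

impl<T> LazyRef<T> {
    fn new(key: i64) -> Self {
        LazyRef { key, cell: OnceLock::new() }
    }

    // The first call runs `load`; later calls return the cached value untouched.
    // The real version would take a connection and optionally lock the row first.
    fn get(&self, load: impl FnOnce(i64) -> T) -> &T {
        self.cell.get_or_init(|| load(self.key))
    }
}
```

Since `get` takes `&self`, the reference stays usable through a shared borrow of the model, which matches the `s.project.get(conn)` call shape above.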
- Create the row (`RETURNING *`).
- No changes; cascading deletions should be handled by the database, as the Model system is not designed to maintain the bijection.
- `LazyRef<>` (unless `lock_ref = false`): check that the parent model still exists, locking it.
- Many references are roughly a `batch_retrieve` invocation.
Inverse associations are trickier to parse, but since the `inverse` annotation is supposed to be an `Identifier` of the referred Model, we should still be able to call `{,batch_}retrieve` as usual.
Obviously, inverse associations should be excluded from changesets and patches. But for direct associations, we have two options:

1. `Study::Changeset` has a `fn project(project_id: i64)` method (and a `project_id: i64` field). It's definitely realizable, but it involves a little bit of logic, especially with compound identifiers.
2. Keep `project_id` in `Study`. It's ugly, but easier.

I prefer the first option.
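A sketch of what option 1 could generate, with a hypothetical hand-written changeset standing in for the macro output:

```rust
// Hypothetical equivalent of the generated changeset: the setter is named
// after the association (`project`) but stores the raw foreign key.
#[derive(Default)]
struct StudyChangeset {
    name: Option<String>,
    project_id: Option<i64>,
}

impl StudyChangeset {
    fn name(mut self, name: &str) -> Self {
        self.name = Some(name.to_owned());
        self
    }

    // Generated from the `project: LazyRef<Project>` field: the caller passes
    // the referred model's identifier, never the model itself.
    fn project(mut self, project_id: i64) -> Self {
        self.project_id = Some(project_id);
        self
    }
}
```

For a compound identifier, the setter would take the whole key tuple (e.g. `(infra_id, obj_id)`) and fan it out to the underlying columns, which is where the "little bit of logic" lives.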
Recursive by default? Opt-out option?
```rust
#[model(foreign_ref, key, inverse)]
#[model(lazy_ref, lock_ref = true, key, inverse)]
#[model(foreign_ref_many, include_key = false, inverse)]
#[model(lazy_ref_many, include_key = false, lock_ref = true, inverse)]

// All annotations accept a key and/or inverse attribute:
// key = <identifier>
// inverse = <qualified identifier>
// where
// <identifier> = <field> | (<field>,*)
// <qualified identifier> = <Model>::<field> | (<Model>::<field>,*)
```
Add a configuration option for both row and changeset to include additional diesel derive configuration.
- `fn flat_field(self, Option<T>)`
- an `Exists<K>` trait (maybe via a `Count<K>` trait)
- `Changeset<T> = <T as Model>::Changeset` and `Row<T> = <T as Model>::Row` aliases, to avoid the "ambiguous type" error (the Rust compiler being too strict)

API proposal to integrate `ORDER BY` queries into ModelV2.
```rust
#[derive(Debug, Default, Clone, ModelV2)]
#[model(table = crate::tables::document)]
pub struct Document {
    pub id: i64,
    pub content_type: String,
    pub data: Vec<u8>,
}

let docs /*: Vec<Document> */ = Document::list(
    &mut conn,
    ListSettings::new() // <query> WHERE content_type = 'plain/text' ORDER BY content_type DESC, id ASC LIMIT 25 OFFSET 50
        .filter(Fields::<Document>::ContentType.eq("plain/text")) // Box<dyn BoxableExpression<table, Pg, Bool>>
        .order_by(Fields::<Document>::ContentType.desc()) // Box<dyn BoxableExpression<table, Pg, SqlType = NotSelectable>>
        .order_by(Fields::<Document>::Id.asc())
        .limit(25)
        .offset(50),
).await?;

let ops: HashMap<(i64, String), OperationalPointModel> = OPM::list_with_key(...).await?;
```
That would almost replace the manual `modelv1::List` implementations. Breaking changes would include the name sorting criterion for projects, which translates to `lower(p.name)` in SQL. So by using the new API above we would slightly change the ordering. I'd say it doesn't really matter, as we can always (as another issue, iff deemed necessary):

Best to do that before splitting models into their own crate; that way we don't need to keep `PaginatedResponse<>` in it (as it should belong in the views crate).
Prevent applying changesets that lack values for non-default columns in `Create`.
```rust
#[derive(Debug, Default, Clone, ModelV2)]
#[model(table = crate::tables::document)]
pub struct Document {
    pub id: i64,
    #[model(required)]
    pub content_type: String,
    pub data: Vec<u8>,
}

Document::changeset().data(vec![]).create(...).await.is_err() // true
```
Not necessarily useful, since diesel will complain anyway in the `create()`. Up for discussion.
We need to be able to forward derive annotations to the changeset (and maybe the row), at both struct and field level. We hit this issue while trying to implement `Validator` for `Changeset<RollingStockModel>` with a custom validator function.
[!NOTE] Whether we choose to keep Validator or not is another problem. We might also encounter this problem with Derivative for instance.
One can take inspiration from how strum_macros forwards annotations to the enum generated by EnumDiscriminants (https://docs.rs/strum_macros/latest/strum_macros/derive.EnumDiscriminants.html).
```rust
let mut doc = Document::retrieve(conn, 42).await?;
// Locking for update (other queries can only read the pk): FOR NO KEY UPDATE
doc.exclusive_lock(conn, |conn, doc: &mut Document| async { // lock_for_update: &mut Self, &mut Self -> Result<()>
    doc.data = vec![1, 2, 3];
    doc.save().await?;
}).await?;

// ---

let infra = Infra::retrieve(conn, 42).await?;
// Shared lock to prevent modification (other queries can read the whole row, but cannot update it): FOR SHARE
infra.lock(conn, |conn, infra: &Infra| async { // lock: &mut Self, &Self -> Result<()>
    infra.import_railjson(...).await?;
}).await?;
```
Each lock opens a new transaction (required for the locking system to work), locks the row, and runs the closure. We don't need `FOR UPDATE` because model PKs are not supposed to change. I don't think we need to consider `FOR KEY SHARE`, but I may be wrong.

More info on locking here: https://www.postgresql.org/docs/current/explicit-locking.html#LOCKING-ROWS
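To make the semantics concrete, here is an illustrative sketch of the statement sequence such a lock helper might issue. The table and pk column names are assumptions, and a real implementation would bind parameters instead of formatting them into the string:

```rust
// Illustrative only: the statements an exclusive_lock helper might run.
fn exclusive_lock_sql(table: &str, pk: i64) -> Vec<String> {
    vec![
        // A row lock only lives as long as its transaction, hence the new one.
        "BEGIN".to_string(),
        // FOR NO KEY UPDATE blocks concurrent UPDATE/DELETE on the row but,
        // unlike FOR UPDATE, still lets other transactions take the weaker
        // FOR KEY SHARE lock used by foreign-key checks.
        format!("SELECT 1 FROM {table} WHERE id = {pk} FOR NO KEY UPDATE"),
        // ... the closure body (e.g. an UPDATE on the row) runs here ...
        "COMMIT".to_string(),
    ]
}
```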
`List` V2 is required to split models into their own crate. No need to wait for locking to migrate the Infra model (there are currently locking queries for infras, but they're not run in a transaction, so I'm not sure they work at all).
Enhance the `Model` macro

Current state

Currently, when applied to a struct, the `Model` macro can generate a bunch of implementations to factorize common DB operations:

- trait `Retrieve`
- trait `Delete`
- trait `Create`
- trait `Update`
There are two approaches on how to use it:

Application on one struct (most common in editoast)

Pros:

- it's how `Model` was originally meant to be used

Cons:

- `Option<>` fields: autogenerated columns (`id`, `creation_date`, etc.) cannot be used properly, forcing either an `Option<Option<>>` or blurring the distinction (when possible) with `deserialize_as`
- `unwrap()`s everywhere... whether it ends up as an `INSERT` or an `UPDATE`, it's `unwrap()` everywhere (bis), and let's not talk about the pain it is to review such code, where you have to wonder whether each `unwrap()` is normal and never fails, or not.

Application on two structs (for pathfinding and maybe train schedules)
https://github.com/DGEXSolutions/osrd/blob/c8a1770ca54769c13e51c60931707d364e1ec781/editoast/src/models/pathfinding.rs#L21-L100
Pros:

- `Option<>` fields only where they truly represent an optional value
- no `deserialize_as` and `unwrap()`

Cons:

- `PathfindingChangeset::create` and `PathfindingChangeset::update` return a `PathfindingChangeset` instead of a `Pathfinding`
Proposal

I'd like to write something like that instead (:santa:):

While the actual API here is just a proposal, there are, I believe, a few essential points that an enhanced `Model` macro should at least cover:

- the `Option<>` and cluttering issue
- `Model`ling infra objects

In the case of infra objects, only a few models are implemented right now (atm only catenaries, OPs and tracks) but more will come :mage: and the implementation can be made more satisfactory ⚙️.
I.e.: for track sections, atm:

https://github.com/DGEXSolutions/osrd/blob/c8a1770ca54769c13e51c60931707d364e1ec781/editoast/src/models/infra_objects/track_section.rs#L9-L20

There are several problems:

- the boilerplate `(id, infra_id, obj_id, data)` is duplicated for each model
- infra objects are identified by `(infra_id, obj_id)` rather than by row ID, but the traits `Retrieve`, etc. do not support that identification
- `retrieve`, `create`, etc. all work using an `infra_model::TrackSectionModel` (the boilerplate) and not using a `schema::TrackSection` (which is the struct we'd like to use)

Proposal
- Remove `models::infra_objects` altogether and focus on what's defined in `schema`.
- Enhance the `InfraModel` derive macro so that it also:
  - generates the boilerplate struct `(id, infra_id, obj_id, data)` that derives `Model` and references the schema
  - implements `Retrieve`, `Create` and `Delete` (not `Update`, see below) directly for the schema, but using `(infra_id, obj_id)` instead of the row id directly
  - accepts `(infra_id, obj_id)` and forwards the call to the actual `Model`
As far as I can tell, infra objects are only updated in `schema::operation` (osrd editor requests), which updates the DB directly. Hence, for now I suppose there is no need to design a way to update infra models in a Rust fashion. But should we need to, the separation `InfraModel` (schema) <-> `Model` (DB row) could allow us to design an elaborate schema changeset builder where chained method calls would map into a set of (JSON path, update value), who knows?

Likewise, I've pinned down two ways infra objects are created: using the `persist_batch` function (current behaviour of `InfraModel`) during infra import, and in unit tests. `persist_batch` directly queries the DB, and I think it should stay that way (I can elaborate if that's necessary :smile:). In unit tests, it's convenient to have a `create()` method, but there is no need to handle the case of autogenerated column values for infra objects (we only need to provide the whole schema). So here as well we don't need to bother with changesets and elaborate builders.

More advanced operations
Copy
A copy function that generates a changeset with pre-filled values may prove useful. Any thoughts?
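A sketch of what such a copy function could look like, reusing the `Document` model shape from earlier and a hypothetical hand-written changeset:

```rust
// Hypothetical Document model and changeset, mirroring the earlier examples.
struct Document {
    id: i64,
    content_type: String,
    data: Vec<u8>,
}

#[derive(Default)]
struct DocumentChangeset {
    content_type: Option<String>,
    data: Option<Vec<u8>>,
}

impl Document {
    // `copy` pre-fills a changeset with every column except the primary key,
    // so creating it afterwards would insert a fresh row.
    fn copy(&self) -> DocumentChangeset {
        let _ = self.id; // the pk is deliberately NOT copied
        DocumentChangeset {
            content_type: Some(self.content_type.clone()),
            data: Some(self.data.clone()),
        }
    }
}
```

The returned changeset can still be tweaked with the usual setters before being created, which is the point of pre-filled values.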
Batch operations
Before starting to make a list of all traits + fns prototypes that the macros described here should generate, I'd like to hear your thoughts on both the general ideas and the API proposal above.
@flomonster