widgetti / solara

A Pure Python, React-style Framework for Scaling Your Jupyter and Web Apps
https://solara.dev
MIT License
1.9k stars 140 forks source link

Reactive with SqlAlchemy models causes strange behaviours #684

Open mangecoeur opened 4 months ago

mangecoeur commented 4 months ago

SqlAlchemy provides powerful object models for objects mapped to database, and with version 2.0 supports models mapped as dataclasses.

Using sqlalchemy dataclass models within reactive shows some odd behaviours.

new_els = component_container.value.copy()
new_els.remove(component)
db.delete(component)
component_container.set(new_els)
import solara
from solara.toestand import Ref
from sqlalchemy.orm import MappedAsDataclass, DeclarativeBase, Mapped, relationship, mapped_column
from sqlalchemy import ForeignKey

class Base(MappedAsDataclass, DeclarativeBase):
  __abstract__ = True

class Parent(Base):
    __tablename__ = "parent"
    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str]

    children: Mapped[list["Child"]] = relationship(default_factory=list, back_populates="parent")

class Child(Base):
    __tablename__ = "child"
    id: Mapped[int]= mapped_column(primary_key=True)
    name: Mapped[str]

    parent_id: Mapped[int] = mapped_column(ForeignKey(Parent.id), init=False)
    parent: Mapped[Parent] = relationship(back_populates="children")

pr = solara.Reactive(None)
pr.value = Parent(id = 2, name="parent2")
c = Child(id = 2, name="child2", parent=pr.value)
print(len(pr.value.children))
print(pr.value.children)

# This should only modify the name attribute of child
Ref(pr.fields.children[0].name).set("child2-new")

# But instead causes a duplicate of the child to exist in the parent container
print(len(pr.value.children))
print(pr.value.children)

Keeping only one side of the back_populates reference and constructing children using parent_id instead of parent object works as a workaround. The suspicion is that the reactive update calls dataclass.replace(), which causes the child object to be re-constructed with the parent as an argument which triggers sqlalchemy to add the child object again as if it was a new object.

import dataclasses
import solara
from solara.toestand import Ref
from sqlalchemy.orm import MappedAsDataclass, DeclarativeBase, Mapped, relationship, mapped_column
from sqlalchemy import ForeignKey

class Base(MappedAsDataclass, DeclarativeBase):
  __abstract__ = True

class Parent(Base):
    __tablename__ = "parent"
    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str]

    # children: Mapped[list["Child"]] = relationship(default_factory=list, back_populates="parent")
    children: Mapped[list["Child"]] = relationship(default_factory=list)

class Child(Base):
    __tablename__ = "child"
    id: Mapped[int]= mapped_column(primary_key=True)
    name: Mapped[str]

    parent_id: Mapped[int] = mapped_column(ForeignKey(Parent.id))
    # parent: Mapped[Parent] = relationship(back_populates="children", init=False)

pr = solara.Reactive(None)
pr.value = Parent(id = 2, name="parent2")

# Insted init the child with the parent id, instead of parent object
c = Child(id = 2, name="child2", parent_id=pr.value.id)

# the child is not yet added to the parent
print(len(pr.value.children))
print(pr.value.children)

# add it by setting the collection
Ref(pr.fields.children).set([c])
print(len(pr.value.children))
print(pr.value.children)

# This should only modify the name attribute of child
Ref(pr.fields.children[0].name).set("child2-new")
c = dataclasses.replace(c, name="child2-new2")

# no duplicate of child is created in this case.
print(len(pr.value.children))
print(pr.value.children)
mangecoeur commented 3 weeks ago

Coming back to this, since we've been having some issues making sure both sqlalchemy updates and solara updates happen. Here are just some general thoughts for future reference.

The problem is that sqlalchemy thrives of mutation, with fancy change handling so it can push minimal updates to the database. But solara does change tracking through object equality checking which means often you need a new copy of an object to force an update. It also merges using the merge_state function that makes copies.

I'm wondering if it would work to set a custom merge_state function where we would supply the sqlalchemy session and use it's own mergefunction. Currently this isn't easily possible because merge_state is set as a default value for the merge callback in ValueBase, but is not passed as a keyword argument from the inheriting classes (KernelStore) or the enclosing class (Reactive._storage).

There is also use_sync_external_store which looks like it could do the trick using sqlalchemy events.