Closed skyzh closed 17 hours ago
The next steps:
We should split the SQL migration code into a new crate, and potentially use an ORM, though I think an ORM is too heavy in our case -- I don't think we want to worry about schema changes in optd-core for now.
On the memo table side, we can continue adding new functionalities: logical props, winner, etc.
And, with the current memo table, we can already implement 4 out of the 5 Cascades tasks (the exception is optimize input, which needs to know and update the winner).
Plus, I think we can start documenting whatever we have now and cut down the features we need from these crates, i.e., we should only use sqlx any or the impl Connection interface in the core, and let the CLI decide which database backend to use.
There are also a few optimization opportunities in the memo table -- creating indexes, better group merging (can we really do lazy merging in SQL?).
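To make the "core only sees an interface, the CLI picks the backend" idea concrete, here is a minimal sketch in plain Rust. Everything in it is hypothetical: `Storage`, `InMemory`, `Core`, and `core_for_backend` are illustration-only names, and the trait merely stands in for something like sqlx's connection abstraction rather than reproducing its API.

```rust
use std::collections::HashMap;

/// Hypothetical storage interface the core would depend on.
/// It stands in for a connection abstraction; it is not an sqlx type.
pub trait Storage {
    /// Returns the id for `key`, inserting it first if it is new.
    fn get_or_insert(&mut self, key: &str) -> u64;
}

/// One possible backend: a plain in-memory map.
#[derive(Default)]
pub struct InMemory {
    ids: HashMap<String, u64>,
}

impl Storage for InMemory {
    fn get_or_insert(&mut self, key: &str) -> u64 {
        // New keys get the next sequential id; existing keys keep theirs.
        let next = self.ids.len() as u64;
        *self.ids.entry(key.to_string()).or_insert(next)
    }
}

/// The core only knows about the trait, never a concrete backend.
pub struct Core {
    storage: Box<dyn Storage>,
}

impl Core {
    pub fn new(storage: Box<dyn Storage>) -> Self {
        Self { storage }
    }

    pub fn register(&mut self, expr: &str) -> u64 {
        self.storage.get_or_insert(expr)
    }
}

/// The CLI layer is the only place that maps a backend name to an implementation.
pub fn core_for_backend(name: &str) -> Option<Core> {
    match name {
        "memory" => Some(Core::new(Box::new(InMemory::default()))),
        _ => None, // unknown or not-yet-supported backend
    }
}
```

With this shape, adding another backend would only touch the `match` in the CLI layer, assuming the new backend can implement the same trait.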
I don't know how detailed you want it to be -- could you please help document PlanNode to set common expectations for us?
Every single field should have an explanation of why we need it to be there and how code should interact with it. For example, I had no idea that typ: T is a generic type tag that other developers are allowed to choose. I thought for the longest time that the generic type WAS the tag. That is a subtle difference that is not at all obvious, because when you see a type like Foo<T>, you expect it to have some sort of "wrapping" behavior around T (think Vec<T> or Option<T>), but that is not what is happening here.
Why do the children and predicates need to be vectors? What does "materialized" actually mean? I now know what it means, but if someone comes along and looks at it, they would have no idea.
These are some of the things that should be written down. Or if it is already written down, there should be a link to where it is.
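As an illustration of the `typ: T` point above, here is a rough sketch of what a documented `PlanNode`-like struct could look like. This is not the actual optd definition: only the names `PlanNode`, `typ`, `children`, and `predicates` come from this thread, while `MyNodeType`, `Pred`, `leaf`, and `node_count` are hypothetical.

```rust
/// Hypothetical tag type chosen by a downstream developer.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MyNodeType {
    Scan,
    Filter,
    Join,
}

/// Illustrative stand-in for a predicate; the real representation is not shown here.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Pred(pub String);

/// Sketch of a `PlanNode`-like struct.
///
/// Unlike `Vec<T>` or `Option<T>`, this type does not "wrap" a `T`.
/// `T` is a tag type picked by the user, and each node stores one
/// value of it in `typ` to say what kind of node it is.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PlanNode<T> {
    /// The node's kind, expressed in the user's own tag type.
    pub typ: T,
    /// Child plan nodes, tagged with the same `T`.
    pub children: Vec<PlanNode<T>>,
    /// Predicates attached to this node.
    pub predicates: Vec<Pred>,
}

impl<T> PlanNode<T> {
    /// Builds a node with no children and no predicates.
    pub fn leaf(typ: T) -> Self {
        Self { typ, children: Vec::new(), predicates: Vec::new() }
    }

    /// Counts this node plus all of its descendants.
    pub fn node_count(&self) -> usize {
        1 + self.children.iter().map(|c| c.node_count()).sum::<usize>()
    }
}
```

Doc comments at this level of detail are the kind of per-field explanation being asked for. The sketch does not answer why the real `children` and `predicates` are vectors or what "materialized" means; those still need to come from the authors.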
Thinking about the new starting point of optd, let's build something from an MVP instead of getting everything working from the start.
We have a naive and a persistent memo table. The memo table doesn't store costs or properties for now. It finds duplicates. It's async.
On the persistent side, we use sqlx to run the queries against in-memory SQLite, which eliminates external dependencies when running tests. The persistent memo table currently supports deduplicating predicates.
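For readers unfamiliar with what "finds the duplicates" means for a memo table, a minimal sketch of the naive (in-memory) variant might look like the following. It is an assumption-laden illustration, not the optd code: `Expr`, `GroupId`, `NaiveMemo`, and `add_expr` are hypothetical names, the sketch is synchronous for brevity even though the real table is async, and it stores no costs or properties.

```rust
use std::collections::HashMap;

/// Hypothetical group identifier.
pub type GroupId = usize;

/// Hypothetical memo expression: an operator plus the groups of its children.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Expr {
    pub op: String,
    pub children: Vec<GroupId>,
}

/// Naive memo table: maps each distinct expression to its group.
#[derive(Default)]
pub struct NaiveMemo {
    expr_to_group: HashMap<Expr, GroupId>,
    next_group: GroupId,
}

impl NaiveMemo {
    /// Adds `expr` and returns `(group, true)` if it was new,
    /// or `(existing_group, false)` if an identical expression was already stored.
    pub fn add_expr(&mut self, expr: Expr) -> (GroupId, bool) {
        if let Some(&group) = self.expr_to_group.get(&expr) {
            return (group, false); // duplicate: reuse the existing group
        }
        let group = self.next_group;
        self.next_group += 1;
        self.expr_to_group.insert(expr, group);
        (group, true)
    }
}
```

A persistent version would presumably express the same lookup-then-insert step as SQL against the database, which is where indexes on the lookup columns could start to matter.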