mikesol opened this issue 2 months ago
Hey there, yes, I still use this for https://mero.chat so it is maintained.
The library is older than visible type applications so yeah I am open to anything that could speed up compilation/simplify the API.
Cool, I'll make a PR this week 👍
Nice!
Started taking a look this morning 🌞
It's tricky because the proxies are used to build up lots of stuff. For example, there's no clean VTA way to turn `u ... id` into `"u" ... "id"` unless you do `@("u" ... "id")` and make `...` an infix type operator. The VTAs would need to apply to even larger types, meaning you'd need `@("u" ... "id" # As "id")`. That gets thorny because `As` needs to pull double duty, meaning both `@("u" ... "id" # As "foo")` and `@("id" # As "foo")` are valid. While it's possible to construct these types, it means that the constraints need to move from `ToAs` on `as` to whatever is consuming the VTA, i.e. `select`.
But then you have two problems: the constraints get checked on `select` instead of the `as`, and any custom `Fail` messages surface there too, far from the actual mistake.
The endgame is that you wind up basically reconstructing an SQL parser at the typelevel, which is a stone's throw away from parsing a typelevel string containing SQL. The latter could be pretty fun - that's how, for example, I made `pursx` in Deku. I always copy/paste SQL from ChatGPT, so this'd be a lifesaver.
Anyway, I digress. While VTAs are cool, they create a yak that just gets hairier and hairier the more you try to shave it. What do you think?
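For concreteness, here's a rough sketch of the shapes above. Every name in it (`Dot`, `...`, the poly-kinded `As`) is hypothetical and made up by me, not Droplet's actual API:

```purescript
-- Hypothetical sketch of VTA-friendly types for the shapes above.
-- A qualified column "u"."id" as an applied type:
data Dot :: Symbol -> Symbol -> Type
data Dot table col

infix 6 type Dot as ...

-- The double duty is the thorny part: `As` must alias both a qualified
-- column (kind Type) and a bare column (kind Symbol), so it ends up
-- poly-kinded:
data As :: forall k. k -> Symbol -> Type
data As col alias
```

With that, `select` would take the whole applied type, e.g. `select @(As ("u" ... "id") "user_id")`, which is exactly how the `ToAs` constraints end up migrating onto `select`.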
So your plan was to use type applications instead of proxies for the selected columns?
While currently every column must have a name, the only place that actually requires aliases is when joining. This is a gigantic hack, so maybe it could work if you moved all the logic to the last-minute checks in `ToQuery`.
> which is a stone's throw away from parsing a typelevel string containing SQL
I remember trying this at first and it being way too slow to compile (and harder to compose). Maybe it is better now? I had a re-evaluation of the current approach on my todo list for the semi-near future. Breaking changes are fine as long as it is still type safe and (somewhat) composable.
The compiler has improved a lot, pursx compiles reasonably fast.
The nice thing about raw SQL these days is that models like GPT-4 and Claude are trained on bucketloads of it, so you go much faster if you can work with that.
I dunno about LLMs, but I'd take SQL strings just 'cos they're more familiar to write. I sketched an approach with type-level string queries and VTAs for both table definitions and parameters. Whenever I have the time I will try to implement it, but it might not be soon.
Nice!
I thought a bit about it as well. Here are my initial musings...
You can define a db's tables like so:
```purescript
type User = { id :: String, active :: Boolean, username :: String }
type Friendship = { id :: String, user1 :: From User "id", user2 :: From User "id" }
```
And then construct a world that contains all of the tables as proof:
```purescript
db = world @(User : Friendship : Nil) -- typelevel list
```
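As an aside, the `:`/`Nil` typelevel list and the `world` function could themselves be sketched like this (all of the names are assumptions on my part, not an existing API):

```purescript
-- A kind for type-level lists of table row types.
data TList

foreign import data Nil :: TList
foreign import data Cons :: Type -> TList -> TList

infixr 6 type Cons as :

-- `World` is a phantom type carrying the schema; `world` exists only
-- to pin it down via a visible type application.
data World (tables :: TList) = World

world :: forall @tables. World tables
world = World
```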
Then, at some point you ask GPT to construct your query: https://chatgpt.com/share/02e8d2b9-cf73-4ab6-9435-f09adfb1ebe3.
Paste it in:
```purescript
res = query db @"""SELECT
u1.id AS user_id,
u1.username AS username,
u2.id AS friend_id,
u2.username AS friend_username
FROM
Friendship f
JOIN
User u1 ON f.user1 = u1.id
JOIN
User u2 ON f.user2 = u2.id
WHERE
u2.active = true;
"""
```
And `res` would automatically be of type `Array { user_id :: String, username :: String, friend_id :: String, friend_username :: String }` from introspecting the query.
GPT basically never gets these queries wrong, so it may be enough just to focus on the columns to construct the return type without needing to validate any of the other stuff.
Actually, even what I'm proposing above may be overkill. If you do correct RAG, you don't even need that. Check out https://chatgpt.com/share/4289e5f6-72e0-450a-812e-e0fa8aa086e7 . It correctly generates:
```purescript
query db @(Array { id1 :: String, id2 :: String }) @"""
SELECT f.user1 as id1, f.user2 as id2
FROM Friendship f
JOIN User u ON f.user2 = u.id
WHERE u.active = true
"""
```
At which point the PS framework can be super small.
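Concretely, "super small" could mean the public surface is a single function. This is only a sketch, and `ParseSql`, `CheckQuery`, and the `World` type behind `world` are all names I'm making up:

```purescript
-- Hypothetical sketch of the framework's whole surface area.
-- Parse the Symbol into a typelevel AST:
class ParseSql (sql :: Symbol) (ast :: Type) | sql -> ast

-- Check the AST against the schema and derive the result row:
class CheckQuery (ast :: Type) (tables :: Type) (result :: Row Type) | ast tables -> result

-- At runtime, query would delegate to an unchecked driver; all the
-- safety lives in the two constraints above.
query
  :: forall @sql tables ast result
   . ParseSql sql ast
  => CheckQuery ast tables result
  => World tables
  -> Aff (Array (Record result))
```

Note that `result` is derived here rather than annotated; the `@(Array {...})` annotation in the example above could instead just be checked for equality against the derived row.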
If you just need something to automatically parse query results into objects, there already exists `Droplet.Driver.Unsafe` that does exactly that.
I was thinking of how to keep the interface type safe but switch to type-level strings. The tables and columns and whatnot could stay as they are. I think the only new heavy lifting would be parsing the SQL query string.
Yeah, exactly. The table stuff wouldn't change much. The typelevel parser would be annoying for the same reason writing any parser by hand is annoying. If you get it into a typelevel ADT, validating it against a schema and generating the return type should be less painful.
> validating it against a schema
Do you mean querying the database schema?
Yeah, for example if you parse the typelevel string into a typelevel ADT that has a SELECT statement of columns foo and bar from table baz, the next step (still at the typelevel) would be to look at the schema and verify there is in fact a table baz with columns foo and bar (and also check their types if there are WHERE clauses or joins).
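For instance (with `Validate`, `LookupTable`, and `HasColumn` all being invented names), that step might look like a single constraint-solving pass over the AST:

```purescript
-- Hypothetical: resolve each table/column reference in the parsed AST
-- against the schema, emitting a custom type error on a miss.
class Validate (ast :: Type) (schema :: Type)

-- e.g. for a parsed `SELECT foo, bar FROM baz`, an instance shaped like:
-- instance
--   ( LookupTable "baz" schema cols
--   , HasColumn "foo" cols
--   , HasColumn "bar" cols
--   ) =>
--   Validate (Select selectedCols fromTable) schema
```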
Ah okay, so how it already works. I thought you meant asking the actual database, which is a thing I saw some Haskell library do.
Today I started putting together a little project to rewrite the HTML parser in Deku. It's going to be a full-fledged typelevel parser. When I finish it (maybe over the weekend), take it for a spin and see if it works with a rudimentary SQL ADT. If so, it'd be great if a single typelevel parser could be used for both projects.
Here's the repo: https://github.com/mikesol/purescript-tldr
There are a few tests already of the combinators and matchers.
Nice! That looks great. I am a bit busy with https://github.com/easafe/purescript-flame/pull/87 but after that I will try it for sure
Just finished the initial work on the repo. It's completely undocumented, so whenever you have time to dip into it, let me know and we can sync on how it works. That'll be a good time for me to add docs and format error messages a bit better. There's tests for everything, though.
A few things to know:
- It follows `purescript-parsing` naming conventions.
- There's a `Fix` operator. That'll be important for the `WHERE` clause. There's an example in the tests.

That was fast; incredible work. It should make it easier to write a more proper type checker too. How are you finding compilation times so far?
The compilation times are manageable so far, the tests have lots of examples and compile reasonably fast in my IDE.
Had a bit of time to stress test the lib with `CREATE` statements: https://github.com/mikesol/purescript-tldr/blob/a4d0b73c67ab7a093c9c6d1ba248263b97f06f11/test/Test/Sql1.purs#L77. It's quite fast; I didn't notice any lag in my IDE on recompilation. My hope is to establish the world of tables this way, which will make it easy to use tools like `apgdiff` for migrations. What'd be even better is if `vite` could somehow inject a full `.sql` file into the type so that you could keep all the table definitions outside of PureScript.
If you hover over `mytables0` in the IDE, it generates this type, which IMO is usable for downstream parsing:

```purescript
Proxy (Success (Cons (TableDef (Proxy "foo") (Cons (ColumnDef (Column (Proxy "id")) SInt) (Cons (ColumnDef (Column (Proxy "lastname")) SText) (Cons (ColumnDef (Column (Proxy "firstname")) SText) Nil)))) (Cons (TableDef (Proxy "bar") (Cons (ColumnDef (Column (Proxy "id")) SInt) (Cons (ColumnDef (Column (Proxy "time_created")) SInt) (Cons (ColumnDef (Column (Proxy "nickname")) SText) Nil)))) (Cons (TableDef (Proxy "baz") (Cons (ColumnDef (Column (Proxy "id")) SInt) (Cons (ColumnDef (Column (Proxy "username")) SText) Nil))) (Cons (TableDef (Proxy "qux") (Cons (ColumnDef (Column (Proxy "id")) SInt) (Cons (ColumnDef (Column (Proxy "user1")) SText) (Cons (ColumnDef (Column (Proxy "user2")) SText) (Cons (ColumnDef (Column (Proxy "user3")) SText) (Cons (ColumnDef (Column (Proxy "user4")) SText) (Cons (ColumnDef (Column (Proxy "user5")) SText) Nil))))))) (Cons (TableDef (Proxy "goo") (Cons (ColumnDef (Column (Proxy "id")) SInt) (Cons (ColumnDef (Column (Proxy "last_updated")) SInt) Nil))) (Cons (TableDef (Proxy "fooA") (Cons (ColumnDef (Column (Proxy "id")) SInt) (Cons (ColumnDef (Column (Proxy "lastname")) SText) (Cons (ColumnDef (Column (Proxy "firstname")) SText) Nil)))) (Cons (TableDef (Proxy "barA") (Cons (ColumnDef (Column (Proxy "id")) SInt) (Cons (ColumnDef (Column (Proxy "time_created")) SInt) (Cons (ColumnDef (Column (Proxy "nickname")) SText) Nil)))) (Cons (TableDef (Proxy "bazA") (Cons (ColumnDef (Column (Proxy "id")) SInt) (Cons (ColumnDef (Column (Proxy "username")) SText) Nil))) (Cons (TableDef (Proxy "quxA") (Cons (ColumnDef (Column (Proxy "id")) SInt) (Cons (ColumnDef (Column (Proxy "user1")) SText) (Cons (ColumnDef (Column (Proxy "user2")) SText) Nil)))) (Cons (TableDef (Proxy "gooA") (Cons (ColumnDef (Column (Proxy "id")) SInt) (Cons (ColumnDef (Column (Proxy "last_updated")) SInt) Nil))) Nil)))))))))) "")
```
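If it helps, downstream code might peel off the parser's `Success` wrapper with a functional dependency. `SchemaOf` is a made-up name here, and I'm guessing at `Success`'s shape from the hover output above:

```purescript
-- Hypothetical: extract the table list from a fully-consumed parse
-- (the "" meaning no leftover input).
class SchemaOf (parsed :: Type) (tables :: Type) | parsed -> tables

instance SchemaOf (Proxy (Success tables "")) tables
```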
That's pretty good. I will take a look sometime this week.
What solution are you currently using @mikesol? I didn't forget about this, just haven't had the time yet.
Do you mean for SQL? I'm using https://github.com/mikesol/oooooooooorrrrrrrmm .
I was looking for a PS postgres library and this seems like it could be a really good fit! Is the library still maintained? If so, would you be open to using VTAs instead of proxies in some places?