vacationlabs / haskell-webapps

Proof-of-concept code for a typical webapp implemented in various Haskell libraries/frameworks
MIT License
134 stars 21 forks

Using Haskell's type-system to implement caching (and cache-invalidation) correctly #15

Open jfoutz opened 8 years ago

jfoutz commented 8 years ago

If Haskell can't introspect on the database, it can't help with static analysis. There are two obvious ways to fix this. One is to parse SQL. That's not crazy, since Parsec is a nice library, but it seems like a lot of work. Alternatively, you could use a combinator library, sort of in the spirit of the old HTML library: https://hackage.haskell.org/package/html-1.0.1.2/docs/Text-Html.html.

The idea is to make a typeclass with all of the SQL keywords (or just a few very important keywords to start). One implementation is just a pretty printer whose SQL output gets sent to the database directly. So the text would look something like

select ["name", "age"] from ["people"]

The pretty printer would, of course, spit out

select name, age from people;

The point is that a second implementation would produce a data structure that could be walked to determine which tables and rows a specific query depends on. It should then be possible to dump the cache entries related to modified columns.
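A minimal sketch of the two-implementation idea, assuming a single hypothetical `select` combinator. The names `Sql`, `Pretty`, `Deps`, `From` and `peopleQuery` are invented for illustration and are not from any existing library; a real DSL would need many more keywords.

```haskell
import Data.List (intercalate)

-- Hypothetical marker so a query reads like the example: select [...] from [...]
data From = From

from :: From
from = From

-- One class for the SQL keywords; each instance is one interpretation.
class Sql repr where
  select :: [String] -> From -> [String] -> repr

-- Interpretation 1: pretty-print the query as SQL text.
newtype Pretty = Pretty { render :: String }

instance Sql Pretty where
  select cols _ tables = Pretty $
    "select " ++ intercalate ", " cols
      ++ " from " ++ intercalate ", " tables ++ ";"

-- Interpretation 2: record what the query depends on.
data Deps = Deps
  { depTables  :: [String]
  , depColumns :: [String]
  } deriving (Eq, Show)

instance Sql Deps where
  select cols _ tables = Deps tables cols

-- The same query text can be read under either interpretation.
peopleQuery :: Sql repr => repr
peopleQuery = select ["name", "age"] from ["people"]
```

Here `render peopleQuery` yields the SQL string, while `peopleQuery :: Deps` yields the tables and columns a cache layer could index on.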

As a first cut, I don't think row-level invalidation would be too tough to do for where clauses that use '='. Those are the vast majority of updates and deletes. The other cases ('<', '>', '<=', ...) could simply invalidate the whole table.
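That first-cut rule could look roughly like the following. This is only a sketch: the types `Op`, `Where`, `Mutation` and `Invalidation` are hypothetical, the where clause is limited to one condition, and it assumes the compared column is one the cache is actually keyed on.

```haskell
-- Hypothetical comparison operators allowed in a where clause.
data Op = OpEq | OpLt | OpGt | OpLe | OpGe
  deriving (Eq, Show)

-- A single-condition where clause: column, operator, literal value.
data Where = Where String Op String
  deriving (Eq, Show)

-- An update or delete against one table.
data Mutation = Mutation
  { mTable :: String
  , mWhere :: Maybe Where
  } deriving (Eq, Show)

-- What the cache layer should drop after the mutation runs.
data Invalidation
  = InvalidateRow String String String  -- table, column, value
  | InvalidateTable String
  deriving (Eq, Show)

-- '=' gets row-level invalidation; everything else flushes the table.
invalidationFor :: Mutation -> Invalidation
invalidationFor (Mutation t (Just (Where col OpEq val))) = InvalidateRow t col val
invalidationFor (Mutation t _)                           = InvalidateTable t
```

A mutation with no where clause falls through to the second equation, so it also invalidates the whole table.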

saurabhnanda commented 8 years ago

@jfoutz is the problem more complex because you're using HDBC and not a DSL which wraps over SQL? For example, would this be easier to implement in something like Persistent, where the update, insert, or replace API calls know exactly which table and which row in the DB they're mutating?

PS: Shall I merge #12 into this?

jfoutz commented 8 years ago

Yes, that's it exactly. It might be a lot easier to do the row caching trick in Persistent, if you can get hold of the internal data structure that represents the queries and wrap the implementation with the custom caching backend.

PS: Maybe? No strong feelings either way. I guess this is one way the type system can help with caching, so this is probably a subset of that ticket.

saurabhnanda commented 8 years ago

@jfoutz do we want the type-safe system to help us cache objects and invalidate them, i.e. perform the actual operations? Or do we want it to force us to think about cache invalidation whenever an object tagged as "cacheable" at the type level is updated?

The latter might be easier to do with a monad or effects library. Much like IO.

jfoutz commented 8 years ago

IO makes guarantees. The DSL I'm suggesting would make guarantees, and actually do the work.

I'll think about what we might provide to remind people of caching issues. If something is tagged as cacheable, perhaps the developer could be required to implement a flush cache method, or something along those lines. Not an actual guarantee, but a method that must be implemented.
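One way the "method that must be implemented" idea might look, as a sketch only. The `Cacheable` class, the list-backed `Cache`, the `User` type and `updateCached` are all hypothetical names, and the cache here is just an in-memory association list standing in for a real store.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Stand-in cache: key/value pairs held in a mutable reference.
type Cache = IORef [(String, String)]

-- Tagging a type as cacheable obliges the developer to say how to flush it.
class Cacheable a where
  cacheKey   :: a -> String
  flushCache :: Cache -> a -> IO ()

data User = User { userId :: Int, userName :: String }

instance Cacheable User where
  cacheKey u = "user:" ++ show (userId u)
  flushCache cache u =
    modifyIORef' cache (filter ((/= cacheKey u) . fst))

-- Hypothetical update wrapper: performs the write, then always flushes.
updateCached :: Cacheable a => Cache -> (a -> IO ()) -> a -> IO ()
updateCached cache write x = write x >> flushCache cache x

-- Small demonstration: updating user 1 drops only that user's entry.
demo :: IO [(String, String)]
demo = do
  cache <- newIORef [("user:1", "alice"), ("user:2", "bob")]
  updateCached cache (\_ -> return ()) (User 1 "alicia")
  readIORef cache
```

As noted, this is not a guarantee: the compiler only checks that `flushCache` exists, not that it flushes the right entries.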

That's a little risky: as the DB schema evolves, all of the flush-cache methods would have to be revisited. There are some implications there I don't quite understand yet.