Open jfmengels opened 4 years ago
I think this is a good use case for https://package.elm-lang.org/packages/MartinSStewart/elm-codec-bytes/latest/
I plan on releasing a package called elm-serialize, which improves upon elm-codec-bytes. I can write encoders/decoders with it and release them as elm-syntax-serialize (I'm already writing an elm-geometry-serialize, and this is the naming scheme I've settled on).
As I mentioned in https://github.com/stil4m/elm-syntax/pull/55#discussion_r440613435, elm-review parses every file in a project, and then caches the resulting ASTs by storing them on disk. When it restarts, those cache files are read back to avoid having to parse the source files again.
Over at my work project, we have 160k LoC across 600+ modules. When all of this gets cached, the combined disk space used for all these ASTs is about 39MB, which is a lot! (The raw source code is about 5.7MB, FYI.)
I think encoding and decoding the same data with elm/bytes would reduce the amount of space taken. And since reading from disk is (relatively) slow, it would speed up startup, and probably also reduce the time spent writing this data to disk.
I don't have hard data on how much space and time this would save, but I imagine the result would be several times smaller, as we would be able to store the data much more compactly than with JSON.
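To give a rough feel for where the savings would come from, here is a minimal sketch comparing the two encodings for a single value. The Range record, its field names, and the JSON layout are hypothetical and are not elm-syntax's actual types or cache format; only the elm/bytes and elm/json functions are real. It also assumes rows and columns fit in 16 bits, which a real codec could not take for granted.

```elm
module RangeSize exposing (binarySize, jsonSize, roundTrip)

import Bytes exposing (Bytes)
import Bytes.Decode as Decode exposing (Decoder)
import Bytes.Encode as Encode
import Json.Encode as Json


-- Hypothetical stand-in for a source range attached to an AST node.
type alias Range =
    { startRow : Int, startColumn : Int, endRow : Int, endColumn : Int }


example : Range
example =
    { startRow = 120, startColumn = 5, endRow = 120, endColumn = 42 }


-- JSON form: every field name is spelled out again for every value.
toJson : Range -> Json.Value
toJson range =
    Json.object
        [ ( "startRow", Json.int range.startRow )
        , ( "startColumn", Json.int range.startColumn )
        , ( "endRow", Json.int range.endRow )
        , ( "endColumn", Json.int range.endColumn )
        ]


-- Binary form: four unsigned 16-bit integers in a fixed order.
-- ASSUMPTION: values above 65535 never occur.
toBytes : Range -> Bytes
toBytes range =
    Encode.encode
        (Encode.sequence
            [ Encode.unsignedInt16 Bytes.BE range.startRow
            , Encode.unsignedInt16 Bytes.BE range.startColumn
            , Encode.unsignedInt16 Bytes.BE range.endRow
            , Encode.unsignedInt16 Bytes.BE range.endColumn
            ]
        )


-- Reads the four integers back in the same order they were written.
rangeDecoder : Decoder Range
rangeDecoder =
    Decode.map4 Range
        (Decode.unsignedInt16 Bytes.BE)
        (Decode.unsignedInt16 Bytes.BE)
        (Decode.unsignedInt16 Bytes.BE)
        (Decode.unsignedInt16 Bytes.BE)


-- Number of characters in the compact (unindented) JSON string.
jsonSize : Int
jsonSize =
    String.length (Json.encode 0 (toJson example))


-- Number of bytes in the binary form: 4 values of 2 bytes each, so 8.
binarySize : Int
binarySize =
    Bytes.width (toBytes example)


-- Decoding the encoded bytes should give back the original record.
roundTrip : Maybe Range
roundTrip =
    Decode.decode rangeDecoder (toBytes example)
```

The binary form drops the field names and punctuation and relies on the decoder knowing the field order. How much that would save on a full AST cache is exactly the hard data that is still missing.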
Since elm-syntax's AST is not opaque at all, we can try this out in a separate package (or directly in elm-review for that matter), and potentially keep it there forever to avoid having elm/bytes as a dependency of elm-syntax, if that is something we wish to avoid. One of the problems for elm-review, though, is that this data needs to be sent over a port, but ports don't support Bytes. A workaround I heard of is to
Anyway, I wanted to share this need/want of mine. I'll likely tackle this at some point unless someone beats me to it (I have other things to work on for a while :sweat_smile:).