BurntSushi / fst

Represent large sets and maps compactly with finite state transducers.
The Unlicense

Compile to WASM #130

Closed claudius108 closed 3 years ago

claudius108 commented 3 years ago

Hi,

I tested your library from the command line by querying for words in a dictionary, and the results are very good in terms of query time, wildcard search support, and index size.

I would like to compile the library to WASM in order to use it in the browser (as a search engine for static sites), but its size is currently too large (30MB). Do you think the library can be split into an index-building part and a query part, with only the latter being compiled to WASM?

As an example of what I mean, I can mention https://github.com/jameslittle230/stork, which is also written in Rust but cannot query the way your library does.

Thank you, Claudius

BurntSushi commented 3 years ago

> Do you think the library can be split into an index-building part and a query part, with only the latter being compiled to WASM?

The library almost certainly can be split. The question is whether it's worth doing and whether it would actually solve your problem. Here are my thoughts:

If the library really is 30MB as a WASM compiled artifact, then you'd probably need to do a lot of trimming beyond just "separate out the build phase." But to be honest, I have a really hard time believing 30MB to be correct. This library just isn't that big.

> As an example of what I mean, I can mention https://github.com/jameslittle230/stork, which is also written in Rust but cannot query the way your library does.

To be honest, I'd expect Stork to do a lot better. An FST is itself not sufficient to build an IR system like Stork. An FST is merely a (possible) component of such a system. With that said, I've not done any investigation into how Stork works.

claudius108 commented 3 years ago

You are right about the size of the library. I set debug to false under [profile.release], and the size is now 5.8MB. Thank you!
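For readers following along, the profile change described above could look like the fragment below. Only `debug = false` comes from this thread. The remaining keys are standard Cargo profile settings that are often tried when shrinking a `.wasm` artifact; they are included here as assumptions, and their effect on this particular library was not measured in the thread.

```toml
# Cargo.toml (sketch; only `debug = false` is taken from the thread)
[profile.release]
debug = false       # the change reported above: no debug info in the release artifact

# Assumptions, not from the thread: common size-oriented settings to experiment with.
opt-level = "z"     # optimize for size rather than speed
lto = true          # link-time optimization across crates
codegen-units = 1   # slower build, but gives the optimizer more to work with
panic = "abort"     # drop the unwinding machinery
```

Each of these trades build time or runtime speed for size, so it is worth measuring the artifact after every change rather than enabling them all at once.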

I was thinking of separating the build part from the query part, and even from the command-line part, in order to get a really small wasm file to load in the browser. If you say that those parts do not add much to the size, that is fine.

Your library deals VERY nicely with queries. It is what I was looking for, and what I need. I mentioned Stork only as an example of using such a library in the browser.
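To make the build/query split concrete, here is a minimal standard-library sketch. It is not the fst crate's API and it is not an FST: the byte layout and the names `build_index` and `query_prefix` are hypothetical, chosen only to show the shape of the idea. The build step runs offline and emits bytes; the query step needs nothing but a `&[u8]`, so it is the only part that would have to ship to the browser.

```rust
/// Build phase (hypothetical): would run offline, e.g. in a native CLI.
/// Serializes sorted, de-duplicated words as repeated records of
/// [u32 little-endian length][UTF-8 bytes].
pub fn build_index(words: &[&str]) -> Vec<u8> {
    let mut sorted: Vec<&str> = words.to_vec();
    sorted.sort_unstable();
    sorted.dedup();

    let mut out = Vec::new();
    for w in sorted {
        out.extend_from_slice(&(w.len() as u32).to_le_bytes());
        out.extend_from_slice(w.as_bytes());
    }
    out
}

/// Query phase (hypothetical): read-only access to the serialized bytes.
/// Returns every stored word that starts with `prefix`, in sorted order.
/// This is a plain linear scan; a real FST answers such queries by
/// walking the automaton instead.
pub fn query_prefix<'a>(index: &'a [u8], prefix: &str) -> Vec<&'a str> {
    let mut hits = Vec::new();
    let mut pos = 0;

    while pos + 4 <= index.len() {
        // Decode the 4-byte length header of the next record.
        let len = u32::from_le_bytes([
            index[pos],
            index[pos + 1],
            index[pos + 2],
            index[pos + 3],
        ]) as usize;
        pos += 4;

        // Stop on a truncated record rather than panicking.
        let raw = match index.get(pos..pos + len) {
            Some(raw) => raw,
            None => break,
        };
        pos += len;

        // Skip records that are not valid UTF-8.
        if let Ok(word) = std::str::from_utf8(raw) {
            if word.starts_with(prefix) {
                hits.push(word);
            }
        }
    }
    hits
}
```

With this shape, `build_index(&["foo", "bar", "baz"])` would be called once at site-build time, and only `query_prefix` would need to be compiled to WASM and handed the resulting bytes.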

I will try to compile your library to WASM, and I will let you know about the result.

Thank you, Claudius

BurntSushi commented 3 years ago

Yeah please do! I'd love to hear how it goes. Thanks.

claudius108 commented 3 years ago

I did some tests (https://rustwasm.github.io/docs/book/reference/add-wasm-support-to-crate.html), and it looks like compiling to WASM exceeds my current knowledge of Rust, which is zero.

Anyway, your lib is very, very nice...

BurntSushi commented 3 years ago

I don't do WASM myself, but it might help if you could find a small example project and just model it on that?

Note that I'm going to close this issue, but please feel free to keep commenting here if that helps.