lazear / sage

Proteomics search & quantification so fast that it feels like magic
https://sage-docs.vercel.app
MIT License

Support compiling to WebAssembly #77

Closed: TheLostLambda closed this issue 11 months ago

TheLostLambda commented 1 year ago

Heya! This looks like a sick project!

As part of my PhD I'll be spending some time improving the tools available for peptidoglycomics (https://elifesciences.org/articles/70597), and I'll have 4ish months to dedicate full time to that. While I'll have to write a fragment predictor on my own, sage could be an outstanding engine for actually finding the fragments I predict!

Long term though, I plan on building a Web GUI and either running the Rust backend natively through Tauri (https://tauri.app/) or, even better (for accessibility), on WebAssembly. If all of the Rust compiled down to WASM, I could just host a free GitHub Pages website and serve a no-install GUI to anyone interested in using our tool!

While Rust's support for WebAssembly is generally outstanding, some low-level APIs can't be compiled to the WASM platform! Doing some preliminary testing with sage (cargo build --target wasm32-unknown-unknown), I get errors from Mio (https://github.com/tokio-rs/mio) which is pulled in by Tokio and ultimately sage-cloudpath.

I'm not sure if there are other issues lurking, but to be honest, the cloudpath support (if I understand correctly that that's what AWS support depends on) seems like an appropriate bit of functionality to put behind a feature flag! That way we can keep AWS support in by default (and my Tauri app could even use it), but I could also disable the feature in my Cargo.toml to compile everything down to WebAssembly!
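To make the feature-flag idea concrete, here's a rough sketch of how the gating could look on the Rust side. This is not sage's actual code: the feature name `cloudpath`, the `Source` enum, and the functions `parse_source`, `cloud_supported`, and `check_source` are all made up for illustration, and I'm assuming the feature would be declared in Cargo.toml and enabled by default.

```rust
// Hypothetical sketch: the "cloudpath" feature and every name below are
// assumptions for illustration, not taken from sage's real manifest or API.

/// Where an input file lives.
#[derive(Debug, PartialEq)]
pub enum Source {
    Local(String),
    S3 { bucket: String, key: String },
}

/// Classify a path string. This is pure string handling, so it compiles
/// on every target, including wasm32-unknown-unknown.
pub fn parse_source(path: &str) -> Source {
    match path.strip_prefix("s3://").and_then(|rest| rest.split_once('/')) {
        Some((bucket, key)) => Source::S3 {
            bucket: bucket.to_string(),
            key: key.to_string(),
        },
        // Anything that is not of the form s3://bucket/key is treated as local.
        None => Source::Local(path.to_string()),
    }
}

/// Compiled only when the (assumed) feature is enabled. In a real crate the
/// Tokio- and AWS-dependent code would sit behind this same gate.
#[cfg(feature = "cloudpath")]
pub fn cloud_supported() -> bool {
    true
}

/// Fallback when the feature is disabled, e.g. for a WebAssembly build.
#[cfg(not(feature = "cloudpath"))]
pub fn cloud_supported() -> bool {
    false
}

/// Reject S3 paths up front when cloud support was compiled out.
pub fn check_source(path: &str) -> Result<Source, String> {
    match parse_source(path) {
        Source::S3 { .. } if !cloud_supported() => {
            Err(format!("{path}: built without cloud support"))
        }
        other => Ok(other),
    }
}
```

With something like this, a downstream crate could turn off default features in its own Cargo.toml and the Tokio-dependent code would never be compiled, while `check_source("s3://...")` would return a readable error instead of the build failing.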

If you have the time to play with things so that they compile to WebAssembly, that would be great, otherwise I'll consider this a note to myself and something to work on in a few months' time!

Thanks again for the outstanding tool and super helpful blog post about it!

lazear commented 1 year ago

I'm definitely interested in supporting wasm as a target - in fact, I have had a similar idea before. I've been tinkering with an egui based visual debugger for Sage/spectrum visualizer. One of the reasons I chose egui is because it has really good web support... and running a full featured search engine and spectrum visualizer in the browser would be awesome (this is definitely a stretch/long-term goal though).

I just cloned into a fresh folder and ripped out tokio and was able to compile for the wasm32 target. Most of the project follows a "functional core - imperative shell" pattern: the sage-cli crate handles reading mzML files, writing outputs, etc. This allows other Rust-based applications (or Python bindings) to handle their own input formats, etc. The one deviation from this pattern is that the sage_core crate is reading fasta files itself (which depends on sage-cloudpath), and also contains the mzML parsing code (which depends on quick-xml and tokio).
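As a minimal sketch of what "functional core - imperative shell" means for something like fasta handling (the names `parse_fasta` and `read_fasta_file` are hypothetical and this is not the actual Sage API): the core function only sees text that is already in memory, while the shell function is the only place that touches the filesystem.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Functional core: turn FASTA text that is already in memory into
/// (header, sequence) pairs. No file, network, or async dependencies,
/// so this part has nothing that would block a wasm32 build.
pub fn parse_fasta(contents: &str) -> Vec<(String, String)> {
    let mut records: Vec<(String, String)> = Vec::new();
    for line in contents.lines() {
        let line = line.trim();
        if line.is_empty() {
            continue;
        }
        if let Some(header) = line.strip_prefix('>') {
            // A '>' line starts a new record with an empty sequence.
            records.push((header.to_string(), String::new()));
        } else if let Some((_, sequence)) = records.last_mut() {
            // Sequence lines are appended to the most recent record;
            // lines before the first header are ignored.
            sequence.push_str(line);
        }
    }
    records
}

/// Imperative shell: the only function that performs IO. In a split like
/// the one described above, this would live in the CLI crate (or any other
/// front end), which then hands the text to the core.
pub fn read_fasta_file(path: &Path) -> io::Result<Vec<(String, String)>> {
    let contents = fs::read_to_string(path)?;
    Ok(parse_fasta(&contents))
}
```

A browser front end could obtain the text however it likes (a file picker, a fetch) and call `parse_fasta` directly, without ever compiling the shell function.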

We could either put tokio behind a feature flag, or just lift the mzML parsing and fasta parsing out of the sage_core crate entirely - I think this might be the cleanest/most ideologically pure way to do it. I will play around with this when I get some free time and see how it feels - please reach out when you are ready and we can have a quick chat to figure out any other internal API changes that might need to be made.

TheLostLambda commented 1 year ago

I like your idea of pulling out those IO components into their own crate (or somehow bundling them with the other IO) — that's a similar pattern to how rust-bio does things too! They have a central functional library and then some other crates providing a CLI and code for reading high-throughput sequencing files. That seems to work well for them and meant that I could use rust-bio from WASM in a previous project easily! I think I'd prefer that to the feature flag too!

I look forward to updates from your end, and I'll let you know once I start working on this part of my project full-time!

Thanks again for the outstanding work and responsiveness!

lazear commented 11 months ago

You should be able to compile the sage-core crate for a wasm32 target now! I'll continue to work on refactoring some of the IO bits, and see how much can get put behind feature flags - I'd be interested in collaborating on some wasm stuff if/when you are ready (I am a total wasm noob!).

TheLostLambda commented 11 months ago

That's super exciting! I might not be super free until nearer December, but I'm definitely up for working on getting things working in WASM then - I'll be working on building a WASM MS pipeline of my own then, so I'd be more than happy to also help contribute features to Sage that make it even easier to integrate as a library into other projects.

Thanks for all of the hard work and exciting progress!