guillaume-be / rust-bert

Rust native ready-to-use NLP pipelines and transformer-based models (BERT, DistilBERT, GPT2,...)
https://docs.rs/crate/rust-bert
Apache License 2.0
2.6k stars 215 forks

WASM support #92

Open genderev opened 3 years ago

genderev commented 3 years ago

I'm creating this issue because I couldn't figure it out on my own.

  1. Can rust-bert support WASM?

The issue I had trying to refactor this crate on my own is that the idea of resources relies heavily on a file system (which doesn't exist in WASM). I was unsuccessful in trying to refactor this crate to fetch the model from the huggingface site instead of downloading it.

I understand that the contributors of this project may not have the bandwidth to refactor rust-bert for WASM. Therefore:

  2. How do you actually use a model after you download it?

Do you have any resource recommendations that explain how to use a trained model to generate or classify text in a language agnostic way? This would be useful knowledge for if I make my own crate. Thanks!

guillaume-be commented 3 years ago

Hello @genderev ,

Regarding 1., I unfortunately lack the WASM experience to support you in this matter. Please note however that this crate relies on tch-rs bindings to the C++ libtorch library. This in turn seems to rely on hardware-specific compilation that may be problematic for WASM (see https://github.com/LaurentMazare/tch-rs/issues/256 or https://github.com/LaurentMazare/tch-rs/issues/85 for example). Before looking into potential issues with the resources and use of the file system, maybe it would make sense to create a hello world of sorts with a basic tensor calculation using tch-rs?
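Since tch-rs itself needs a local libtorch (the very dependency suspected to block a wasm32 build), here is a plain-Rust stand-in for the arithmetic such a hello world would exercise; the function names are illustrative, and the point is that porting this one computation to tch-rs on the WASM target would surface the compilation problems early:

```rust
// Minimal stand-in for a tch-rs "hello world": a 2x2 matrix multiply in
// plain Rust. The tch-rs equivalent would route the same arithmetic through
// libtorch, which is the part suspected to block a wasm32 build.
fn matmul2(a: [[f32; 2]; 2], b: [[f32; 2]; 2]) -> [[f32; 2]; 2] {
    let mut out = [[0.0f32; 2]; 2];
    for i in 0..2 {
        for j in 0..2 {
            for k in 0..2 {
                out[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    out
}

fn main() {
    // Identity times an arbitrary matrix should return the matrix unchanged.
    let id = [[1.0, 0.0], [0.0, 1.0]];
    let m = [[2.0, 3.0], [4.0, 5.0]];
    println!("{:?}", matmul2(id, m)); // prints [[2.0, 3.0], [4.0, 5.0]]
}
```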

Regarding 2., there are some examples in the documentation on how to use custom model weights (see for example https://docs.rs/rust-bert/0.10.0/rust_bert/gpt2/index.html). The resources return a path to the file location, which you can use to load the configuration, tokenizer and model weights.
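The "resource resolves to a local path" pattern can be sketched in plain Rust without pulling in rust-bert; the `LocalResource` type and `get_local_path` method below are a simplified illustration of that idea (a remote resource would download first, then return its cache path), not the crate's actual definitions:

```rust
use std::path::{Path, PathBuf};

// Hypothetical sketch: each resource (config, vocab, weights) resolves to a
// file path, and those paths are then fed to the model constructors.
struct LocalResource {
    local_path: PathBuf,
}

impl LocalResource {
    // Hand back the on-disk location, failing if the file is absent.
    fn get_local_path(&self) -> Result<&Path, String> {
        if self.local_path.exists() {
            Ok(&self.local_path)
        } else {
            Err(format!("missing resource: {}", self.local_path.display()))
        }
    }
}

fn main() {
    let config = LocalResource { local_path: PathBuf::from("model/config.json") };
    match config.get_local_path() {
        Ok(p) => println!("load config from {}", p.display()),
        Err(e) => println!("{}", e),
    }
}
```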

Do you have any resource recommendations that explain how to use a trained model to generate or classify text in a language agnostic way? This would be useful knowledge for if I make my own crate. Thanks!

Could you please clarify if by language agnostic you mean independent of the language (e.g. English, French, Spanish...) or independent of the target language (e.g. Python, Rust,...)?

genderev commented 3 years ago

@guillaume-be

Before looking into potential issues with the resources and use of the file system, maybe it would make sense to create a hello world of sorts with a basic tensor calculation using tch-rs?

Definitely makes sense. On my to-do list.

Could you please clarify if by language agnostic you mean independent of the language (e.g. English, French, Spanish...) or independent of the target language (e.g. Python, Rust,...)?

I meant independent of the target language (e.g. Python, Rust). This is what I meant: How do you create a processing pipeline with the models for text generation, question answering, classification etc. in any programming language?

Thanks!

guillaume-be commented 3 years ago

Could you please clarify if by language agnostic you mean independent of the language (e.g. English, French, Spanish...) or independent of the target language (e.g. Python, Rust,...)?

I meant independent of the target language (e.g. Python, Rust). This is what I meant: How do you create a processing pipeline with the models for text generation, question answering, classification etc. in any programming language?

Thanks!

You may want to have a look at the ONNX ecosystem. This allows interoperability between the most popular frameworks today. Conversion script examples for some language models from PyTorch to ONNX are available (see https://github.com/huggingface/transformers/blob/master/src/transformers/convert_graph_to_onnx.py).

Pipelines such as text generation, question answering and others go beyond the actual model, and include complex pre- and post-processing steps. These would typically be built in the specific language you are targeting.
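The stages wrapped around the model can be sketched language-agnostically. The toy Rust example below is purely illustrative (whitespace tokenizer, stubbed forward pass, argmax decoding); a real pipeline adds subword tokenization, special tokens, padding/truncation, batching, and so on:

```rust
use std::collections::HashMap;

// Pre-processing: map raw text to token ids via a (toy) vocabulary.
fn preprocess(text: &str, vocab: &HashMap<&str, i64>) -> Vec<i64> {
    text.split_whitespace()
        .map(|w| *vocab.get(w).unwrap_or(&0)) // 0 = unknown token
        .collect()
}

// Stand-in for the model forward pass: returns per-class logits.
fn model_stub(ids: &[i64]) -> Vec<f32> {
    // Pretend "longer inputs look more positive".
    vec![1.0, ids.len() as f32]
}

// Post-processing: argmax over logits, then map back to a label.
fn postprocess(logits: &[f32], labels: &[&str]) -> String {
    let best = logits
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
        .map(|(i, _)| i)
        .unwrap();
    labels[best].to_string()
}

fn main() {
    let vocab = HashMap::from([("hello", 1), ("world", 2)]);
    let ids = preprocess("hello world", &vocab);
    let logits = model_stub(&ids);
    println!("{}", postprocess(&logits, &["negative", "positive"]));
}
```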

genderev commented 3 years ago

Thanks for that link to ONNX. However, I'm interested in learning about this field by writing my own library. Also, I found a PyTorch frontend that compiles to WASM.

Pipelines such as text generation, question answering and others go beyond the actual model, and include complex pre- and post-processing steps.

I'm interested in learning about the processing steps. What are the complex pre- and post-processing steps? Is this a transfer learning problem?

guillaume-be commented 3 years ago

Hi @genderev ,

For an overview of the pre- and post-processing steps, I would recommend having a look at the Transformers library (implemented in Python). You will find implementations of end-to-end pipelines in https://github.com/huggingface/transformers/blob/master/src/transformers/pipelines.py and the generation routines in https://github.com/huggingface/transformers/blob/master/src/transformers/generation_utils.py
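The shape of those generation routines can be shown with a toy greedy decoding loop in plain Rust. The scoring function here is a stub standing in for a model forward pass; real generation code also handles beam search, sampling, repetition penalties, length constraints, and more:

```rust
const EOS: i64 = 0; // illustrative end-of-sequence token id

// Stand-in for a model's next-token logits given the sequence so far:
// cycles through tokens 1..=3, then scores EOS highest after five tokens.
fn next_token_logits(seq: &[i64]) -> Vec<f32> {
    if seq.len() >= 5 {
        vec![9.0, 0.0, 0.0, 0.0] // highest score on EOS
    } else {
        let next = ((seq.last().copied().unwrap_or(0) % 3) + 1) as usize;
        let mut logits = vec![0.0f32; 4];
        logits[next] = 9.0;
        logits
    }
}

// Greedy decoding: repeatedly score the next token, take the argmax,
// append it, and stop at EOS or the length limit.
fn generate_greedy(prompt: &[i64], max_len: usize) -> Vec<i64> {
    let mut seq = prompt.to_vec();
    while seq.len() < max_len {
        let logits = next_token_logits(&seq);
        let next = logits
            .iter()
            .enumerate()
            .max_by(|a, b| a.1.partial_cmp(b.1).unwrap())
            .map(|(i, _)| i as i64)
            .unwrap();
        if next == EOS {
            break;
        }
        seq.push(next);
    }
    seq
}

fn main() {
    println!("{:?}", generate_greedy(&[1], 10)); // prints [1, 2, 3, 1, 2]
}
```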

xloem commented 1 year ago

This package appears to provide PyTorch-like functionality in WASM: https://crates.io/crates/wasm-nn. Additionally, https://crates.io/crates/burn is written in Rust.

aguynamedben commented 1 year ago

This is also interesting: https://github.com/visheratin/web-ai. It downloads models and stores them in IndexedDB via https://github.com/localForage/localForage.

mikkel1156 commented 9 months ago

Should this be possible with the new ONNX backend (I noticed today that it got added), or are there still some parts that rely on the torch API?

guillaume-be commented 9 months ago

Unfortunately the tch bindings are still needed, since the crate relies on them for all tensor operations outside of the model (pre- and post-processing). Rewriting the pipelines to have a non-tch version (e.g. with candle or ndarray) would be possible, but I do not have the bandwidth for such an undertaking at this point. A PR would assuredly be welcome if someone wants to look into this.
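As a concrete illustration of the kind of tensor operation such a rewrite would replace, here is a numerically-stable softmax over a logits slice in plain Rust; pipeline post-processing currently routes operations like this through libtorch, and a candle/ndarray version would swap in implementations of this shape:

```rust
// Numerically-stable softmax: subtract the max before exponentiating so
// large logits do not overflow, then normalize to probabilities.
fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

fn main() {
    let probs = softmax(&[1.0, 2.0, 3.0]);
    // Probabilities sum to 1 and preserve the ordering of the logits.
    println!("{:?}", probs);
}
```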