flojoy-ai / studio

Joyful visual programming for Python
https://docs.flojoy.ai
MIT License

RFC Model deployment #839

Open jjerphan opened 1 year ago

jjerphan commented 1 year ago

Context, use cases, and motivations

This RFC stems from discussions in https://github.com/flojoy-ai/studio/issues/823.

Model deployment is a common necessity in many programmatic workflows. The main Machine Learning frameworks all provide interfaces to easily download and use pre-trained models.

Hugging Face pipelines are higher-level abstractions which give access to many pre-trained models covering a variety of tasks. They allow using state-of-the-art Machine Learning models in a couple of lines, be it on CPUs, on a single GPU, or on several GPUs, and they support various frameworks as back-ends.
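As an illustration (assuming the `transformers` package is installed; the task and input sentence are just examples), a pipeline really is a couple of lines:

```python
from transformers import pipeline

# Instantiate a pre-trained sentiment-analysis pipeline.
# device=-1 runs on CPU; pass a GPU index (e.g. device=0) to use CUDA.
classifier = pipeline("sentiment-analysis", device=-1)

results = classifier(["Flojoy makes visual programming joyful!"])
print(results)  # e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```

The same one-liner pattern applies to other tasks (image classification, ASR, etc.) by changing the task string.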

Due to the kind of problems Flojoy aims to tackle, such pipelines might be a good fit for the problems Flojoy users face. In particular, Hugging Face pipelines exist for:

We might want to focus first on providing an appropriate UX for those problems.

We can later study how to configure GPU and CPU usage, and specifically how to process batches of data and achieve efficient training and inference.

Proposed scope: integrate the most useful Hugging Face pipelines

For all new nodes:

References

Roulbac commented 1 year ago

@jjerphan Thanks for opening this RFC! We have a few facilities to pull models from the HF hub and cache them under ~/.flojoy, example. Happy to work together on this.
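For reference, a minimal sketch of what such a caching facility could look like (the function name and the use of `huggingface_hub` below are assumptions for illustration, not Flojoy's actual implementation):

```python
from pathlib import Path

from huggingface_hub import snapshot_download  # assumes huggingface_hub is installed

# Hypothetical cache location, mirroring the ~/.flojoy convention mentioned above.
FLOJOY_CACHE = Path.home() / ".flojoy" / "models"


def fetch_model(repo_id: str) -> Path:
    """Download a model repo from the HF Hub into the Flojoy cache (idempotent)."""
    FLOJOY_CACHE.mkdir(parents=True, exist_ok=True)
    local_dir = snapshot_download(repo_id=repo_id, cache_dir=str(FLOJOY_CACHE))
    return Path(local_dir)
```

Re-running `fetch_model` for the same `repo_id` hits the local cache instead of re-downloading.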

I am curious what your thoughts are about a few "bring-your-own-model" generic nodes that support .torchscript and .onnx formats for classification, dense segmentation and audio classification.
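For the TorchScript case, such a bring-your-own-model node could be as thin as the following sketch (assuming `torch` is installed; the function wrapper itself is hypothetical, not an existing Flojoy node):

```python
import torch


def run_torchscript_model(model_path: str, batch: torch.Tensor) -> torch.Tensor:
    """Load a user-supplied .torchscript file and run inference on a batch."""
    model = torch.jit.load(model_path)
    model.eval()
    with torch.no_grad():
        return model(batch)
```

An analogous node could wrap `onnxruntime.InferenceSession` for the `.onnx` case.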

jjerphan commented 1 year ago

We have a few facilities to pull models from the HF hub and cache them under ~/.flojoy, example. Happy to work together on this.

Good to know! I am definitely interested in joining efforts so that we keep Flojoy's development consistent.

I think this approach is simple yet has good granularity:

On my side, I started experimenting with https://github.com/flojoy-ai/nodes/pull/253 this morning to have a simple prototype to discuss. I am waiting for critical reviews.

I am curious what your thoughts are about a few "bring-your-own-model" generic nodes that support .torchscript and .onnx formats for classification, dense segmentation and audio classification.

I think everything depends on Flojoy's users. I do not know Flojoy's user base, but I assume that Flojoy users might not be programmers and might just want to use a flow to solve their problem effectively with Nodes that are available off the shelf. In this case, supporting lower-level specifications and tools (like TorchScript's and ONNX's) might have little added value for end users now, though it might in the future. If I were to choose, I would not preemptively work on supporting those formats; I also think users who are developers can easily create their own Nodes, or even contribute support for those formats back to Flojoy.

What do you think?


Feel free to propose additions or rectifications to this RFC, and I will update it accordingly. :slightly_smiling_face:

jackparmer commented 1 year ago

I think everything depends on Flojoy's users. I do not know Flojoy's user base, but I assume that Flojoy users might not be programmers and might just want to use a flow to solve their problem effectively with Nodes that are available off the shelf.

💯 Think biologists or mechanical engineers who have likely heard of Python but don't do any coding or even command-line work themselves.

jjerphan commented 1 year ago

Batching is not recommended when using Hugging Face Transformers' pipeline, especially on CPUs.

Processing batches of files can be done using the BATCH_PROCESSOR. For instance, one can easily classify cats and upload the predictions to Flojoy Cloud. :black_cat:
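A sketch of what such a batch flow does under the hood (the loop below and the classifier stub are assumptions for illustration, not the actual BATCH_PROCESSOR internals):

```python
from pathlib import Path
from typing import Callable


def process_batch(image_dir: str, classify: Callable[[Path], str]) -> dict:
    """Run a classifier over every .jpg file in a directory and collect predictions."""
    predictions = {}
    for image_path in sorted(Path(image_dir).glob("*.jpg")):
        predictions[image_path.name] = classify(image_path)
    return predictions
```

The resulting mapping of file name to prediction could then be uploaded to Flojoy Cloud in a downstream node.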

*(Screenshot from 2023-08-31 18-01-27 attached.)*

jjerphan commented 1 year ago

Although slightly more advanced, the approaches proposed in https://github.com/flojoy-ai/nodes/pull/234/ are also relevant for model deployment, since TorchScript models are also used as an interchange format.

Generally, we need to think about the UX for using models which have been serialized using: