[Post Proposal] ML Pipeline Example for ViT based Image Classification Task (TensorFlow)

deep-diver commented 2 years ago

Hi folks!

@sayakpaul and I wrote a three-part blog post series on the HF blog a while ago about deploying a ViT model locally, on Kubernetes (GKE), and on GCP Vertex AI.

This time, we want to propose an additional blog post series to help ML folks expand their knowledge beyond ML deployment. Basically, we are going to show how to use several major tools from the TensorFlow ecosystem: TensorFlow Extended (TFX), TensorFlow, TensorFlow Data Validation, TensorFlow Model Analysis, TensorFlow Serving, and KerasTuner. Additionally, all of the steps will be demonstrated to work in both local and cloud environments (GCP Vertex AI and Dataflow). It is going to be four blog posts 👇🏼 :

Basic: as the first step, we show how to build an ML pipeline with the most basic components: ExampleGen, Trainer, and Pusher. These components are responsible for ingesting the raw dataset into the ML pipeline, training a TensorFlow model, and deploying the trained model.
Intermediate: as the second step, we show how to extend the ML pipeline from the first step by adding more components: SchemaGen, StatisticsGen, and Transform. These components are responsible for inferring the structure of the dataset, computing the statistical traits of its features, and data pre-processing.
Advanced Part1: as the third step, we show how to extend the ML pipeline from the second step by adding more components: Resolver and Evaluator. These components are responsible for importing existing artifacts (such as a previously trained model) and comparing the performance of two models (one from the Resolver and one from the current pipeline run).
Advanced Part2: as the fourth step, we show how to extend the ML pipeline from the third step by adding one more component, Tuner. This component runs a set of experiments with different hyperparameter combinations for fewer epochs. The best combination found is passed to the Trainer, which then trains the model for longer using that combination as the starting point.
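
To make the progression across the four posts easier to follow, here is a minimal plain-Python sketch. It does not use TFX at all; `STAGES` and `components_up_to` are hypothetical names made up for illustration, and the component names are copied from the list above. It only shows which components are present after each post, not the order in which they run (TFX works that out from the data dependencies between components).

```python
from typing import Dict, List

# Components each post adds, copied from the list above.
# Hypothetical structure for illustration only; this is not TFX code.
STAGES: Dict[str, List[str]] = {
    "Basic": ["ExampleGen", "Trainer", "Pusher"],
    "Intermediate": ["SchemaGen", "StatisticsGen", "Transform"],
    "Advanced Part1": ["Resolver", "Evaluator"],
    "Advanced Part2": ["Tuner"],
}


def components_up_to(stage: str) -> List[str]:
    """Return every component in the pipeline once `stage` has been added."""
    if stage not in STAGES:
        raise KeyError(f"unknown stage: {stage}")
    collected: List[str] = []
    # Dicts preserve insertion order, so stages are visited in post order.
    for name, added in STAGES.items():
        collected.extend(added)
        if name == stage:
            break
    return collected


if __name__ == "__main__":
    for stage in STAGES:
        print(f"{stage}: {', '.join(components_up_to(stage))}")
```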

Also, we will show how to put 🤗-related custom components in the pipeline: one that pushes the trained model to the 🤗 Model Hub, and one that pushes (creates) a Gradio application on 🤗 Spaces automatically.

All of the above steps are already implemented, so if this proposal gets accepted, we can start writing about it. The URL of the ongoing project repository: https://github.com/deep-diver/mlops-hf-tf-vision-models

@osanseviero maybe you could give some opinions about this?

deep-diver commented 2 years ago

@osanseviero

I think four blog posts are too long, so it would be better to compress the whole thing into one blog post focused on the Hugging Face related parts of the TFX pipeline:

How to preprocess the dataset appropriately: using the feature extractor, re-ordering the indices
How to train a ViT model from Hugging Face's Transformers: explaining the project structure, KerasTuner for hyperparameter tuning
How to write a custom component: HFPusher to push the trained model and the Space application to the Hub
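
As a rough illustration of the first item, here is a minimal NumPy sketch of the preprocessing. It rests on two assumptions that are not stated in this thread: that "re-ordering the indices" means moving the channel axis from last to first, the layout the TensorFlow ViT models in Transformers take as `pixel_values`, and that the checkpoint normalizes with a per-channel mean and standard deviation of 0.5. The real values should be read from the checkpoint's feature extractor config. Resizing is left out, and `preprocess` is a hypothetical helper name.

```python
import numpy as np

# Assumed normalization constants; read the real ones from the feature
# extractor config of the checkpoint you actually use.
IMAGE_MEAN = np.array([0.5, 0.5, 0.5], dtype=np.float32)
IMAGE_STD = np.array([0.5, 0.5, 0.5], dtype=np.float32)


def preprocess(images: np.ndarray) -> np.ndarray:
    """Turn a uint8 batch (N, H, W, 3) into a float32 batch (N, 3, H, W)."""
    if images.ndim != 4 or images.shape[-1] != 3:
        raise ValueError("expected a batch shaped (N, H, W, 3)")
    # Scale pixel values from [0, 255] to [0, 1].
    scaled = images.astype(np.float32) / 255.0
    # Normalize per channel; broadcasting applies over the last axis.
    normalized = (scaled - IMAGE_MEAN) / IMAGE_STD
    # Re-order the axes: channels-last -> channels-first.
    return np.transpose(normalized, (0, 3, 1, 2))


if __name__ == "__main__":
    batch = np.random.randint(0, 256, size=(2, 224, 224, 3), dtype=np.uint8)
    print(preprocess(batch).shape)
```

In a TFX pipeline the same logic would normally be written with TensorFlow ops inside the Transform component's preprocessing function, so that it becomes part of the exported graph; the NumPy version here only shows the arithmetic and the axis order.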

What do you think?

osanseviero commented 2 years ago

Hey there! Sorry for the very slow reply on my side. We were discussing this as four blog posts seemed like too much. After looking at the metrics, we think a blog might not be the best place for this information.

We're thinking of a new section in hf.co/docs for this kind of guide-style content on integrations with tools for deployment, labeling, and others such as TFX. Would you be up for putting this idea on hold for a few weeks until we figure out the best way to do this successfully? We think that by doing so the amazing content you create can have a much wider reach.

deep-diver commented 2 years ago

Wow, it would be an honor for us to post something in HF's official documentation! Thanks for the suggestion, and we are looking forward to hearing from the HF team.

deep-diver commented 2 years ago

@osanseviero is there any update?

osanseviero commented 2 years ago

Hey there @deep-diver! Sorry for the delay. No news right now as we're focusing on the Spaces upgrades, but we'll discuss this more in the next 2 weeks. Thanks again for wanting to contribute!

deep-diver commented 2 years ago

@osanseviero

No problem. Please let me know when anything is decided. Cheers! :)

huggingface / blog

[Post Proposal] ML Pipeline Example for ViT based Image Classification Task (TensorFlow) #547