sayakpaul opened this issue 1 year ago
Plus, we are also open to other topics related to Machine Learning pipelines and MLOps systems, to demonstrate various MLOps scenarios with Google technologies such as TFX (an end-to-end ML pipeline framework), Vertex AI Pipelines, etc.
Also, besides model deployment to Vertex AI as mentioned by @sayakpaul, we could talk about TensorFlow Serving in general too. In this case, the topic would be ML model deployment with TensorFlow Serving to local and GKE environments.
Plus, we are also open to other topics related to Machine Learning pipelines and MLOps systems, to demonstrate various MLOps scenarios with Google technologies such as TFX (an end-to-end ML pipeline framework), Vertex AI Pipelines, etc.
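As a concrete reference for the TF Serving part of the discussion, here is a minimal sketch of how a client would call TF Serving's REST predict endpoint. The model name `sentence_encoder`, the host/port, and the input shape are all placeholders for this sketch; it assumes a TF Serving container is already running and serving the model.

```python
import json
from urllib import request

# TF Serving's REST predict endpoint follows the pattern
#   POST http://<host>:8501/v1/models/<model_name>:predict
# with a JSON body of the form {"instances": [...]}.
# "sentence_encoder" is a hypothetical model name for this sketch.
MODEL_NAME = "sentence_encoder"
URL = f"http://localhost:8501/v1/models/{MODEL_NAME}:predict"

def build_predict_request(sentences):
    """Build the JSON payload TF Serving expects for a batch of inputs."""
    payload = {"instances": sentences}
    return json.dumps(payload).encode("utf-8")

def predict(sentences):
    """Send a predict request; requires a running TF Serving container."""
    req = request.Request(
        URL,
        data=build_predict_request(sentences),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["predictions"]

if __name__ == "__main__":
    # Only build (don't send) the payload, so this runs without a server.
    print(build_predict_request(["hello world"]).decode("utf-8"))
```

The same payload shape works for both local Docker deployments and a GKE-hosted TF Serving instance; only the URL changes.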
Yes, a great idea for sure. We have a couple of end-to-end workflows already implemented. We can definitely consider them for the lecture -- discuss motivation, design strategies, key components, etc.
I believe it's important to help the students develop a mindset for dealing with these different scenarios rather than going through codebases from the get go.
I agree with the above.
However, it should not depend too much on Google or AWS tech. If we can use more general tech, it would be better.
That said, if we get substantial benefits from doing so, it's also OK.
We can briefly cover the rest of the points as the other service providers (Azure, AWS, etc.) also offer something similar.
The point is that if there's a better offering, it should be explored since it lets a practitioner focus better. With a managed solution like Vertex AI, authentication, autoscaling, traffic splitting, etc. become easier and more seamless. Hence, the idea.
I kept it to show this point of using targeted managed solutions. It can be realized with other service providers too, but @deep-diver and I are most comfortable with GCP.
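To make the traffic-splitting point concrete, here is a toy sketch of how an endpoint can split traffic between two model versions. The 90/10 split and version names are hypothetical; with a managed solution like Vertex AI you configure this declaratively on the endpoint instead of writing routing code yourself.

```python
import hashlib

# Hypothetical traffic split between two deployed model versions.
# Managed endpoints (e.g., Vertex AI) accept this as configuration;
# this sketch just shows the underlying idea.
TRAFFIC_SPLIT = {"model-v1": 90, "model-v2": 10}  # percentages

def route(request_id: str) -> str:
    """Deterministically route a request to a model version by hashing
    its ID into the [0, 100) range and walking the cumulative split."""
    bucket = int(hashlib.sha256(request_id.encode()).hexdigest(), 16) % 100
    cumulative = 0
    for version, weight in TRAFFIC_SPLIT.items():
        cumulative += weight
        if bucket < cumulative:
            return version
    return version  # fallback; unreachable if weights sum to 100

counts = {"model-v1": 0, "model-v2": 0}
for i in range(10_000):
    counts[route(f"req-{i}")] += 1
print(counts)  # roughly a 90/10 split
```

Hashing the request ID keeps routing sticky (the same request always hits the same version), which matters for reproducible A/B comparisons.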
I think one or two lectures with GCP is OK. However, do you think we can get some GCP credit for students?
Also, how can Vertex AI be used to train the sentence transformer in our demo? I will train the sentence transformers using STS.
Also, how can Vertex AI be used to train the sentence transformer in our demo? I will train the sentence transformers using STS.
Please refer to: https://cloud.google.com/blog/topics/developers-practitioners/pytorch-google-cloud-how-train-and-tune-pytorch-models-vertex-ai
I think one or two lectures with GCP is OK. However, do you think we can get some GCP credit for students?
Happy to condense it into one. I think you'd need to reach out to your local Google Developer Group community manager for this. As far as I remember, GCP already has a student tier, which you might wanna check out.
Please refer to: https://cloud.google.com/blog/topics/developers-practitioners/pytorch-google-cloud-how-train-and-tune-pytorch-models-vertex-ai
This is good. This + serving would make a good single lecture (2 hours).
@deep-diver I'd also love to have serving comparisons covering all possible serving options.
So should we do two lectures (weeks)?
possible serving comparison + evaluation
Elaborate.
@sayakpaul @deep-diver Specifically, Nov 4 and 18. Do you think it's possible?
Fine by me.
I am good with the schedule too :)
@deep-diver FastAPI, SageMaker, GCP serving (Vertex), KServe, Airflow, etc. If there are more, I think we can add them. The more, the better.
Showing everything is not a very good idea IMO. Presenting a workflow and realizing that with a particular set of services is more feasible and approachable. Besides, our (@deep-diver and mine) expertise is in GCP and its related components. So, we will need to discuss this further.
@hunkim Since this is a university lecture, I think it is not ideal to discuss implementation details and how to use specific tools/frameworks. Rather, it would be better to discuss considerations and challenges, and then show how one can use Google tech to handle them as an example.
Here is an example of the serving part:
I think one or two lectures with GCP is OK. However, do you think we can get some GCP credit for students?
One possible way is to let students create a free GCP account before the lecture (we get $300 in free credits every time a new GCP account is created).
@deep-diver Thanks for the outline. Overall, it's OK, but it seems too much for one two-hour lecture. Can we fully focus on the serving part? Assume we have a wonderful model, for example, the small sentence transformer in this repository.
How to serve? FastAPI, Airflow, KServe, ...
After that, we need to talk about performance measures: how to measure them, and what the results would look like in a small example.
Then, I would talk about caching, versioning, multi-model serving (+ A/B testing), GPU utilization, optimization for different CPU architectures, choosing the right number of threads for message queuing and inter-/intra-op parallelism, batch inferencing, etc.
After the deployment, what is the next step? This would be a good topic to go over briefly.
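Of the topics listed above, batch inferencing is easy to demonstrate in isolation. Here is a minimal sketch of server-side micro-batching: requests are queued, collected into batches, and run through the model once per batch. The batch size, flush timeout, and the stub `model_fn` are all hypothetical.

```python
import queue
import threading
import time

MAX_BATCH = 8       # hypothetical batch size
MAX_WAIT_S = 0.01   # flush a partial batch after this long

def model_fn(batch):
    """Stub model: doubles each input. A real server would run one
    forward pass over the whole batch here."""
    return [x * 2 for x in batch]

def batch_worker(requests_q, stop):
    """Collect requests into batches, run the model once per batch,
    and deliver each result back through its per-request queue."""
    while not stop.is_set():
        batch = []
        deadline = time.monotonic() + MAX_WAIT_S
        while len(batch) < MAX_BATCH and time.monotonic() < deadline:
            try:
                batch.append(requests_q.get(timeout=MAX_WAIT_S))
            except queue.Empty:
                break
        if not batch:
            continue
        outputs = model_fn([x for x, _ in batch])
        for (_, reply_q), out in zip(batch, outputs):
            reply_q.put(out)

requests_q = queue.Queue()
stop = threading.Event()
threading.Thread(target=batch_worker, args=(requests_q, stop), daemon=True).start()

def infer(x):
    """Client-side call: enqueue the input and block for the result."""
    reply_q = queue.Queue()
    requests_q.put((x, reply_q))
    return reply_q.get(timeout=1)

print([infer(i) for i in range(5)])  # [0, 2, 4, 6, 8]
```

Batching amortizes per-call overhead on GPUs at the cost of a small added latency bound (`MAX_WAIT_S`); TF Serving offers this natively via its batching configuration.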
I guess @sayakpaul will talk more about the model training side using GCP. Then, could you introduce GCP a bit and show how to sign up and get the credits?
Thanks.
@sayakpaul Could you also outline your two-hour lecture? I really appreciate your help.
@sayakpaul @deep-diver Are you going to introduce Kubernetes at some point in your lectures?
We both will be covering the serving part.
Kubernetes and Docker will be introduced in the serving lectures. If you want to introduce them in any previous lectures that's fine too.
After that, we need to talk about performance measures: how to measure them, and what the results would look like in a small example.
After measuring predictive performance, common things to measure are latency and throughput via load testing. We have experience with that, so we will introduce it.
I don't think an introduction to GCP and how to sign up for free credits should be made part of the lecture -- rather, they should be homework for the students. This gives us more time to focus on the conceptual aspects of serving.
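A minimal sketch of what such a load test measures, using a stub `predict` function in place of a deployed endpoint. Everything here is illustrative (real load testing would drive a live service with a dedicated tool), but the reported numbers, latency percentiles and overall throughput, are the ones the lecture would discuss.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor

def predict(x):
    """Stub standing in for a deployed model endpoint."""
    time.sleep(0.005)  # simulate 5 ms of inference work
    return x * 2

def load_test(n_requests=200, concurrency=8):
    """Fire n_requests at the endpoint with a thread pool and report
    latency percentiles and overall throughput."""
    latencies = []

    def timed_call(i):
        start = time.perf_counter()
        predict(i)
        latencies.append(time.perf_counter() - start)

    t0 = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        list(pool.map(timed_call, range(n_requests)))
    wall = time.perf_counter() - t0

    latencies.sort()
    return {
        "p50_ms": statistics.median(latencies) * 1000,
        "p95_ms": latencies[int(0.95 * len(latencies))] * 1000,
        "throughput_rps": n_requests / wall,
    }

print(load_test())
```

Varying `concurrency` while watching p95 latency is the usual way to find an endpoint's saturation point before autoscaling kicks in.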
@sayakpaul So we won't cover the training part?
How would you divide the serving parts into two lectures?
Specifically, Nov 4 and 18.
Kubernetes and Docker will be introduced in the serving lectures. Sounds good.
After measuring predictive performance, common things to measure are latency and throughput via load testing. We have experience with that, so we will introduce it.
Wonderful!
rather, they should be homework for the students. This gives us more time to focus on the conceptual aspects of serving.
+1
How would you divide the serving parts into two lectures?
Sorry for my late reply. I was away for a short trip.
I think model training is a broader topic than serving. I am unable to think of a structure that would be suitable for the lecture series. But if you have ideas we're all ears.
I suggest @deep-diver and I both do the lectures. That way it will be more fun and interesting. Needless to say, it will help with workload distribution too.
@hunkim
Since @sayakpaul and I are taking one or two parts of the MLOps lecture, I think the training part is not the usual model training but rather re-training or adjustment in response to model or data drift, and possibly AutoML/hyperparameter sweeps (like AutoKeras, KerasTuner) integrated into the MLOps system.
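The drift-triggered part can be sketched without any cloud dependency. Here is a toy check that compares the mean of incoming serving data against the training-time statistics and flags retraining past a threshold; the threshold and the simple mean-shift test are illustrative (production systems use richer tests, e.g. TensorFlow Data Validation in TFX).

```python
import statistics

DRIFT_THRESHOLD = 2.0  # illustrative: flag drift beyond 2 training stdevs

def detect_drift(train_values, serving_values):
    """Flag retraining when the serving-data mean drifts more than
    DRIFT_THRESHOLD training standard deviations from the training mean."""
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    shift = abs(statistics.mean(serving_values) - mu)
    return shift > DRIFT_THRESHOLD * sigma

# Toy data: training distribution centered at 1.0.
train = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.05, 0.95]
in_dist = [1.0, 1.1, 0.9]   # looks like training data
drifted = [3.0, 3.2, 2.9]   # clearly shifted

print(detect_drift(train, in_dist))  # False -> keep serving
print(detect_drift(train, drifted))  # True  -> trigger re-training
```

In a pipeline, a `True` result would kick off the re-training DAG (e.g., a TFX or Vertex AI Pipelines run) rather than just printing.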
@deep-diver
re-training or adjustment in response to model or data drift, and possibly AutoML/hyperparameter sweeps (like AutoKeras, KerasTuner) integrated into the MLOps system.
sounds good.
Can you check out https://github.com/DSA-MLOPS/main/tree/main/main and see if we can use this example for our lectures? For example, deploy it on Google Cloud and add more advanced serving.
Re-training or adjustment can also be done using this sentence transformer model.
It looks like sentence_transformers is built on top of PyTorch. I think we need a TensorFlow model for our parts to leverage GCP and TensorFlow Serving. WDYT @sayakpaul?
Could you @hunkim describe a bit more about the course?
I think it is better to keep the lectures not too specific about the model or task.
It looks like sentence_transformers is built on top of PyTorch. I think we need a TensorFlow model for our parts to leverage GCP and TensorFlow Serving. WDYT @sayakpaul?
We can't leverage TF Serving then. We can still use GCP and Vertex AI and other things like GKE. But not TF Serving.
lectures not too specific about the model or task.
Sure. But for the models, PyTorch would be OK. If necessary, we can mirror the same thing (sentence transformer) using TF.
@sayakpaul @deep-diver Unfortunately, we will open this class next semester (Spring 2023). We will have more time to prepare.
I will let you know when I have a fixed schedule for Spring 2023. Thanks.
Okay.
@sayakpaul @deep-diver I will offer this course in Spring 2023. May I have the honor of having you as guest lecturers in our class for the following two topics? The lecture time is 2:30 PM KST.
(3/31) Google Cloud infra (training/serving)
(4/14) Docker, K8s, Kubeflow, KServe, Airflow, performance evaluation
@hunkim sorry for my late reply. The first half of this year will be quite busy for me. I will have to pass on this one for now.
@sayakpaul I understand.
Then may I ask if you can talk about your work at Hugging Face?
(3/17) Hugging Face (TBA)
@deep-diver Do you think you can cover any of these?
(3/31) Google Cloud infra (training/serving)
(4/14) Docker, K8s, Kubeflow, KServe, Airflow, performance evaluation
@hunkim
I think I can cover the basic infrastructure of GCP (i.e., Pipelines, Artifact Store, Training, Serving) and the system software related to MLOps (i.e., Docker, Kubernetes), but I don't think I can cover Kubeflow, KServe, or Airflow.
However, I could cover TensorFlow Extended (TFX) with some use cases. Since TFX is an end-to-end ML pipeline framework, it covers almost every component of the entire MLOps workflow, while lots of other tools try to solve a specific topic. Furthermore, TFX is great when used with GCP.
@deep-diver wonderful. Can you cover 3/31?
I also saw you are covering Stable Diffusion with Hugging Face or something. Do you think we can talk about that on 3/17 if @sayakpaul is too busy to talk about Hugging Face?
Thanks!
@hunkim
Can you cover 3/31?
I think so. Is that going to be held online?
Do you think we can talk about that on 3/17?
I am not really positive on this. SD is a somewhat difficult topic for me to cover.
Sure, I can talk about it :) Sorry for my late reply.
@deep-diver and I can share the Google Infra for training / serving.
@sayakpaul @deep-diver wonderful! I have fixed the schedule. It's all on Zoom, Fridays 1:30-4:00 PM HKT.
(3/17) Hugging Face (Sayak Paul, Hugging Face)
(3/31) Google Cloud infra training/serving (Chansung Park / Sayak Paul)
Thanks! I will invite you to Google Calendar soon.
@sayakpaul looking forward to meeting you next week. Would you mind sharing your lecture title and abstract? It's a three-hour lecture slot, so you can talk for one hour plus a couple of hours of Hugging Face exercises, or you can use the full three hours.
See you soon.
@deep-diver, @hunkim
Given our experience with using GCP for various end-to-end ML workflows, I think we could cover the following things:
I would mainly try to discuss the conceptual and architectural components of the above, since I think that is where the students will find the most value. They will gain a better understanding of the approaches, which they can then implement themselves. Of course, we will supplement our lectures with code so that they always have references.
WDYT?