Open banditelol opened 1 year ago
PyData is one of the better sources of videos out there. I still need to curate the content from time to time, but this talk is one of the best and most practical ones I have found. It also pushed me to add Kjell Wooding to my list of followed people.
In this talk he guides us step by step through the points that make a good tool for a data science workflow, and how to implement them. In total there are 7 points:
- `environment.yaml` for your conda stuff
- `make` for your environment management, because face it, you don't remember the snippet to create a new environment from `environment.yaml`, do you?
- `make update_env`, or in Poetry's case you can use `poetry add` to do similar stuff. The neat thing is that both can be abstracted behind a Makefile
- lockfiles: you still write what you want in a human-readable interface (`environment.yaml`, `requirements.txt`, etc.), but you need an additional file for the dependencies that were actually installed (`env.lock.yaml`, `req-v1-2.lock.yaml`, etc.)

This talk is also linked with Love Your (Data Scientist) Neighbour - Amy Wooding | PyData Global 2021, which defines it with 6 stages of reproducibility issues:
- `README` and LICENSE
- `create_env` and env.yaml
- `Dataset` recipe?
- `make test`
- `src` module by editable install
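Both talks rely on recording what is actually installed next to what you asked for. Here is a minimal Python sketch of that lockfile idea, not code from either talk. It assumes a pip-style `name==version` lock format rather than the `env.lock.yaml` mentioned in the notes. The helper names (`parse_lock`, `installed_versions`, `drift`) are hypothetical.

```python
from importlib import metadata


def parse_lock(text):
    """Parse `name==version` lines into a dict, skipping blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.strip()] = version.strip()
    return pins


def installed_versions(names):
    """Look up the installed version of each package (None if not installed)."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def drift(locked, installed):
    """Return {name: (locked_version, installed_version)} for every mismatch."""
    return {
        name: (version, installed.get(name))
        for name, version in locked.items()
        if installed.get(name) != version
    }
```

A `make test` style target could call `drift(parse_lock(lock_text), installed_versions(locked_names))` and fail when the result is non-empty, so an environment that has quietly diverged from its lockfile is caught early.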
The accompanying repo can be found here.

I've been wanting to implement MLflow for managing my ML services, but I haven't had the cognitive bandwidth to do it for real yet. So for now I'll stay in consumption mode and enjoy the talk. Anyway, I'll try to follow along with the hands-on if possible.
Kei Nemoto (github.com/box-key) is a DS at the Montefiore Einstein Center for health. The code for this talk is in box-key/pydata-kubernetes-mlflow.
What additional problems are introduced by this solution?
What is a worker vs. a node? And what is a kubelet?
`deployments`.
`kubectl apply`, and a deployment will generate several pods. So does this assume the Nodes and the Control Plane are already set up?
But how much of a bottleneck is this service, and how do we scale it? Also, is it only available inside Kubernetes?
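To make the deployment-to-pods relationship concrete, here is a minimal sketch that builds a Deployment manifest in Python. This is not the manifest from the talk's repo. The name `mlflow-server`, the image, and the port are placeholder assumptions; only the field layout follows the standard `apps/v1` Deployment schema.

```python
import json


def deployment_manifest(name, image, replicas, port):
    """Build an apps/v1 Deployment as a plain dict.

    `replicas` is the number of pods the Deployment keeps running; the
    selector labels must match the pod template labels so the Deployment
    can find the pods it owns.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,  # placeholder image, not from the talk
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


def to_json(manifest):
    """Serialize the manifest; kubectl accepts JSON as well as YAML."""
    return json.dumps(manifest, indent=2)


# Hypothetical values for illustration only.
manifest = deployment_manifest("mlflow-server", "example.invalid/mlflow:latest", 3, 5000)
```

Writing `to_json(manifest)` to a file and running `kubectl apply -f` on it would ask the cluster for three pods. That does presuppose a working control plane and at least one node to schedule them on.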
How should we define what is "the same model"?
Welp. It's a lot more Kubernetes-related than I expected. I thought it would be more of a practical, hands-on MLflow-on-Kubernetes talk. Anyway, the question on CIDR and how Kubernetes networking works led me to this AWS page about VPC. It's worth a read to better understand how the IP addressing components interact.
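For the CIDR part, a small sketch with Python's standard `ipaddress` module shows how a prefix length maps to an address range and how a larger block is carved into subnets. The `10.0.0.0/16` block and the `/24` split are example values chosen here, not taken from the talk or the AWS page.

```python
import ipaddress


def describe_block(cidr):
    """Return (first address, last address, total address count) for a CIDR block."""
    net = ipaddress.ip_network(cidr)
    return str(net.network_address), str(net.broadcast_address), net.num_addresses


def split_block(cidr, new_prefix):
    """Split a CIDR block into equally sized subnets with the given prefix length."""
    net = ipaddress.ip_network(cidr)
    return [str(subnet) for subnet in net.subnets(new_prefix=new_prefix)]


def find_subnet(address, subnets):
    """Return the first subnet containing the address, or None if there is none."""
    ip = ipaddress.ip_address(address)
    for subnet in subnets:
        if ip in ipaddress.ip_network(subnet):
            return subnet
    return None


# A /16 leaves 16 host bits, so it spans 2**16 addresses.
first, last, count = describe_block("10.0.0.0/16")

# Splitting it into /24 subnets yields 2**(24 - 16) = 256 blocks of 256 addresses.
subnets = split_block("10.0.0.0/16", 24)
```

The same arithmetic applies whether the block is a VPC range, a subnet, or the pod and service ranges of a cluster: the shorter the prefix, the more addresses are available to hand out.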
Conference Notes
This issue will contain all the notes I took, plus any additional interesting ideas I found, while watching conferences either in person or online on YouTube.