numfocus / YouTubeVideoTimestamps

Adding timestamps to NumFOCUS and PyData YouTube videos!
https://www.youtube.com/c/PyDataTV

AutoML: from data acquisition to predictions in production in a few clicks - Dr. Georgina Tryfou #120

AtriSaxena opened 2 years ago

AtriSaxena commented 2 years ago

Video link: https://www.youtube.com/watch?v=Zth8ZG9q4EY

Video Title: AutoML: from data acquisition to predictions in production in a few clicks - Dr. Georgina Tryfou

Contents:

0:00 Waiting
4:45 About PyData Cyprus
8:55 Waiting
9:38 Welcome
10:21 Mission of ML
11:12 What do we offer?
11:54 Customer Journey
15:34 ML Training: The Manual Process
19:34 ML Training: The Automated Process
21:22 The AutoML Library
23:06 AutoML: Preprocessing pipeline
27:12 AutoML: Modelling pipeline
30:15 AutoML: Classification Pipeline configuration
32:20 MLFlow: Introduction & Integration
34:16 MLFlow: Demo
37:12 AutoML: Results on Titanic Kaggle Dataset
38:02 Problem 1: Intelligent Lead Scoring
38:47 Problem 1: AutoML Results on Lead Scoring problem
41:01 Problem 2: Intelligent Churn Prediction
41:29 Problem 2: AutoML Results on Churn Prediction problem
43:06 Conclusion
43:43 Contact Us
44:49 Q&A 1
47:12 Q&A 2
50:12 Q&A 3
53:36 Q&A 4

Q1. Could we use this methodology in combination with NLP to automate a repository, e.g. to sort article processing, and fork other repositories for a research community?

Q2. How easy is it to integrate something new? Every now and then we see, for example, PyTorch or TensorFlow coming out with new models. How easy is it to integrate such a model into AutoML when it is not out of the box and not a standard model, so you need some understanding of what is going on? Or graph neural networks, for example, which are not the typical conventional machine learning models; I guess some work needs to be done to incorporate these new models?

Q3. In one slide I saw that before training your model you may do dimensionality reduction or standardize your data. Is there any chance that, despite gaining better performance scores, you lose interpretability in your results?
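On Q2, the talk's AutoML library is internal, so its exact extension API is not shown in the video; but in scikit-learn-based AutoML stacks the usual route is to wrap the new model in an estimator that implements `fit`/`predict`, after which the search can tune and compare it like any built-in model. A minimal sketch, assuming only the standard scikit-learn estimator contract; the nearest-centroid logic and the `shrinkage` hyperparameter are hypothetical stand-ins for arbitrary custom training code (a PyTorch loop, a GNN, etc.):

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin


class CustomModelWrapper(BaseEstimator, ClassifierMixin):
    """Adapter exposing the scikit-learn contract so a pipeline-based
    AutoML search can treat a non-standard model as a regular estimator."""

    def __init__(self, shrinkage=0.0):
        # Parameters set in __init__ are what the AutoML search can tune;
        # `shrinkage` is a hypothetical example hyperparameter.
        self.shrinkage = shrinkage

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes_ = np.unique(y)
        # Custom training code goes here; as a stand-in we learn class
        # centroids, optionally shrunk toward the global mean.
        global_mean = X.mean(axis=0)
        self.centroids_ = np.stack([
            (1.0 - self.shrinkage) * X[y == c].mean(axis=0)
            + self.shrinkage * global_mean
            for c in self.classes_
        ])
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Assign each sample to the nearest learned centroid.
        dists = np.linalg.norm(X[:, None, :] - self.centroids_[None, :, :], axis=2)
        return self.classes_[dists.argmin(axis=1)]
```

Once wrapped this way, the model drops straight into a `Pipeline` or a `GridSearchCV` over `shrinkage`, which is typically all a pipeline-driven AutoML layer needs; how much work that wrapping takes is exactly the question Q2 raises.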
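On Q3, the interpretability cost is visible directly in code: once a PCA step sits between the raw columns and the classifier, the fitted coefficients refer to principal components rather than to the original features. A minimal sketch using scikit-learn on synthetic data (the dataset and the step choices are illustrative, not the speaker's configuration):

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=0)

plain = LogisticRegression(max_iter=1000)
piped = make_pipeline(StandardScaler(), PCA(n_components=5),
                      LogisticRegression(max_iter=1000))

print("raw features :", cross_val_score(plain, X, y).mean())
print("scaled + PCA :", cross_val_score(piped, X, y).mean())

# After fitting, the coefficients live in component space, not feature space:
piped.fit(X, y)
coef = piped.named_steps["logisticregression"].coef_
print("coefficients per PCA component (not per original column):", coef.shape)
```

Whether the cross-validated score improves depends on the data, but the coefficient shape makes the point: any feature-level explanation now has to be mapped back through the PCA loadings, which is the loss of interpretability the question is about.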