Open ffrank89 opened 8 months ago
url: https://www.youtube.com/watch?v=qUL7QabcKcw
1:33 - Introduction
2:01 - Before we start
(Link to the Jupyter blog referenced: https://blog.jupyter.org/ploomber-maintainable-and-collaborative-pipelines-in-jupyter-acb3ad2101a7)
3:28 - Problem: Jupyter Notebook code reviews are confusing
5:15 - Problem: It is difficult to collaborate over Jupyter Notebooks
6:43 - Solution: Scripts as notebooks (use jupytext to allow source code to be .py files instead of .ipynb)
8:03 - Solution: Modularization (break down a Jupyter Notebook into multiple files with clear boundaries)
11:28 - Solution: Testing (Ploomber allows users to embed data quality tests)
12:55 - Solution: Reproducibility and Collaboration
13:25 - Demo
14:08 - Starting a new project
14:42 - pipeline.yaml explanation
18:05 - Demoing Scripts as notebooks
19:00 - ploomber plot
20:39 - Automatically created cells based on upstream dependencies and pipeline.yaml preferences
22:36 - Showing a different pipeline that has some logic
24:51 - How one would run their pipeline
26:20 - Demoing incremental builds feature (modularization)
28:31 - Demoing testing
29:20 - Cloud
30:01 - Conclusion
30:30 - Questions begin
30:36 - Q1: Can tasks be executed in parallel?
31:14 - Q2: How is this different from Elyra?
33:28 - Q3: Is there a specific part of this [data science] workflow that you think Ploomber is better for?
36:19 - Q4: Is Ploomber a hobby or full time for you?
39:05 - Q5: Can the input be a Jupyter file or does it have to be a .py file?
40:06 - Q6: Do you imagine the input script could be a SQL script or something else in the future?
42:12 - Q7: Is there a way to specify software (Matlab, etc) in the pipeline?
43:00 - Questions end
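For anyone skimming the index, here is a rough sketch of the code-review problem (3:28) and why the scripts-as-notebooks approach (6:43) helps. It is not taken from the talk: it uses only the Python standard library, the notebook dict is a simplified nbformat-4-style structure, and the helper names `notebook_json` and `changed_lines` are made up for illustration. The idea is that a one-line code edit touches one line of a .py file, but in an .ipynb file it also changes the stored execution count and output.

```python
import difflib
import json


def notebook_json(source_line, execution_count, output_text):
    """Build a simplified nbformat-4-style notebook with one code cell.

    Hypothetical helper for illustration; real notebooks carry more metadata.
    """
    notebook = {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": execution_count,
                "metadata": {},
                "outputs": [
                    {
                        "name": "stdout",
                        "output_type": "stream",
                        "text": [output_text],
                    }
                ],
                "source": [source_line],
            }
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }
    return json.dumps(notebook, indent=1, sort_keys=True).splitlines()


def changed_lines(before, after):
    """Count added and removed lines in a unified diff, skipping file headers."""
    diff = difflib.unified_diff(before, after, lineterm="")
    return sum(
        1
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )


# The same edit, stored two ways: as a plain script and as a notebook file.
script_changes = changed_lines(["print(x)"], ["print(x + 1)"])
notebook_changes = changed_lines(
    notebook_json("print(x)", 1, "1\n"),
    notebook_json("print(x + 1)", 2, "2\n"),
)

print("changed lines in .py diff:   ", script_changes)
print("changed lines in .ipynb diff:", notebook_changes)
```

The notebook diff is larger because the reviewer also sees execution-count and output changes mixed in with the code change. Keeping the source as a .py file, as the talk suggests with jupytext, leaves only the code change in the diff.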