ploomber / soorgeon

Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊
https://ploomber.io
Apache License 2.0

soorgeon cut #48

Open edublancas opened 2 years ago

edublancas commented 2 years ago

We should add a command that cuts existing tasks into smaller pieces so users can take advantage of incremental builds.

soorgeon cut {task-name}

For example, if a user has a pipeline.yaml:

# pipeline.yaml
tasks:
  - source: script.py
    product: report.html

We could run soorgeon refactor on script.py (calling the Python API directly rather than the CLI) to break it into smaller pieces, then update pipeline.yaml so it points to the new tasks:

# pipeline.yaml
tasks:
  - source: load.py
    product: load.html

  - source: clean.py
    product: clean.html

soorgeon already contains all the pieces needed to perform the refactoring, so it's mostly a matter of experimenting a bit. The motivation is that notebooks tend to grow over time, so an easy way to refactor them would be valuable.
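
A rough sketch of the pipeline.yaml manipulation step, to make the idea concrete. It covers only the spec rewrite, not the refactoring itself: it works on the already-parsed spec (a plain dict, as a YAML library would return) and swaps one task for several smaller ones. The function name `cut_task` and the `piece_names` argument are hypothetical and are not part of soorgeon; in practice the piece names would come from whatever the refactoring step produces. It also assumes that each task's product is a single string and that a task without an explicit name is identified by its source filename without the extension.

```python
from pathlib import PurePosixPath


def cut_task(spec, task_name, piece_names):
    """Return a new spec where `task_name` is replaced by one task per piece.

    Hypothetical helper (not part of soorgeon). Assumes string products and
    that unnamed tasks are identified by their source file's stem.
    """
    new_tasks = []
    found = False

    for task in spec["tasks"]:
        # Fall back to the source stem when the task has no explicit name
        name = task.get("name", PurePosixPath(task["source"]).stem)

        if name != task_name:
            new_tasks.append(dict(task))
            continue

        found = True
        # Reuse the original extensions, e.g. ".py" and ".html"
        source_ext = PurePosixPath(task["source"]).suffix
        product_ext = PurePosixPath(task["product"]).suffix

        for piece in piece_names:
            new_tasks.append(
                {
                    "source": f"{piece}{source_ext}",
                    "product": f"{piece}{product_ext}",
                }
            )

    if not found:
        raise ValueError(f"No task named {task_name!r} in the pipeline spec")

    # Copy the top-level keys so the input spec is left untouched
    return {**spec, "tasks": new_tasks}


spec = {"tasks": [{"source": "script.py", "product": "report.html"}]}
new_spec = cut_task(spec, "script", ["load", "clean"])
print(new_spec)
```

Returning a new dict instead of editing in place means the original spec stays available if the refactoring fails partway and pipeline.yaml should not be overwritten.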