jupyter-naas / naas

Low-code Python library to safely use notebooks in production: schedule workflows, generate assets, trigger webhooks, send notifications, build pipelines, manage secrets (Cloud-only)
https://app.naas.ai/
GNU Affero General Public License v3.0
283 stars, 25 forks

feat: Add logic to be able to create pipelines on naas. #358

Closed Dr0p42 closed 1 year ago

Dr0p42 commented 1 year ago

This pull request resolves: https://github.com/jupyter-naas/naas/issues/344

It adds logic to easily create pipelines on naas.

It's that easy:

from naas.pipelines.pipelines import Pipeline, DummyStep, DummyErrorStep, NotebookStep, End, ParallelStep

pipeline = Pipeline()

success_notebook = NotebookStep('Success notebook', 'success_notebook.ipynb')

failing_notebook = NotebookStep('Failing notebook', 'failing_notebook.ipynb')

pipeline >> failing_notebook >> success_notebook >> End()

failing_notebook.on_error >> DummyStep('My Alert') >> DummyStep('Cleaning step')

pipeline.run()

This will give you the following:

image

And after running it:

image

We can also run it with pipeline.run('progress'), which shows something like this:

image

In this example we have a notebook failing on purpose to trigger the on_error step.
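For readers wondering how the `>>` chaining and the `on_error` branch can work in plain Python, here is a minimal sketch. It is not the code from this PR: `Node`, `ToyStep` and `ToyPipeline` are hypothetical names, and the rule that a failure abandons the main chain and follows the error branch is an assumption made for illustration.

```python
class Node:
    """Anything that can sit on the left of `>>` (hypothetical, not a naas class)."""

    def __init__(self):
        self.next_step = None

    def __rshift__(self, other):
        # `a >> b` records b as a's successor and returns b,
        # so `a >> b >> c` links a->b and then b->c.
        self.next_step = other
        return other


class ToyStep(Node):
    def __init__(self, name, action=None):
        super().__init__()
        self.name = name
        self.action = action if action is not None else (lambda: None)
        # Entry point of the error branch; `step.on_error >> x` hangs x on it.
        self.on_error = Node()


class ToyPipeline(Node):
    def run(self):
        executed = []
        step = self.next_step
        while step is not None:
            executed.append(step.name)
            try:
                step.action()
                step = step.next_step
            except Exception:
                # Assumption: on failure, leave the main chain and
                # continue with whatever was attached to on_error.
                step = step.on_error.next_step
        return executed


def fail():
    raise RuntimeError("failing on purpose")


pipeline = ToyPipeline()
failing = ToyStep("Failing step", fail)
success = ToyStep("Success step")

pipeline >> failing >> success
failing.on_error >> ToyStep("My Alert") >> ToyStep("Cleaning step")

executed = pipeline.run()
print(executed)  # ['Failing step', 'My Alert', 'Cleaning step']
```

Returning the right-hand operand from `__rshift__` is what lets a chain read left to right. The real steps in this PR may store and traverse their links differently.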


Some other examples:

Imagine you have a bunch of notebooks that download videos from YouTube and compute transcripts from them:

from naas.pipelines.pipelines import Pipeline, DummyStep, DummyErrorStep, NotebookStep, End, ParallelStep

pipeline = Pipeline()

download = DummyStep('Download Youtube Video')

download.on_error >> DummyStep('Alert youtube download error')

pipeline >> download >> DummyStep('Convert video to mp3') \
    >> DummyStep('Upload mp3 to s3') \
    >> DummyStep('Start Transcription') \
    >> DummyStep('Store transcript on S3') \
    >> [
        DummyStep('Notify Slack'),
        DummyStep('Notify Email')
    ] \
    >> End()

pipeline

Here I am using DummyStep to give the idea, but these could just as well be NotebookStep instances.

image

Now imagine you want to fetch a bunch of stock tickers, do some computation on them, and ultimately buy or sell. You could fetch the tickers in parallel and, at the same time, extract sentiment about the companies from Twitter.

from naas.pipelines.pipelines import Pipeline, DummyStep, DummyErrorStep, NotebookStep, End, ParallelStep

pipeline = Pipeline()

auto_buy_sell = DummyStep('Auto buy/sell stocks')

pipeline >> [
        DummyStep('Get Tesla Stock'),
        DummyStep('Get Google Stock'),
        DummyStep('Get Amazon Stock'),
        DummyStep('Get Microsoft Stock'),
        DummyStep('Get Meta Stock'),

        DummyStep('Get Tesla Sentiment from Twitter'),
        DummyStep('Get Google Sentiment from Twitter'),
        DummyStep('Get Amazon Sentiment from Twitter'),
        DummyStep('Get Microsoft Sentiment from Twitter'),
        DummyStep('Get Meta Sentiment from Twitter'),
    ] \
    >> DummyStep('Compute overall performance') \
    >> auto_buy_sell \
    >> DummyStep('Notify completion') \
    >> End()

# On error on this step we may want to alert someone to take a manual action quickly.
auto_buy_sell.on_error >> [
    DummyStep('Alert Slack'),
    DummyStep('Alert SMS'),
    DummyStep('Alert Whatsapp'),
    DummyStep('Alert PagerDuty'),
    DummyStep('Alert Email'),
]

pipeline
image
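
To illustrate how a list on either side of `>>` can express fan-out and fan-in, here is a small sketch that runs each stage concurrently with `concurrent.futures`. It is not the implementation in this PR: `ToyStep` and `run` are hypothetical, and the stage-by-stage scheduling is a simplification that assumes all parallel branches have the same depth.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyStep:
    """Hypothetical step; its default action simply returns the step name."""

    def __init__(self, name, action=None):
        self.name = name
        self.action = action if action is not None else (lambda: name)
        self.next_steps = []

    def __rshift__(self, other):
        # Handles `step >> step` and `step >> [step, step]` (fan-out).
        targets = other if isinstance(other, list) else [other]
        self.next_steps.extend(targets)
        return other

    def __rrshift__(self, others):
        # Handles `[step, step] >> step` (fan-in). Python falls back to the
        # right operand here because list does not define `>>`.
        for step in others:
            step.next_steps.append(self)
        return self


def run(first_steps):
    """Run the graph stage by stage; steps within one stage run concurrently."""
    results = []
    stage = list(first_steps)
    while stage:
        with ThreadPoolExecutor(max_workers=len(stage)) as pool:
            # map() yields results in submission order, whatever the finish order.
            results.append(list(pool.map(lambda s: s.action(), stage)))
        next_stage = []
        for step in stage:
            for nxt in step.next_steps:
                if nxt not in next_stage:  # a fan-in target is scheduled once
                    next_stage.append(nxt)
        stage = next_stage
    return results


fetchers = [ToyStep("Get Tesla Stock"), ToyStep("Get Google Stock")]
compute = ToyStep("Compute overall performance")
fetchers >> compute >> ToyStep("Notify completion")

stages = run(fetchers)
print(stages)
```

Each inner list in `stages` holds the results of one stage, so the two fetch steps appear together, followed by the compute step and then the notification.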

I hope these examples give you the idea. For now we have a handful of step types, the ones imported in the examples above: DummyStep, DummyErrorStep, NotebookStep, ParallelStep and End.

But we could definitely have more types of steps like:

jravenel commented 1 year ago

@Dr0p42 seems like the PR is not passing, did you see?

sonarcloud[bot] commented 1 year ago

SonarCloud Quality Gate failed.

Bugs: 1 (rating C)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 11 (rating A)

Coverage: no coverage information
Duplication: 0.0%

jravenel commented 1 year ago

Let's merge, @Dr0p42? SonarCloud is still not happy.