ga4gh / fasp-scripts

Apache License 2.0

Review which scripts would make more sense as Jupyter notebooks #7

Closed ianfore closed 3 years ago

ianfore commented 3 years ago

Possibly all of them?

ianfore commented 3 years ago

Working on this. In general it seems to make sense that the kind of task we're trying to do in a FASPScript is best done in a notebook.

Tried this out and determined some necessary revisions to FASPRunner.

It also highlighted that the WES clients written to date are not sufficiently general purpose. Generality was never the goal: the scripts were intended for demonstration and proof of concept. To maintain generality, a script writer would likely have to formulate WES payloads in the script or notebook. A full WES Python client that exposes the whole WES model is conceivable and possibly useful, but it is beyond the scope of what I want to attempt now.

Added runGenericWorkflow() to WESClient to enable running any WES workflow payload that can be formulated in a script.
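
As a rough illustration of what "formulating a WES payload in a script" can look like, here is a minimal sketch. It assumes the form fields defined by the GA4GH WES 1.0 RunWorkflow operation (`workflow_url`, `workflow_type`, `workflow_type_version`, `workflow_params`). The helper name `build_wes_run_request`, the server URL, and the workflow values are hypothetical, and this is not the actual signature of runGenericWorkflow().

```python
import json


def build_wes_run_request(base_url, workflow_url, workflow_type,
                          workflow_type_version, params):
    """Return (url, form_fields) for a WES RunWorkflow request.

    Hypothetical helper for illustration; not part of fasp-scripts.
    """
    # WES exposes run submission as POST <base>/runs
    url = base_url.rstrip("/") + "/runs"
    fields = {
        "workflow_url": workflow_url,
        "workflow_type": workflow_type,
        "workflow_type_version": workflow_type_version,
        # WES expects the workflow parameters as a JSON-encoded string
        "workflow_params": json.dumps(params),
    }
    return url, fields


# Placeholder values; a real script would fill these from its own query results
url, fields = build_wes_run_request(
    "https://wes.example.org/ga4gh/wes/v1/",
    "https://example.org/workflows/md5sum.cwl",
    "CWL",
    "v1.0",
    {"input_file": {"class": "File", "path": "drs://example.org/abc123"}},
)
```

The resulting `url` and `fields` could then be sent as a multipart form POST with whatever HTTP client and authentication the target WES server requires.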

Running from a notebook also affects logging of runs. FASPRunner keeps a local log of tasks that have been submitted to multiple WES servers, and it records the name of the script being run. To date that name has been gathered automatically via the inspect module. Notebooks are run via IPython, and the name of the notebook being run doesn't seem to be available from inspect, nor does there seem to be another way of getting it. The fallback is to supply the program name manually; FASPRunner.init was modified to take it as a parameter. The logged name is then only as accurate as the scripter's effort to keep that value in sync with the name of the notebook.
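
The fallback described above can be sketched roughly as follows. This is an illustration under assumptions, not the FASPRunner code: the function name `resolve_program_name` and its parameter are hypothetical. It prefers an explicitly supplied name and otherwise falls back to what the inspect module reports for the caller.

```python
import inspect
import os


def resolve_program_name(program=None):
    """Return the name to record in the run log.

    Hypothetical helper for illustration; not part of fasp-scripts.
    """
    if program:
        # Explicit name supplied by the scripter (the notebook case)
        return program
    # Otherwise use the file name of the calling frame. In a plain script
    # this is the script file; under IPython it is generally an internal
    # cell identifier rather than the notebook's file name.
    caller = inspect.stack()[1]
    return os.path.basename(caller.filename)


# In a notebook the name has to be passed in and kept in sync by hand
logged_name = resolve_program_name("FASPNotebook10.ipynb")
```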

With these changes FASPRunner can be configured and run effectively from a notebook. The working example in which this was validated is an adaptation of FASPScript10 as FASPNotebook10.

Keeping this issue open for now. Review of which scripts it makes sense to replace, or duplicate, as notebooks is on the back burner.

One issue is that more complex scripts were not able to use FASPRunner in its current form. While FASPRunner can federate DRS calls via DRSMetaresolver, federating Search and WES calls is done within the script itself. Whether to add that capability to FASPRunner or leave it to the scripts/notebooks is an open question.

ianfore commented 3 years ago

Work to date is part of the linked pull request.

Some additional notebooks have been added. A couple of disadvantages of notebooks

There are many other benefits to notebooks. Also others may know how to address both of the above.

ianfore commented 3 years ago

The pull request dealt with the basics and is sufficient ahead of the January 2021 Hackathon. There remain scripts that should be reviewed for retirement and/or migration to notebooks.