nsheff closed this issue 2 years ago
Actually we're already close with `eido inspect -n`: http://eido.databio.org/en/latest/cli/
eido inspect pep_bio.yaml -n sample1
Sample 'sample1' in Project (/home/nsheff/code/incubator/learn_cwl/cwl-pep/bioinformatics_demo/pep_bio.yaml)
sample_name: sample1
protocol: RNA-seq
organism: human
read1: data/sample1_1.fq.gz
read2: data/sample1_2.fq.gz
Index: refgenie://t7/bwa_index
pipeline_interfaces: bwa_cwl_interface.yaml
InputFile1: data/sample1_1.fq.gz
InputFile2: data/sample1_2.fq.gz
genome: t7
ok, I think we should add this capability to the PEP framework then.
I'm just not sure if eido is the right place -- it would enable people to get the processed PEP in Python or on the command line. What about R? Maybe it would make sense to implement this in peppy and pepr and use peppy's Python API in eido to provide this via CLI?
> Maybe it would make sense to implement this in peppy and pepr

Yes, that's the alternative option. I think this is a standalone enough function... for now I'd rather only implement it once. It's like validation: we don't implement it in R -- you'd use eido to validate. And in this case, the point is to filter and then use the result for something downstream, regardless of language, so there's no need to implement it in 2 languages. You'd use the output as input via streams or files.
So, that argues for putting it into eido -- or in something else that's python-only outside of peppy. Or at least, just not in pepr.
Maybe this is a new `pepfilters` package.
Some decisions:
Name: pepconvert
This is now implemented as `eido convert`, with all the functionality envisioned in `pepconvert`. It is awesome. But there are 2 limitations:
We could change the required plugin function signature from `plugin(peppy.Project)` to `plugin(peppy.Project, **kwargs)` and add an optional `-a`/`--args` argument to the `eido convert` command. It could accept a string of this format: `--args arg1=value1 arg2=value2`. We could parse that, store the pairs in a `dict`, and unpack it in the `plugin()` call.
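For illustration, here is a minimal sketch of that proposal: collect `key=value` tokens after `-a`/`--args`, parse them into a `dict`, and unpack the `dict` into the plugin call. This is not eido's actual code. `build_parser`, `parse_plugin_args`, and `example_plugin` are hypothetical names, and a plain string stands in for the `peppy.Project` object so the snippet runs without peppy installed.

```python
import argparse


def build_parser():
    """Hypothetical stand-in for the `eido convert` parser; only --args is modeled."""
    parser = argparse.ArgumentParser(prog="convert-sketch")
    parser.add_argument("-a", "--args", nargs="+", default=[], metavar="KEY=VALUE")
    return parser


def parse_plugin_args(pairs):
    """Turn ["arg1=value1", "arg2=value2"] into {"arg1": "value1", "arg2": "value2"}."""
    kwargs = {}
    for pair in pairs:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        kwargs[key] = value
    return kwargs


def example_plugin(project, **kwargs):
    """Hypothetical plugin with the proposed plugin(project, **kwargs) signature."""
    return {"project": project, "options": kwargs}


if __name__ == "__main__":
    ns = build_parser().parse_args(["--args", "arg1=value1", "arg2=value2"])
    # Unpack the parsed dict into the plugin call, as suggested above.
    print(example_plugin("my_project", **parse_plugin_args(ns.args)))
```

All values arrive as strings in this sketch, so a plugin that wants numbers or booleans would need to convert them itself.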
I'm happy with that approach.
ok, the kwargs support is implemented. Now we just need to update the filter functions to make use of this feature.
This feature is now relatively complete and functional in version 1.6.0 of eido, with release pending, so I'm closing this issue.
Today, while I was talking with some nf-core Nextflow developers, it came up that it would be useful to be able to output a processed PEP, either in CSV format or in YAML/JSON format.
So, think of it as a PEP (yaml+csv) -> YAML converter... it's kind of a "filter" that would read the PEP and output it in the other format. This is basically what looper does when it creates the sample yaml files, which can be modulated with looper plugins. The difference here I guess is that we don't need all the rest of the looper capability -- just the printing of sample yaml files, perhaps all in one file. We need just some command-line tool that would output the PEP in YAML format.
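A rough sketch of the "filter" idea, under heavy assumptions: it reads only a CSV sample table and writes every sample into a single document. It skips everything that makes a PEP "processed" (the project config, sample modifiers, and so on, which peppy would handle), and it emits JSON using only the standard library rather than YAML. `sample_table_to_json` is a hypothetical name, not part of eido or peppy.

```python
import csv
import io
import json


def sample_table_to_json(csv_text):
    """Read a CSV sample table and return all samples as one JSON document."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps({"samples": rows}, indent=2)


if __name__ == "__main__":
    table = (
        "sample_name,protocol,organism\n"
        "sample1,RNA-seq,human\n"
    )
    # Writing to stdout lets a downstream tool consume the result via a pipe or a file.
    print(sample_table_to_json(table))
```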
I think this might make sense to have as part of eido, since it already provides a command-line interface... In fact, we could go as far as extracting looper's sample-writing capabilities and putting them into eido. In that case, the plugin system may actually be useful here.
@stolarczyk thoughts?