databio / peppro

A modular, containerized pipeline for PRO-seq data processing
http://peppro.databio.org/
BSD 2-Clause "Simplified" License

new way to check assets #45

Closed nsheff closed 5 years ago

nsheff commented 5 years ago

Can you see if this is OK? I want to get rid of all this code duplication.

One thing I didn't understand though is the difference between "not existing" and "missing from REFGENIE"... it seems like if it doesn't exist it would also be missing from refgenie.

nsheff commented 5 years ago

Also, this will prevent these repeated messages, which was the reason I wanted to do this in the first place:


The 'refgene_tss' asset does not exist.
Update your REFGENIE config file to include this asset, or point directly to the file using --TSS-name.

The 'ensembl_tss' asset does not exist.
Update your REFGENIE config file to include this asset, or point directly to the file using --pi-tss.

The 'ensembl_gene_body' asset does not exist.
Update your REFGENIE config file to include this asset, or point directly to the file using --pi-body.

The 'refgene_pre_mRNA' asset does not exist.
Update your REFGENIE config file to include this asset, or point directly to the file using --pre-name.

The 'feat_annotation' asset does not exist.
Update your REFGENIE config file to include this asset, or point directly to the file using --anno-name.

The 'refgene_exon' asset does not exist.
Update your REFGENIE config file to include this asset, or point directly to the file using --exon-name.

The 'refgene_intron' asset does not exist.
Update your REFGENIE config file to include this asset, or point directly to the file using --intron-name.

nsheff commented 5 years ago

The new way looks like this:

Some assets are not found. You can update your REFGENIE config file or point directly to the file using the noted command-line arguments:
  Assets not existing: refgene_tss (--TSS-name), ensembl_tss (--pi-tss), ensembl_gene_body (--pi-body), refgene_pre_mRNA (--pre-name), feat_annotation (--anno-name), refgene_exon (--exon-name), refgene_intron (--intron-name)
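
A minimal sketch of how a consolidated check along these lines could be written. The `assets` mapping, its `(path, flag)` shape, and the `report_missing_assets` name are hypothetical and chosen for illustration; this is not the pipeline's actual code:

```python
import os


def report_missing_assets(assets):
    """Build one message covering every asset whose file cannot be found.

    `assets` is a hypothetical mapping of asset name -> (path or None, CLI flag).
    Returns None when nothing is missing.
    """
    missing = [
        f"{name} ({flag})"
        for name, (path, flag) in assets.items()
        if path is None or not os.path.isfile(path)
    ]
    if not missing:
        return None
    return (
        "Some assets are not found. You can update your REFGENIE config file "
        "or point directly to the file using the noted command-line arguments:\n"
        "  Assets not existing: " + ", ".join(missing)
    )


# Example with two unresolved assets (a path of None means it could not be resolved).
assets = {
    "refgene_tss": (None, "--TSS-name"),
    "ensembl_tss": (None, "--pi-tss"),
}
message = report_missing_assets(assets)
if message:
    print(message)
```

Collecting the names first and printing once is what removes the repeated per-asset messages.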

nsheff commented 5 years ago

Also, can we clarify whether these are optional? It's a little confusing because the pipeline completed anyway despite all these errors.

jpsmith5 commented 5 years ago

To make sure I follow here: you've already adjusted it to report any missing annotation files as a single message listing all missing items and their requisite command-line arguments, yes?

If that's true, yes I like that and agree on the improvement.

So, I haven't made those things required where they are primarily involved in QC measures. It's very useful information, but I wasn't clear on how strictly those should be required, since they wouldn't affect the files someone would use downstream (i.e. signal tracks).
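
One possible way to make the optional/required distinction explicit is to warn about missing optional assets and fail only on missing required ones. This is a sketch under assumptions: `check_assets`, its arguments, and the message wording are hypothetical, and which assets count as required is left to the caller:

```python
import warnings


def check_assets(missing_required, missing_optional):
    """Warn about missing optional (QC-only) assets; fail on missing required ones.

    Both arguments are lists of asset names (hypothetical inputs for illustration).
    """
    if missing_optional:
        # Optional assets only feed QC measures, so the run can continue.
        warnings.warn(
            "Optional QC assets not found: " + ", ".join(missing_optional)
        )
    if missing_required:
        # Required assets affect downstream output files, so stop here.
        raise FileNotFoundError(
            "Required assets not found: " + ", ".join(missing_required)
        )


# Only optional assets are missing here, so this warns and continues.
check_assets([], ["refgene_tss", "ensembl_tss"])
```

Labeling the message as optional would also address the confusion about the pipeline completing despite the reported problems.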

nsheff commented 5 years ago

> To make sure I follow here: you've already adjusted it to report any missing annotation files as a single message listing all missing items and their requisite command-line arguments, yes?

yes.