NAL-i5K / Organism_Onboarding

A workflow to make the organism onboarding pipeline easy to handle as an I/O pipeline

discuss workflow for processing NCBI Refseq annotations for chado ingest #88

Closed by mpoelchau 2 years ago

mpoelchau commented 4 years ago

We need to set up (yet another) workflow to process gene sets from NCBI's eukaryotic annotation pipeline for Chado import. This will be a standalone workflow that runs after final_workflow.cwl and before the 'setup production' workflow. We may eventually chain all of these together, but not for now.

The workflow is currently a Perl wrapper around some custom Perl scripts. We can discuss rewriting these scripts in Python, but since they are still in development I think Perl is preferable for now. https://gitlab.com/i5k_Workspace/monicas-data-processing-scripts/blob/master/process_NCBI_RefSeq_annotations_for_chado_ingest.pl

I'm also willing to discuss whether this should be a CWL workflow at all at this point: since the scripts are still under development, CWL may be overkill.

mpoelchau commented 2 years ago

This is off the table now.