hoelzer / mgnify-lr

Evaluation of long-read support for the MGnify pipeline
GNU General Public License v3.0

output ENA manifest file for assembly upload #8

Closed hoelzer closed 4 years ago

hoelzer commented 4 years ago
STUDY   SRP189971_1586f07841cbf8f05e203684ee3b981b - this is an alias for the study
SAMPLE  SRS4558644 - ENA sample number
RUN_REF SRR8822472 - ENA run number
ASSEMBLYNAME    SRR8822472_964d7e942dcb3e9e47d268cb193b38f6 - this is an alias for the assembly
ASSEMBLY_TYPE   primary metagenome - assembly type. I think yours are primary metagenome as well, unless you performed any binning/dereplication on the file to be uploaded.
COVERAGE        85
PROGRAM metaspadesv3.13.0 - assembler 
PLATFORM        Illumina HiSeq 2500 - sequencing platform 
FASTA   /hps/nobackup2/production/metagenomics/results/assemblies/SRP1899/SRP189971/SRR8822/SRR8822472/metaspades/001/SRR8822472.fasta.gz - full path to assembly
DESCRIPTION    some description
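
For reference, writing such a manifest from a pipeline could look roughly like the sketch below. The field names and their order are taken from the example above; the function name `write_manifest`, the tab separator, and the output file name are assumptions for illustration, not something the upload tooling is confirmed to require.

```python
from pathlib import Path

# Field names and order as listed in the example manifest above.
FIELDS = [
    "STUDY", "SAMPLE", "RUN_REF", "ASSEMBLYNAME", "ASSEMBLY_TYPE",
    "COVERAGE", "PROGRAM", "PLATFORM", "FASTA", "DESCRIPTION",
]


def write_manifest(values, out_path):
    """Write one 'FIELD<TAB>value' line per manifest field (hypothetical helper).

    `values` maps each field name to its value. The tab separator is an
    assumption; the example above only shows whitespace between columns.
    """
    missing = [field for field in FIELDS if field not in values]
    if missing:
        raise ValueError("missing manifest fields: " + ", ".join(missing))
    lines = [f"{field}\t{values[field]}" for field in FIELDS]
    Path(out_path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return lines


if __name__ == "__main__":
    # Values copied from the example in this issue; the output name is made up.
    write_manifest(
        {
            "STUDY": "SRP189971_1586f07841cbf8f05e203684ee3b981b",
            "SAMPLE": "SRS4558644",
            "RUN_REF": "SRR8822472",
            "ASSEMBLYNAME": "SRR8822472_964d7e942dcb3e9e47d268cb193b38f6",
            "ASSEMBLY_TYPE": "primary metagenome",
            "COVERAGE": 85,
            "PROGRAM": "metaspadesv3.13.0",
            "PLATFORM": "Illumina HiSeq 2500",
            "FASTA": "/hps/nobackup2/production/metagenomics/results/assemblies/SRP1899/SRP189971/SRR8822/SRR8822472/metaspades/001/SRR8822472.fasta.gz",
            "DESCRIPTION": "some description",
        },
        "SRR8822472.manifest",
    )
```

The inline notes after the ` - ` in the example are explanations only and are left out of the written file.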
hoelzer commented 4 years ago

FYI, we have a (somewhat messy atm) script that generates these files for us: /nfs/production/metagenomics/production/production-scripts/TPA-uploads/manifest.py. Have a look if you want to see how we generate aliases (we use combined md5s), but the alias can be anything you want, since it is only temporary.

hoelzer commented 4 years ago
projectAliasCode = catmd5(projectID, runfilesPath, outPath)
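
The internals of `catmd5` are not shown here, so the following is only a guess at what a "combined md5" alias could look like: one md5 digest computed over the contents of all run files, appended to the accession. The function name `combined_md5_alias`, its parameters, and the choice to hash file contents in sorted path order are all assumptions; it does not reproduce `manifest.py`.

```python
import hashlib


def combined_md5_alias(accession, file_paths, chunk_size=1 << 20):
    """Return '<accession>_<md5>' with a single md5 over all files (hypothetical).

    Paths are sorted first so the alias does not depend on the order in
    which the files are passed in.
    """
    md5 = hashlib.md5()
    for path in sorted(file_paths):
        with open(path, "rb") as handle:
            # Read in chunks so large read files are not loaded into memory at once.
            for chunk in iter(lambda: handle.read(chunk_size), b""):
                md5.update(chunk)
    return f"{accession}_{md5.hexdigest()}"
```

This yields aliases of the same shape as `SRP189971_1586f07841cbf8f05e203684ee3b981b` (accession, underscore, 32 hex characters). Since the alias is only temporary, any other scheme that produces a unique string would do as well.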