NAL-i5K / general_issues


New organism: Photinus pyralis #159

Closed: mpoelchau closed this issue 3 years ago

mpoelchau commented 3 years ago

NCBI RefSeq # for assembly: GCF_008802855.1

See https://gitlab.com/i5k_Workspace/workspace_roadmap/-/wikis/Adding-an-organism-CWL-update for a full description of each task (requires a GitLab login). We can use the genomics-workspace CWL workflow now, but it may need some refinement.

i5k-stage

Commands

Loading from /usr/local/i5k/media/blast/db/
(i5k) [i5k@i5k-stage-node1 ~]$ python manage.py addblast Photinus pyralis -t nucleotide Genome Assembly -f /usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_genomic.fna -d Photinus pyralis genome assembly, Ppyr1.3
(i5k) [i5k@i5k-stage-node1 ~]$ python manage.py addblast Photinus pyralis -t nucleotide Transcript -f /usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_rna_from_genomic.fna -d Photinus pyralis NCBI Annotation release 100, transcripts
(i5k) [i5k@i5k-stage-node1 ~]$ python manage.py addblast Photinus pyralis -t peptide Protein -f /usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_translated_cds.faa -d Photinus pyralis NCBI Annotation release 100, translated CDS
(i5k) [i5k@i5k-stage-node1 ~]$ python manage.py addblast Photinus pyralis -t nucleotide Transcript -f /usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_cds_from_genomic.fna -d Photinus pyralis NCBI Annotation release 100, CDS

Make the BLAST DBs and populate them. Only the commands for the genome are listed here.
(i5k) [i5k@i5k-stage-node1 ~]$ python manage.py blast_utility /usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_genomic.fna -m
(i5k) [i5k@i5k-stage-node1 ~]$ python manage.py blast_utility /usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_genomic.fna -p

Make visible
(i5k) [i5k@i5k-stage-node1 ~]$ python manage.py blast_shown /usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_genomic.fna --shown true

You need quotes around the description; otherwise spaces are ignored when populating hmmer_hmmerdb.
(i5k) [i5k@i5k-stage-node1 ~]$ python manage.py addhmmer Photinus pyralis -f /usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_genomic.fna -d "Photinus pyralis genome assembly, Ppyr 1.3"
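One plausible reason the unquoted description loses its spaces is that the shell splits it into separate arguments before manage.py ever sees it. How addhmmer rejoins those pieces is not shown in this thread, so that part is an assumption. The sketch below only illustrates the splitting step, using Python's `shlex.split`, which follows POSIX shell word-splitting rules. `description_tokens` is a hypothetical helper written for this illustration.

```python
import shlex

# Command lines modeled on the addhmmer call above, without and with quotes.
FASTA = "/usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_genomic.fna"
DESCRIPTION = "Photinus pyralis genome assembly, Ppyr 1.3"
unquoted = f"python manage.py addhmmer Photinus pyralis -f {FASTA} -d {DESCRIPTION}"
quoted = f'python manage.py addhmmer Photinus pyralis -f {FASTA} -d "{DESCRIPTION}"'


def description_tokens(command_line):
    """Return the argv tokens that follow -d, split the way a POSIX shell would."""
    argv = shlex.split(command_line)
    return argv[argv.index("-d") + 1:]


# Unquoted: the description arrives as six separate arguments.
print(description_tokens(unquoted))
# Quoted: the description arrives as a single argument with its spaces intact.
print(description_tokens(quoted))
```

With quotes, the command receives the description as one string, so nothing downstream has to guess where the spaces were.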

Adding to jbrowse
(i5k) [i5k@i5k-stage-node1 ~]$ time python manage.py addjbrowse /usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_genomic.fna https://apollo.nal.usda.gov/apollo/Photinus%20pyralis/jbrowse/
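Only the genome commands are listed for the make/populate/show steps, but the same three commands apply to each FASTA file that was loaded with addblast. A minimal sketch that prints the full set as a dry run, without executing anything, might look like this. The helper names `blast_steps` and `dry_run` are hypothetical. The file names and the DB directory are taken from the commands above.

```python
import shlex

# File names taken from the staging addblast commands above.
FASTA_FILES = [
    "GCF_008802855.1_Ppyr1.3_genomic.fna",
    "GCF_008802855.1_Ppyr1.3_rna_from_genomic.fna",
    "GCF_008802855.1_Ppyr1.3_translated_cds.faa",
    "GCF_008802855.1_Ppyr1.3_cds_from_genomic.fna",
]
DB_DIR = "/usr/local/i5k/media/blast/db/"


def blast_steps(fasta_path):
    """Build the three per-file commands: make the DB, populate it, mark it shown."""
    return [
        ["python", "manage.py", "blast_utility", fasta_path, "-m"],
        ["python", "manage.py", "blast_utility", fasta_path, "-p"],
        ["python", "manage.py", "blast_shown", fasta_path, "--shown", "true"],
    ]


def dry_run(fasta_files, db_dir=DB_DIR):
    """Return the command lines as strings; nothing is executed."""
    lines = []
    for name in fasta_files:
        for argv in blast_steps(db_dir + name):
            lines.append(" ".join(shlex.quote(arg) for arg in argv))
    return lines


for line in dry_run(FASTA_FILES):
    print(line)
```

Printing the commands first makes it easy to review them before pasting them into the i5k shell.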

i5k production

mpoelchau commented 3 years ago

Slight updates on the genomics-workspace commands used for prod (see also https://github.com/NAL-i5K/genomics-workspace/issues/321):

python manage.py addorganism Photinus pyralis

#Note - DON'T use the full path of the fasta file, just the file name, and run from within the manage.py directory. (It would be nice to make this less brittle: jbrowse derives the link from the title, which is copied from the file path. It would make more sense to take the file path and strip the directory, right?)
python manage.py addblast Photinus pyralis -t nucleotide Genome Assembly -f GCF_008802855.1_Ppyr1.3_genomic.fna -d 'Photinus pyralis genome assembly, Ppyr1.3'

python manage.py addblast Photinus pyralis -t nucleotide Transcript -f GCF_008802855.1_Ppyr1.3_rna_from_genomic-idupdate.fna -d Photinus pyralis NCBI Annotation release 100, transcripts

python manage.py addblast Photinus pyralis -t peptide Protein -f GCF_008802855.1_Ppyr1.3_translated_cds-idupdate.faa -d Photinus pyralis NCBI Annotation release 100, translated CDS 

python manage.py addblast Photinus pyralis -t nucleotide Transcript -f GCF_008802855.1_Ppyr1.3_cds_from_genomic-idupdate.fna -d Photinus pyralis NCBI Annotation release 100, CDS

#Make blast DBs, populate, and make visible
python manage.py blast_utility GCF_008802855.1_Ppyr1.3_genomic.fna -m 
python manage.py blast_utility GCF_008802855.1_Ppyr1.3_genomic.fna -p
python manage.py blast_shown GCF_008802855.1_Ppyr1.3_genomic.fna --shown true
python manage.py addjbrowse GCF_008802855.1_Ppyr1.3_genomic.fna https://apollo.nal.usda.gov/apollo/Photinus_pyralis/jbrowse/

python manage.py blast_utility /usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_rna_from_genomic-idupdate.fna -m 
python manage.py blast_utility /usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_rna_from_genomic-idupdate.fna -p
python manage.py blast_shown /usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_rna_from_genomic-idupdate.fna --shown true

python manage.py blast_utility /usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_translated_cds-idupdate.faa -m 
python manage.py blast_utility /usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_translated_cds-idupdate.faa -p
python manage.py blast_shown /usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_translated_cds-idupdate.faa --shown true

python manage.py blast_utility GCF_008802855.1_Ppyr1.3_cds_from_genomic-idupdate.fna -m 
python manage.py blast_utility GCF_008802855.1_Ppyr1.3_cds_from_genomic-idupdate.fna -p
python manage.py blast_shown GCF_008802855.1_Ppyr1.3_cds_from_genomic-idupdate.fna --shown true

#hmmer needs the protein fasta
python manage.py addhmmer Photinus pyralis -f GCF_008802855.1_Ppyr1.3_translated_cds-idupdate.faa -d 'Photinus pyralis NCBI Annotation release 100, translated CDS'
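On the note above about not using the full path: a minimal sketch of the suggested fix, stripping the directory before the title is derived, could look like the following. `title_from_path` is a hypothetical helper, not a function in genomics-workspace, and how the title is actually built there is not shown in this thread. This only illustrates the stripping step.

```python
import posixpath


def title_from_path(fasta_path):
    """Hypothetical helper: drop the directory so only the file name remains."""
    return posixpath.basename(fasta_path)


full_path = "/usr/local/i5k/media/blast/db/GCF_008802855.1_Ppyr1.3_genomic.fna"
bare_name = "GCF_008802855.1_Ppyr1.3_genomic.fna"

# Both spellings of the same file yield the same title.
print(title_from_path(full_path))
print(title_from_path(bare_name))
```

With something like this in place, passing either the full path or the bare file name would give the same title, so the jbrowse link would no longer depend on how the command was typed.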