NAL-i5K / general_issues

for issues and discussions not tied to a specific repository

Genome assembly update: Ladona fulva #162

Closed mpoelchau closed 2 years ago

mpoelchau commented 3 years ago

Manual annotations

OGS remapping

Data management

Content deletion and addition steps

Data location:

mpoelchau commented 3 years ago

@suryasaha the localdata branch of the Organism_onboarding repo now works for files that aren't retrieved from a URL. I need to add documentation, but you can use the file final_workflow_short.yml as a template. It adds the full path for the genome, gff, and other fasta files.

mpoelchau commented 3 years ago

@suryasaha can you remind me where this is at?

suryasaha commented 2 years ago

Add fasta

(i5k) [i5k@i5k-stage-node1 ~]$ python manage.py addblast Ladona fulva -t nucleotide Genome Assembly -f /usr/local/i5k/media/blast/db/GCA_000376725.2_Lful_2.0_genomic.fna -d 'Ladona fulva genome assembly, Lful_2.0'
you can move to makeblastdb and populate sequence step
(i5k) [i5k@i5k-stage-node1 ~]$ python manage.py addblast Ladona fulva -t nucleotide Transcript -f /usr/local/i5k/media/blast/db/ladful_OGSv1.0_trans.fa -d 'Ladona fulva Official Gene Set ladful OGSv1.0, transcripts'
you can move to makeblastdb and populate sequence step
(i5k) [i5k@i5k-stage-node1 ~]$ python manage.py addblast Ladona fulva -t peptide Protein -f /usr/local/i5k/media/blast/db/ladful_OGSv1.0_pep.fa -d 'Ladona fulva Official Gene Set ladful OGSv1.0, translated CDS'
you can move to makeblastdb and populate sequence step
(i5k) [i5k@i5k-stage-node1 ~]$ python manage.py addblast Ladona fulva -t peptide Protein -f /usr/local/i5k/media/blast/db/ladful_OGSv1.0_cds.fa -d 'Ladona fulva Official Gene Set ladful OGSv1.0, CDS'
you can move to makeblastdb and populate sequence step

Make blast DBs and populate

python manage.py blast_utility /usr/local/i5k/media/blast/db/GCA_000376725.2_Lful_2.0_genomic.fna -m
python manage.py blast_utility /usr/local/i5k/media/blast/db/GCA_000376725.2_Lful_2.0_genomic.fna -p

python manage.py blast_utility /usr/local/i5k/media/blast/db/ladful_OGSv1.0_trans.fa -m
python manage.py blast_utility /usr/local/i5k/media/blast/db/ladful_OGSv1.0_trans.fa -p

python manage.py blast_utility /usr/local/i5k/media/blast/db/ladful_OGSv1.0_pep.fa -m
python manage.py blast_utility /usr/local/i5k/media/blast/db/ladful_OGSv1.0_pep.fa -p

python manage.py blast_utility /usr/local/i5k/media/blast/db/ladful_OGSv1.0_cds.fa -m
python manage.py blast_utility /usr/local/i5k/media/blast/db/ladful_OGSv1.0_cds.fa -p

Make visible

python manage.py blast_shown /usr/local/i5k/media/blast/db/GCA_000376725.2_Lful_2.0_genomic.fna --shown true

python manage.py blast_shown /usr/local/i5k/media/blast/db/ladful_OGSv1.0_trans.fa --shown true

python manage.py blast_shown /usr/local/i5k/media/blast/db/ladful_OGSv1.0_pep.fa --shown true

python manage.py blast_shown /usr/local/i5k/media/blast/db/ladful_OGSv1.0_cds.fa --shown true
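The build, populate, and make-visible steps above repeat the same three commands for each of the four FASTA files. A minimal Python sketch that generates those command lines could look like the following. It assumes only the `blast_utility` and `blast_shown` subcommands and flags exactly as typed in this comment; the helper name `blast_commands` is hypothetical, and the script prints the commands rather than running them.

```python
import shlex

# Directory and file names copied from the commands in this comment.
DB_DIR = "/usr/local/i5k/media/blast/db"
FASTA_FILES = [
    "GCA_000376725.2_Lful_2.0_genomic.fna",
    "ladful_OGSv1.0_trans.fa",
    "ladful_OGSv1.0_pep.fa",
    "ladful_OGSv1.0_cds.fa",
]


def blast_commands(fasta_name, db_dir=DB_DIR):
    """Return the three manage.py invocations for one FASTA file.

    Hypothetical helper: it only builds argument lists, it does not run them.
    """
    path = f"{db_dir}/{fasta_name}"
    return [
        ["python", "manage.py", "blast_utility", path, "-m"],  # make the BLAST db
        ["python", "manage.py", "blast_utility", path, "-p"],  # populate sequences
        ["python", "manage.py", "blast_shown", path, "--shown", "true"],  # make visible
    ]


if __name__ == "__main__":
    # Dry run: print each command so it can be reviewed before pasting.
    for name in FASTA_FILES:
        for cmd in blast_commands(name):
            print(shlex.join(cmd))
```

Printing instead of executing keeps the sketch safe to run anywhere; the output can be compared against the commands above before anything is run on stage.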

Add genome assembly fasta file to jbrowse

(i5k) [i5k@i5k-stage-node1 ~]$ python manage.py addjbrowse /usr/local/i5k/media/blast/db/GCA_000376725.2_Lful_2.0_genomic.fna https://apollo.nal.usda.gov/apollo/Ladona%20fulva/jbrowse/
JBrowseSetting Successfully Created.

Set up hmmer for peptide fasta file:

(i5k) [i5k@i5k-stage-node1 ~]$ python manage.py addhmmer Ladona fulva -f /usr/local/i5k/media/blast/db/ladful_OGSv1.0_pep.fa -d "Ladona fulva Official Gene Set ladful OGSv1.0, translated CDS"
Success

suryasaha commented 2 years ago

In BLAST, "Protein - Ladona fulva Official Gene Set ladful OGSv1.0, CDS" should be "Nucleotide - Ladona fulva Official Gene Set ladful OGSv1.0, CDS".

This needs a fix, @mpoelchau. Sorry for the error!
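One way to catch this kind of mix-up before registering a file is to look at the residues in the FASTA itself. The sketch below is a rough heuristic, not part of the genomics workspace: it calls a file nucleotide when at least 90% of its sequence letters are A, C, G, T, U, or N. The function name `guess_seq_type`, the threshold, and the sampling limit are all illustrative assumptions.

```python
def guess_seq_type(fasta_lines, threshold=0.9, max_residues=10000):
    """Guess 'nucleotide' or 'peptide' from an iterable of FASTA lines.

    Heuristic only (an assumption, not an existing tool): counts the
    fraction of letters that belong to the nucleotide alphabet.
    """
    nucleotide_letters = set("ACGTUN")
    total = 0
    nucleotide_count = 0
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and header lines
        for ch in line.upper():
            if ch.isalpha():  # ignore gaps, stop symbols such as '*'
                total += 1
                if ch in nucleotide_letters:
                    nucleotide_count += 1
        if total >= max_residues:
            break  # a sample of the file is enough for a guess
    if total == 0:
        raise ValueError("no sequence residues found")
    return "nucleotide" if nucleotide_count / total >= threshold else "peptide"
```

It accepts any iterable of lines, so an open file handle works: `with open(path) as fh: guess_seq_type(fh)`. A CDS file would be expected to come back as `nucleotide`, which can then be checked against the type being registered.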

suryasaha commented 2 years ago

https://apollo2-stage-node1.nal.usda.gov/apollo/4517586/jbrowse/index.html

Congratulations, JBrowse is on the web!
However, JBrowse could not start, either because it has not yet been configured and loaded with data, or because of an error.
Error message(s):
Failed to load resource: ColorByType/View/Track/ColorByTypeDraggable.

suryasaha commented 2 years ago

https://apollo.nal.usda.gov/apollo/jbrowse still has old genome

suryasaha commented 2 years ago

Does this need a new annotation page? Already have https://data.nal.usda.gov/dataset/ladona-fulva-official-gene-set-ladfulogsv10

Assembly analysis page https://i5k.nal.usda.gov/bio_data/954997

mpoelchau commented 2 years ago

Yes, it needs a new annotation page. I'll set it up.

mpoelchau commented 2 years ago

Annotation page done: https://i5k.nal.usda.gov/bio_data/954998

mpoelchau commented 2 years ago

@suryasaha I just removed Ladona from prod apollo - the new one is good to load.

suryasaha commented 2 years ago

New genome looks good https://apollo.nal.usda.gov/apollo/4457810/jbrowse/index.html?loc=APVN02018775.1%3A185632..278438&tracks=DNA%2CAnnotations%2CGC%20Content%2CGaps%20in%20assembly%2Cladful_current_models&highlight=

suryasaha commented 2 years ago

Everything should be ok except this GS error on both stage and prod https://github.com/NAL-i5K/general_issues/issues/162#issuecomment-931434842

mpoelchau commented 2 years ago

Fixed the CDS error for blast.

mpoelchau commented 2 years ago

@mpoelchau needs to:

mpoelchau commented 2 years ago

@suryasaha can you make the ladful directory on apollo2-stage group-writable?

suryasaha commented 2 years ago

Should be writable now
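For reference, adding group write permission to a directory tree can be sketched in Python as below. This is only an illustration of the idea, roughly equivalent to `chmod -R g+w`; the helper name `make_group_writable` is hypothetical, and the actual ladful path on apollo2-stage is not given in this thread, so the function takes the directory as an argument.

```python
import os
import stat


def make_group_writable(root):
    """Add the group-write bit to `root` and everything beneath it.

    Hypothetical helper; existing permission bits are preserved.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        # os.walk visits every directory, so dirpath covers the subdirectories.
        targets = [dirpath] + [os.path.join(dirpath, f) for f in filenames]
        for target in targets:
            mode = stat.S_IMODE(os.stat(target).st_mode)
            os.chmod(target, mode | stat.S_IWGRP)
```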

suryasaha commented 2 years ago

Tripal pages, data downloads and Genomics workspace look fine. OGSv1.0 shows up in Apollo. No jbrowse linkouts from blast results, @mpoelchau.

mpoelchau commented 2 years ago

@suryasaha thanks, I added the blast linkout.

mpoelchau commented 2 years ago

Looks like the data downloads need to be set up on stage. @suryasaha, if you point me to the yml file, I can handle it.

suryasaha commented 2 years ago

Thanks! The YML is in /app/data/surya.saha/Organism_Onboarding/jobs/ladful/ on apollo-stage