mpoelchau closed this issue 3 years ago
This works fine on stage, but I'm getting this error on prod for the genome assembly. All analysis files were processed fine.
(i5k) [i5k@i5k-node1 ~]$ python manage.py blast_utility /usr/local/i5k/media/blast/db/GCF_018152535.1_ASM1815253v1_genomic.fna -m
Traceback (most recent call last):
File "manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/usr/local/i5k/lib/python3.6/site-packages/django/core/management/__init__.py", line 381, in execute_from_command_line
utility.execute()
File "/usr/local/i5k/lib/python3.6/site-packages/django/core/management/__init__.py", line 375, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/usr/local/i5k/lib/python3.6/site-packages/django/core/management/base.py", line 323, in run_from_argv
self.execute(*args, **cmd_options)
File "/usr/local/i5k/lib/python3.6/site-packages/django/core/management/base.py", line 364, in execute
output = self.handle(*args, **options)
File "/app/local/i5k/blast/management/commands/blast_utility.py", line 18, in handle
blast = BlastDb.objects.get(title = title)
File "/usr/local/i5k/lib/python3.6/site-packages/django/db/models/manager.py", line 82, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/usr/local/i5k/lib/python3.6/site-packages/django/db/models/query.py", line 408, in get
self.model._meta.object_name
blast.models.DoesNotExist: BlastDb matching query does not exist.
@suryasaha it looks like the genome fasta file is not loaded (see the query below). You can probably just rerun that command, or let me know if there's a problem with the load. Per the droeug issue, use just the file name of the fasta, not the full path:
django=# select app_organism.short_name,blast_blastdb.id,title from blast_blastdb inner join app_organism on ( blast_blastdb.organism_id = app_organism.id ) where short_name = 'drokik';
short_name | id | title
------------+-----+---------------------------------------------------
drokik | 518 | GCF_018152535.1_ASM1815253v1_cds_from_genomic.fna
drokik | 516 | GCF_018152535.1_ASM1815253v1_rna_from_genomic.fna
drokik | 517 | GCF_018152535.1_ASM1815253v1_translated_cds.faa
drokik | 33 | Dkik02082011-genome.fa
drokik | 278 | GCF_000224215.1_Dkik_2.0_rna.fna
drokik | 270 | GCF_000224215.1_Dkik_2.0_genomic.fna
drokik | 286 | GCF_000224215.1_Dkik_2.0_rna_from_genomic.fna
drokik | 302 | GCF_000224215.1_Dkik_2.0_protein.faa
drokik | 294 | GCF_000224215.1_Dkik_2.0_cds_from_genomic.fna
drokik | 34 | DKIK.fna
drokik | 35 | DKIK.faa
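A minimal sketch of the "file name, not full path" point, assuming (from the traceback's `BlastDb.objects.get(title = title)` and the bare file names in the `title` column above) that the command argument is matched directly against `title`. The helper name `blast_title_from_path` is hypothetical and not part of the i5k code base:

```python
import os


def blast_title_from_path(fasta_path: str) -> str:
    """Return the bare file name of a fasta path.

    Hypothetical helper: the titles stored in blast_blastdb look like
    plain file names, so a full path would never match one of them.
    """
    return os.path.basename(fasta_path)


# Path taken from the failing command earlier in this thread.
full_path = "/usr/local/i5k/media/blast/db/GCF_018152535.1_ASM1815253v1_genomic.fna"
title = blast_title_from_path(full_path)
print(title)
```

In this case the genome row was also missing from the table entirely, so the bare name alone would not have been enough until `addblast` was run.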
Loading genome without full path as suggested
(i5k) [i5k@i5k-node1 ~]$ python manage.py addblast Drosophila kikkawai -t nucleotide Genome Assembly -f GCF_018152535.1_ASM1815253v1_genomic.fna -d 'Drosophila kikkawai genome assembly, ASM1815383v1'
you can move to makeblastdb and populate sequence step
(i5k) [i5k@i5k-node1 ~]$ python manage.py blast_utility GCF_018152535.1_ASM1815253v1_genomic.fna -m
1 species finished
all done
(i5k) [i5k@i5k-node1 ~]$ python manage.py blast_utility GCF_018152535.1_ASM1815253v1_genomic.fna -p
1 species finished
all done
(i5k) [i5k@i5k-node1 ~]$ python manage.py blast_shown GCF_018152535.1_ASM1815253v1_genomic.fna --shown true
1 species finished
all done
The commands work, but I don't see the new data sets on prod.
The genome assembly now shows up. It looks like the annotation fastas aren't set to 'is shown'. I'd guess you should run those commands again.
django=# select app_organism.short_name,blast_blastdb.id,title,is_shown from blast_blastdb inner join app_organism on ( blast_blastdb.organism_id = app_organism.id ) where short_name = 'drokik';
short_name | id | title | is_shown
------------+-----+---------------------------------------------------+----------
drokik | 519 | GCF_018152535.1_ASM1815253v1_genomic.fna | t
drokik | 518 | GCF_018152535.1_ASM1815253v1_cds_from_genomic.fna | f
drokik | 516 | GCF_018152535.1_ASM1815253v1_rna_from_genomic.fna | f
drokik | 517 | GCF_018152535.1_ASM1815253v1_translated_cds.faa | f
drokik | 33 | Dkik02082011-genome.fa | f
drokik | 278 | GCF_000224215.1_Dkik_2.0_rna.fna | t
drokik | 270 | GCF_000224215.1_Dkik_2.0_genomic.fna | t
drokik | 286 | GCF_000224215.1_Dkik_2.0_rna_from_genomic.fna | t
drokik | 302 | GCF_000224215.1_Dkik_2.0_protein.faa | t
drokik | 294 | GCF_000224215.1_Dkik_2.0_cds_from_genomic.fna | t
drokik | 34 | DKIK.fna | t
drokik | 35 | DKIK.faa | t
(12 rows)
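As a rough illustration of "run those commands again", the sketch below lists the `blast_shown` invocations for the new-assembly fastas that are still hidden. The rows are copied from the query output above (a subset); the function name `hidden_titles` is hypothetical, and the command form is the one already used in this thread:

```python
# (title, is_shown) pairs copied from the query output above.
rows = [
    ("GCF_018152535.1_ASM1815253v1_genomic.fna", True),
    ("GCF_018152535.1_ASM1815253v1_cds_from_genomic.fna", False),
    ("GCF_018152535.1_ASM1815253v1_rna_from_genomic.fna", False),
    ("GCF_018152535.1_ASM1815253v1_translated_cds.faa", False),
    ("Dkik02082011-genome.fa", False),
    ("GCF_000224215.1_Dkik_2.0_genomic.fna", True),
]


def hidden_titles(rows, accession):
    """Return titles for one assembly accession that are not shown yet."""
    return [
        title
        for title, is_shown in rows
        if title.startswith(accession) and not is_shown
    ]


# Restrict to the new assembly so older hidden entries are left alone.
for title in hidden_titles(rows, "GCF_018152535.1"):
    print(f"python manage.py blast_shown {title} --shown true")
```

Filtering on the accession prefix is a deliberate choice here: `Dkik02082011-genome.fa` is also hidden, but nothing in the thread suggests it should be shown.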
@suryasaha looks like the description (which is what shows up in the app UI) is incorrect (note I'm just showing the query results from the genome fasta below for clarity):
django=> select app_organism.short_name,blast_blastdb.id,title,blast_blastdb.description from blast_blastdb inner join app_organism on ( blast_blastdb.organism_id = app_organism.id ) where short_name = 'drokik';
short_name | id | title | description
------------+-----+---------------------------------------------------+----------------------------------------------------------------------------------------
drokik | 519 | GCF_018152535.1_ASM1815253v1_genomic.fna | Drosophila kikkawai genome assembly, ASM1815383v1
...
I will update the description in the database.
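The thread does not say exactly what was wrong with the description. One visible difference in the row above is the assembly name: the title carries `ASM1815253v1`, the description `ASM1815383v1`. Assuming that kind of mismatch is worth catching before loading, a small check could look like the sketch below. The regex and function names are hypothetical and only cover names of the form `ASM<digits>v<digits>`:

```python
import re

# Matches assembly names such as ASM1815253v1 (assumed naming pattern).
ASSEMBLY_NAME = re.compile(r"ASM\d+v\d+")


def assembly_names(text):
    """Return the set of ASM-style assembly names found in text."""
    return set(ASSEMBLY_NAME.findall(text))


def description_matches_title(title, description):
    """Return False only when both strings carry an assembly name and they differ."""
    title_names = assembly_names(title)
    description_names = assembly_names(description)
    if not title_names or not description_names:
        return True  # nothing to compare, so do not flag
    return title_names == description_names


title = "GCF_018152535.1_ASM1815253v1_genomic.fna"
description = "Drosophila kikkawai genome assembly, ASM1815383v1"
print(description_matches_title(title, description))
```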
I updated the description for the genome for this organism on both stage and prod. @suryasaha can you please verify?
Haha.. so did I. Looks good now. Linkouts to jbrowse work fine too @mpoelchau
@mpoelchau I found no copyright-free image for the organism page. This one might work, but no clear CC license terms are listed: https://bugguide.net/node/view/642813/bgpage
This is a fairly straightforward genome assembly update. See https://gitlab.com/i5k_Workspace/workspace_roadmap/-/wikis/Adding-an-organism-CWL-update for a full description of each task (requires gitlab login).
assembly accession number: GCF_018152535.1
publication associated with assembly: https://doi.org/10.7554/eLife.66405
[x] Shelly: Run the final_workflow.cwl component of the organism_onboarding pipeline. THIS WILL NEED THE 'NO GAP' VERSION OF THE CONTAINER. https://gitlab.com/i5k_Workspace/workspace_roadmap/-/wikis/Adding-an-organism-CWL-update#1-run-the-final_workflowcwl-component-of-the-organism_onboarding-pipeline-httpsgithubcomnal-i5korganism_onboarding
[x] Shelly: Add gene page linkouts from Apollo. https://gitlab.com/i5k_Workspace/workspace_roadmap/-/wikis/Adding-an-organism-CWL-update#2-add-gene-page-linkouts-from-apollo
[x] Shelly: Run RNA-Seq pipeline. https://gitlab.com/i5k_Workspace/workspace_roadmap/-/wikis/Adding-RNA-Seq-data-to-an-i5k-Workspace-project-with-the-NAL_RNA_seq_annotation_pipeline
[x] Monica: Run gene processing pipeline on Ceres for gene pages.
Rest: Monica and Surya will divide up.