Clinical-Genomics / scout

VCF visualization interface
https://clinical-genomics.github.io/scout
BSD 3-Clause "New" or "Revised" License

KeyError: 'ensembl_transcript_id' when updating genes collection #2691

Closed: northwestwitch closed this issue 3 years ago

northwestwitch commented 3 years ago
Traceback (most recent call last):
  File "/home/hiseq.clinical/miniconda//envs/prod-stage/bin/scout", line 8, in <module>
    sys.exit(cli())
  File "/home/hiseq.clinical/miniconda/envs/prod-stage/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/hiseq.clinical/miniconda/envs/prod-stage/lib/python3.6/site-packages/flask/cli.py", line 586, in main
    return super(FlaskGroup, self).main(*args, **kwargs)
  File "/home/hiseq.clinical/miniconda/envs/prod-stage/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/hiseq.clinical/miniconda/envs/prod-stage/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/hiseq.clinical/miniconda/envs/prod-stage/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/hiseq.clinical/miniconda/envs/prod-stage/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/hiseq.clinical/miniconda/envs/prod-stage/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/hiseq.clinical/miniconda/envs/prod-stage/lib/python3.6/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/hiseq.clinical/miniconda/envs/prod-stage/lib/python3.6/site-packages/flask/cli.py", line 426, in decorator
    return __ctx.invoke(f, *args, **kwargs)
  File "/home/hiseq.clinical/miniconda/envs/prod-stage/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/hiseq.clinical/miniconda/envs/prod-stage/lib/python3.6/site-packages/scout/commands/update/genes.py", line 106, in genes
    transcripts = load_transcripts(adapter, ensembl_transcripts, genome_build, ensembl_genes)
  File "/home/hiseq.clinical/miniconda/envs/prod-stage/lib/python3.6/site-packages/scout/load/transcript.py", line 35, in load_transcripts
    transcripts_dict = parse_transcripts(transcripts_lines)
  File "/home/hiseq.clinical/miniconda/envs/prod-stage/lib/python3.6/site-packages/scout/parse/ensembl.py", line 119, in parse_transcripts
    tx_id = tx["ensembl_transcript_id"]
KeyError: 'ensembl_transcript_id'

Originally posted by @northwestwitch in https://github.com/Clinical-Genomics/scout/issues/2543#issuecomment-854149504

northwestwitch commented 3 years ago

I'm working on this, but there is not much we can do at the moment. The URL you get redirected to when downloading genes_to_phenotype.html (https://ci.monarchinitiative.org/view/hpo/job/hpo.annotations/lastSuccessfulBuild/artifact/rare-diseases/util/annotation/phenotype_to_genes.txt) is down.

I'll try again later today.

northwestwitch commented 3 years ago

Nope, still down. I'll check whether I can open an issue somewhere, in case they are not already aware of the problem.

dnil commented 3 years ago

ci.monarchinitiative.org is up again!

northwestwitch commented 3 years ago

ci.monarchinitiative.org is up again!

Finally! I'll work on fixing this today, then!

northwestwitch commented 3 years ago

Here is another error, which you can reproduce by running "scout --demo update genes" locally:

Traceback (most recent call last):
  File "/Users/chiararasi/miniconda3/envs/py38/bin/scout", line 33, in <module>
    sys.exit(load_entry_point('scout-browser', 'console_scripts', 'scout')())
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/flask/cli.py", line 596, in main
    return super().main(*args, **kwargs)
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/flask/cli.py", line 440, in decorator
    return __ctx.invoke(f, *args, **kwargs)
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/Users/chiararasi/Documents/work/GITs/scout/scout/commands/update/genes.py", line 87, in genes
    hgnc_genes = load_hgnc_genes(
  File "/Users/chiararasi/Documents/work/GITs/scout/scout/load/hgnc_gene.py", line 155, in load_hgnc_genes
    adapter.load_hgnc_bulk(gene_objects)
  File "/Users/chiararasi/Documents/work/GITs/scout/scout/adapter/mongo/hgnc.py", line 40, in load_hgnc_bulk
    result = self.hgnc_collection.insert_many(gene_objs)
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/pymongo/collection.py", line 746, in insert_many
    raise TypeError("documents must be a non-empty list")
TypeError: documents must be a non-empty list
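
The traceback above shows pymongo's insert_many rejecting an empty list. One way to surface the real problem earlier is to check the list before the bulk insert and fail with a message that points at the upstream step. This is only a minimal sketch: `insert_non_empty` and its `what` parameter are hypothetical names, not part of scout, and `collection` is assumed to be anything exposing pymongo's `insert_many(documents)`.

```python
from typing import Any, Dict, List


def insert_non_empty(collection: Any, documents: List[Dict[str, Any]], what: str = "documents") -> int:
    """Bulk-insert documents, failing early with a descriptive error if the list is empty.

    Hypothetical helper: `collection` only needs an `insert_many` method that
    returns an object with an `inserted_ids` attribute, as pymongo collections do.
    Returns the number of inserted documents.
    """
    if not documents:
        # An empty list here usually means the download or parsing step
        # upstream produced nothing, so say that instead of letting
        # insert_many raise a generic TypeError.
        raise ValueError(
            f"No {what} to insert: the upstream download or parsing step returned nothing"
        )
    result = collection.insert_many(documents)
    return len(result.inserted_ids)
```

Raising instead of silently skipping the insert is a deliberate choice in this sketch, since an empty gene list most likely means the update should not continue.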

northwestwitch commented 3 years ago

Printing debug messages helped me understand that the original error is caused by a database connection error on BioMart:

2021-06-07 11:47:17 n129-p71.local scout.parse.ensembl[66101] ERROR Query ERROR: caught BioMart::Exception::Database: Could not connect to mysql database ensembl_mart_104: DBI connect('database=ensembl_mart_104;host=127.0.0.1;port=5316','ensro',...) failed: Can't connect to MySQL server on '127.0.0.1' (111) at /nfs/public/ro/ensweb/live/mart/www_104/biomart-perl/lib/BioMart/Configuration/DBLocation.pm line 98.
Traceback (most recent call last):
  File "/Users/chiararasi/miniconda3/envs/py38/bin/scout", line 33, in <module>
    sys.exit(load_entry_point('scout-browser', 'console_scripts', 'scout')())
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/flask/cli.py", line 596, in main
    return super().main(*args, **kwargs)
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/core.py", line 1668, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/flask/cli.py", line 440, in decorator
    return __ctx.invoke(f, *args, **kwargs)
  File "/Users/chiararasi/miniconda3/envs/py38/lib/python3.8/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/Users/chiararasi/Documents/work/GITs/scout/scout/commands/update/genes.py", line 157, in genes
    transcripts = load_transcripts(adapter, ensembl_transcripts, genome_build, ensembl_genes)
  File "/Users/chiararasi/Documents/work/GITs/scout/scout/load/transcript.py", line 35, in load_transcripts
    transcripts_dict = parse_transcripts(transcripts_lines)
  File "/Users/chiararasi/Documents/work/GITs/scout/scout/parse/ensembl.py", line 119, in parse_transcripts
    tx_id = tx["ensembl_transcript_id"]
KeyError: 'ensembl_transcript_id'
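
A possible way to make this failure easier to diagnose is to check the lines coming back from BioMart before they reach the parser, so that a server-side error is reported as such rather than as a KeyError. The sketch below assumes the response body itself starts with the "Query ERROR" text seen in the log message above. `checked_biomart_lines` and `BiomartQueryError` are hypothetical names, not existing scout code.

```python
from typing import Iterable, Iterator

# Prefix taken from the log message above; assumed to open the error response.
BIOMART_ERROR_PREFIX = "Query ERROR"


class BiomartQueryError(RuntimeError):
    """Hypothetical exception for a BioMart response that carries an error message."""


def checked_biomart_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield lines unchanged, raising if one of them is a BioMart error message.

    The check is lazy: the error is raised when the offending line is reached
    during iteration, not when this function is called.
    """
    for line in lines:
        if line.startswith(BIOMART_ERROR_PREFIX):
            raise BiomartQueryError(line.strip())
        yield line
```

With a wrapper like this, the lines passed on to the transcript parser would either be real data or stop with an exception that carries the BioMart message.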