XiaoTaoWang / EagleC

A deep-learning framework for predicting a full range of structural variations from bulk and single-cell contact maps

‘Failed to find sqlite3 column type for category’ while running annotate-gene-fusion #43

Open Wong718 opened 2 months ago

Wong718 commented 2 months ago

Hello Xiaotao, this is a wonderful tool for SV detection. I ran into an issue at the annotate-gene-fusion step.

The command I ran:

annotate-gene-fusion --sv-file HS_E_24_ESCC.CNN_SVs.5K_combined.txt --output-file HS_E_24_ESCC.gene-fusions.txt --buff-size 10000 --skip-rows 1 --ensembl-release 93 --species human

The error output:

INFO:pyensembl.database:Creating database: /dshare/home/wangzj/.cache/pyensembl/GRCh38/ensembl93/Homo_sapiens.GRCh38.93.gtf.db
INFO:pyensembl.database:Reading GTF from /dshare/home/wangzj/.cache/pyensembl/GRCh38/ensembl93/Homo_sapiens.GRCh38.93.gtf.gz
INFO:root:Extracted GTF attributes: ['gene_id', 'gene_version', 'gene_name', 'gene_source', 'gene_biotype', 'transcript_id', 'transcript_version', 'transcript_name', 'transcript_source', 'transcript_biotype', 'tag', 'transcript_support_level', 'exon_number', 'exon_id', 'exon_version', 'protein_id', 'protein_version', 'ccds_id']
Traceback (most recent call last):
  File "/dshare/home/wangzj/env_set/miniconda3/envs/EagleC/bin/annotate-gene-fusion", line 97, in <module>
    run()
  File "/dshare/home/wangzj/env_set/miniconda3/envs/EagleC/bin/annotate-gene-fusion", line 61, in run
    db.index()
  File "/dshare/home/wangzj/env_set/miniconda3/envs/EagleC/lib/python3.8/site-packages/pyensembl/genome.py", line 280, in index
    self.db.connect_or_create(overwrite=overwrite)
  File "/dshare/home/wangzj/env_set/miniconda3/envs/EagleC/lib/python3.8/site-packages/pyensembl/database.py", line 286, in connect_or_create
    return self.create(overwrite=overwrite)
  File "/dshare/home/wangzj/env_set/miniconda3/envs/EagleC/lib/python3.8/site-packages/pyensembl/database.py", line 239, in create
    self._connection = datacache.db_from_dataframes_with_absolute_path(
  File "/dshare/home/wangzj/env_set/miniconda3/envs/EagleC/lib/python3.8/site-packages/datacache/database_helpers.py", line 176, in db_from_dataframes_with_absolute_path
    tables = build_tables(
  File "/dshare/home/wangzj/env_set/miniconda3/envs/EagleC/lib/python3.8/site-packages/datacache/database_helpers.py", line 136, in build_tables
    table = DatabaseTable.from_dataframe(
  File "/dshare/home/wangzj/env_set/miniconda3/envs/EagleC/lib/python3.8/site-packages/datacache/database_table.py", line 56, in from_dataframe
    column_db_type = db_type(values.dtype)
  File "/dshare/home/wangzj/env_set/miniconda3/envs/EagleC/lib/python3.8/site-packages/datacache/database_types.py", line 96, in db_type
    raise ValueError("Failed to find sqlite3 column type for %s" % (
ValueError: Failed to find sqlite3 column type for category

Can you help me fix this issue? Thanks a lot.

Krithika-Bhuvan commented 2 months ago

Upvoting this - got the same error today

XiaoTaoWang commented 2 months ago

Hi all, this error seems to be related to the installation of pyensembl or a version incompatibility. I’ll look into it and get back to you once I’ve figured it out.

jianglinghan commented 1 month ago

Pinning these versions should solve it, give it a try: pip install numpy==1.23.5 pyensembl==2.3.13 @Wong718 @Krithika-Bhuvan
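
To confirm that the environment really picked up the pinned versions, a quick check along these lines could be run inside the EagleC conda environment before retrying annotate-gene-fusion. This is only a sketch: the helper names installed_version and compare_with_pins are made up for illustration and are not part of EagleC or pyensembl, and the pins are simply the ones suggested in this comment.

```python
from importlib import metadata

# Pins copied from the suggestion in this thread (not an official requirement).
SUGGESTED_PINS = {"numpy": "1.23.5", "pyensembl": "2.3.13"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_with_pins(pins, lookup=installed_version):
    """Map each package name to (installed, suggested, matches)."""
    report = {}
    for package, wanted in pins.items():
        found = lookup(package)
        report[package] = (found, wanted, found == wanted)
    return report


if __name__ == "__main__":
    for package, (found, wanted, ok) in compare_with_pins(SUGGESTED_PINS).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{package}: installed={found} suggested={wanted} [{status}]")
```

If either line reports MISMATCH, the pip install probably went into a different environment than the one annotate-gene-fusion runs from.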