marbl / CHM13

The complete sequence of a human genome

Problems in chm13v2.0_RefSeq_Liftoff_v5.1.gff3 #95

Open yeeus opened 2 months ago

yeeus commented 2 months ago

I ran Liftoff with the GRCh38.p14 annotation file and it completed without any problems. But when I used chm13v2.0_RefSeq_Liftoff_v5.1.gff3, I got the error below. After some tests, I found that it is caused by chromosome Y (in the session below, gff_file is chrY.gff):

>>> feature_db = gffutils.create_db(gff_file, gff_file + "_db", merge_strategy="create_unique", force=True, disable_infer_transcripts=True, disable_infer_genes=True, verbose=True)
2024-04-20 15:22:46,577 - INFO - Populating features
Populating features table and first-order relations: 13000 features
Traceback (most recent call last):
  File "~/mambaforge/envs/liftoff/lib/python3.10/site-packages/gffutils/create.py", line 622, in _populate_from_lines
    self._insert(f, c)
  File "~/mambaforge/envs/liftoff/lib/python3.10/site-packages/gffutils/create.py", line 566, in _insert
    cursor.execute(constants._INSERT, feature.astuple())
sqlite3.IntegrityError: UNIQUE constraint failed: features.id

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "~/mambaforge/envs/liftoff/lib/python3.10/site-packages/gffutils/create.py", line 1401, in create_db
    c.create()
  File "~/mambaforge/envs/liftoff/lib/python3.10/site-packages/gffutils/create.py", line 543, in create
    self._populate_from_lines(self.iterator)
  File "~/mambaforge/envs/liftoff/lib/python3.10/site-packages/gffutils/create.py", line 656, in _populate_from_lines
    self._insert(f, c)
  File "~/mambaforge/envs/liftoff/lib/python3.10/site-packages/gffutils/create.py", line 566, in _insert
    cursor.execute(constants._INSERT, feature.astuple())
sqlite3.IntegrityError: UNIQUE constraint failed: features.id
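
The `UNIQUE constraint failed: features.id` message means that two features ended up with the same primary key in the database. A quick way to narrow this down is to list the `ID` attributes that occur more than once in chrY.gff. The sketch below uses only the standard library; the helper names `gff3_ids` and `find_id_problems` are made up for illustration. It also rests on an assumption that I have not verified against this file: as far as I understand, `merge_strategy="create_unique"` renames a repeated ID by appending `_1`, `_2`, and so on, so the insert could still fail if the renamed ID (for example `X_1`) is already used by another record. Note that a repeated ID is not necessarily wrong in GFF3, since one feature may legitimately span several lines.

```python
from collections import Counter


def gff3_ids(lines):
    """Yield the ID attribute of every GFF3 feature line that has one."""
    for line in lines:
        # Skip directives, comments, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a well-formed feature line
        for attr in cols[8].split(";"):
            key, _, value = attr.partition("=")
            if key.strip() == "ID":
                yield value.strip()
                break


def find_id_problems(lines):
    """Return (duplicates, collisions) for an iterable of GFF3 lines.

    duplicates: {ID: count} for every ID seen more than once.
    collisions: sorted (ID, renamed_ID) pairs where the assumed renamed form
                "<ID>_<n>" is already present in the file.
    """
    counts = Counter(gff3_ids(lines))
    duplicates = {i: n for i, n in counts.items() if n > 1}
    collisions = sorted(
        (i, f"{i}_{k}")
        for i, n in duplicates.items()
        for k in range(1, n)  # suffixes the repeats would receive (assumed)
        if f"{i}_{k}" in counts
    )
    return duplicates, collisions
```

Calling `find_id_problems(open("chrY.gff"))` returns both results; a non-empty `collisions` list would point at the records worth inspecting first.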

Do you have any idea what might be causing this?