TreesLab / CircMiMi

A package for constructing CLIP-seq data-supported circRNA-miRNA-mRNA interactions
MIT License

Problem in generating references #5

Open DrYuri1989 opened 1 year ago

DrYuri1989 commented 1 year ago

Hello, I'd like to try CircMiMi because my research focuses on circRNA interactions. I tried to generate the reference files, but the step that writes them to the database keeps failing: the first run ended with "Killed", and rerunning the same command raises "UNIQUE constraint failed: chromosome.name". Could you please advise me on how to resolve this? Thanks.

(base) yu@yu-virtual-machine:~/Downloads/CircMiMi-master/circmimi$ circmimi_tools genref --species hsa --source gencode --version 34 refs/
2023-01-20 14:09:28,704 - NumExpr defaulting to 4 threads.
--2023-01-20 14:09:36--  ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_34/gencode.v34.annotation.gtf.gz
           => ‘./gencode.v34.annotation.gtf.gz’
Resolving ftp.ebi.ac.uk (ftp.ebi.ac.uk)... 193.62.193.138
Connecting to ftp.ebi.ac.uk (ftp.ebi.ac.uk)|193.62.193.138|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /pub/databases/gencode/Gencode_human/release_34 ... done.
==> SIZE gencode.v34.annotation.gtf.gz ... 43164654
==> PASV ... done.    ==> RETR gencode.v34.annotation.gtf.gz ... done.
Length: 43164654 (41M) (unauthoritative)

gencode.v34.annotat 100%[===================>]  41.16M  3.23MB/s    in 13s     

2023-01-20 14:09:51 (3.16 MB/s) - ‘./gencode.v34.annotation.gtf.gz’ saved [43164654]

--2023-01-20 14:09:51--  ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_34/GRCh38.primary_assembly.genome.fa.gz
           => ‘./GRCh38.primary_assembly.genome.fa.gz’
Resolving ftp.ebi.ac.uk (ftp.ebi.ac.uk)... 193.62.193.138
Connecting to ftp.ebi.ac.uk (ftp.ebi.ac.uk)|193.62.193.138|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /pub/databases/gencode/Gencode_human/release_34 ... done.
==> SIZE GRCh38.primary_assembly.genome.fa.gz ... 844691642
==> PASV ... done.    ==> RETR GRCh38.primary_assembly.genome.fa.gz ... done.
Length: 844691642 (806M) (unauthoritative)

GRCh38.primary_asse 100%[===================>] 805.56M  2.46MB/s    in 3m 51s  

2023-01-20 14:13:45 (3.49 MB/s) - ‘./GRCh38.primary_assembly.genome.fa.gz’ saved [844691642]

--2023-01-20 14:13:45--  https://www.mirbase.org/ftp/22/mature.fa.gz
Resolving www.mirbase.org (www.mirbase.org)... 130.88.97.249
Connecting to www.mirbase.org (www.mirbase.org)|130.88.97.249|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 803746 (785K) [application/x-gzip]
Saving to: ‘./mature.fa.gz’

mature.fa.gz        100%[===================>] 784.91K   293KB/s    in 2.7s    

2023-01-20 14:13:48 (293 KB/s) - ‘./mature.fa.gz’ saved [803746/803746]

--2023-01-20 14:13:48--  https://mirtarbase.cuhk.edu.cn/~miRTarBase/miRTarBase_2019/cache/download/7.0/miRTarBase_MTI.xlsx
Resolving mirtarbase.cuhk.edu.cn (mirtarbase.cuhk.edu.cn)... 116.31.95.60
Connecting to mirtarbase.cuhk.edu.cn (mirtarbase.cuhk.edu.cn)|116.31.95.60|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 26285698 (25M) [application/vnd.openxmlformats-officedocument.spreadsheetml.sheet]
Saving to: ‘./miRTarBase_MTI.xlsx’

miRTarBase_MTI.xlsx 100%[===================>]  25.07M  1.18MB/s    in 22s     

2023-01-20 14:14:11 (1.15 MB/s) - ‘./miRTarBase_MTI.xlsx’ saved [26285698/26285698]

--2023-01-20 14:14:11--  https://treeslab1.genomics.sinica.edu.tw/CircMiMi/refs/miRDB_data/v6.0/miRDB_v6.0_hsa.simple.tsv.gz
Resolving treeslab1.genomics.sinica.edu.tw (treeslab1.genomics.sinica.edu.tw)... 140.109.55.12
Connecting to treeslab1.genomics.sinica.edu.tw (treeslab1.genomics.sinica.edu.tw)|140.109.55.12|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 6423753 (6.1M) [application/x-gzip]
Saving to: ‘./miRDB_v6.0_hsa.simple.tsv.gz’

miRDB_v6.0_hsa.simp 100%[===================>]   6.13M  1.59MB/s    in 3.8s    

2023-01-20 14:14:15 (1.59 MB/s) - ‘./miRDB_v6.0_hsa.simple.tsv.gz’ saved [6423753/6423753]

--2023-01-20 14:14:15--  https://treeslab1.genomics.sinica.edu.tw/CircMiMi/refs/ENCORI_miRNA/mir_target_ref.ENCORI.miRBase_v22.simple.tsv.gz
Resolving treeslab1.genomics.sinica.edu.tw (treeslab1.genomics.sinica.edu.tw)... 140.109.55.12
Connecting to treeslab1.genomics.sinica.edu.tw (treeslab1.genomics.sinica.edu.tw)|140.109.55.12|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4886391 (4.7M) [application/x-gzip]
Saving to: ‘./mir_target_ref.ENCORI.miRBase_v22.simple.tsv.gz’

mir_target_ref.ENCO 100%[===================>]   4.66M  1.16MB/s    in 4.0s    

2023-01-20 14:14:20 (1.16 MB/s) - ‘./mir_target_ref.ENCORI.miRBase_v22.simple.tsv.gz’ saved [4886391/4886391]

--2023-01-20 14:14:20--  ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_34/gencode.v34.pc_transcripts.fa.gz
           => ‘./gencode.v34.pc_transcripts.fa.gz’
Resolving ftp.ebi.ac.uk (ftp.ebi.ac.uk)... 193.62.193.138
Connecting to ftp.ebi.ac.uk (ftp.ebi.ac.uk)|193.62.193.138|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /pub/databases/gencode/Gencode_human/release_34 ... done.
==> SIZE gencode.v34.pc_transcripts.fa.gz ... 43164068
==> PASV ... done.    ==> RETR gencode.v34.pc_transcripts.fa.gz ... done.
Length: 43164068 (41M) (unauthoritative)

gencode.v34.pc_tran 100%[===================>]  41.16M  3.69MB/s    in 13s     

2023-01-20 14:14:35 (3.10 MB/s) - ‘./gencode.v34.pc_transcripts.fa.gz’ saved [43164068]

--2023-01-20 14:14:35--  ftp://ftp.ebi.ac.uk/pub/databases/gencode/Gencode_human/release_34/gencode.v34.lncRNA_transcripts.fa.gz
           => ‘./gencode.v34.lncRNA_transcripts.fa.gz’
Resolving ftp.ebi.ac.uk (ftp.ebi.ac.uk)... 193.62.193.138
Connecting to ftp.ebi.ac.uk (ftp.ebi.ac.uk)|193.62.193.138|:21... connected.
Logging in as anonymous ... Logged in!
==> SYST ... done.    ==> PWD ... done.
==> TYPE I ... done.  ==> CWD (1) /pub/databases/gencode/Gencode_human/release_34 ... done.
==> SIZE gencode.v34.lncRNA_transcripts.fa.gz ... 14764808
==> PASV ... done.    ==> RETR gencode.v34.lncRNA_transcripts.fa.gz ... done.
Length: 14764808 (14M) (unauthoritative)

gencode.v34.lncRNA_ 100%[===================>]  14.08M  3.68MB/s    in 4.7s    

2023-01-20 14:14:42 (3.01 MB/s) - ‘./gencode.v34.lncRNA_transcripts.fa.gz’ saved [14764808]

--2023-01-20 14:14:42--  https://treeslab1.genomics.sinica.edu.tw/CircMiMi/refs/ENCORI_RBP/ENCORI_RBP_binding_sites.hg38.AGO.bed.gz
Resolving treeslab1.genomics.sinica.edu.tw (treeslab1.genomics.sinica.edu.tw)... 140.109.55.12
Connecting to treeslab1.genomics.sinica.edu.tw (treeslab1.genomics.sinica.edu.tw)|140.109.55.12|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 18687603 (18M) [application/x-gzip]
Saving to: ‘./ENCORI_RBP_binding_sites.hg38.AGO.bed.gz’

ENCORI_RBP_binding_ 100%[===================>]  17.82M  3.67MB/s    in 7.5s    

2023-01-20 14:14:50 (2.37 MB/s) - ‘./ENCORI_RBP_binding_sites.hg38.AGO.bed.gz’ saved [18687603/18687603]

--2023-01-20 14:19:22--  https://www.mirbase.org/ftp/21/miRNA.dat.gz
Resolving www.mirbase.org (www.mirbase.org)... 130.88.97.249
Connecting to www.mirbase.org (www.mirbase.org)|130.88.97.249|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3245620 (3.1M) [application/x-gzip]
Saving to: ‘./miRNA.dat.gz’

miRNA.dat.gz        100%[===================>]   3.09M   864KB/s    in 4.3s    

2023-01-20 14:19:27 (731 KB/s) - ‘./miRNA.dat.gz’ saved [3245620/3245620]

--2023-01-20 14:19:27--  https://www.mirbase.org/ftp/22/miRNA.dat.gz
Resolving www.mirbase.org (www.mirbase.org)... 130.88.97.249
Connecting to www.mirbase.org (www.mirbase.org)|130.88.97.249|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4115022 (3.9M) [application/x-gzip]
Saving to: ‘./miRNA.dat.gz’

miRNA.dat.gz        100%[===================>]   3.92M  1.10MB/s    in 4.4s    

2023-01-20 14:19:32 (906 KB/s) - ‘./miRNA.dat.gz’ saved [4115022/4115022]

--2023-01-20 14:19:32--  https://www.mirbase.org/ftp/22/miRNA.diff.gz
Resolving www.mirbase.org (www.mirbase.org)... 130.88.97.249
Connecting to www.mirbase.org (www.mirbase.org)|130.88.97.249|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 133886 (131K) [application/x-gzip]
Saving to: ‘./miRNA.diff.gz’

miRNA.diff.gz       100%[===================>] 130.75K   147KB/s    in 0.9s    

2023-01-20 14:19:34 (147 KB/s) - ‘./miRNA.diff.gz’ saved [133886/133886]

Killed
(base) yu@yu-virtual-machine:~/Downloads/CircMiMi-master/circmimi$ circmimi_tools genref --species hsa --source gencode --version 34 refs/
2023-01-20 14:20:40,995 - NumExpr defaulting to 4 threads.
Traceback (most recent call last):
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
sqlite3.IntegrityError: UNIQUE constraint failed: chromosome.name

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/yu/anaconda3/bin/circmimi_tools", line 8, in <module>
    sys.exit(cli())
  File "/home/yu/anaconda3/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/yu/anaconda3/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/circmimi/scripts/circmimi_tools.py", line 164, in generate_references
    info, ref_files = genref.generate(species, source, version, ref_dir)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/circmimi/reference/genref.py", line 311, in generate
    anno_ref.generate()
  File "/home/yu/anaconda3/lib/python3.9/site-packages/circmimi/reference/genref.py", line 26, in generate
    gendb.generate(self.src_name, self.filename)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/circmimi/reference/gendb.py", line 196, in generate
    _write_data_to_db(session, tables_raw_data.chromosomes, Chromosome)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/circmimi/reference/gendb.py", line 179, in _write_data_to_db
    session.commit()
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1451, in commit
    self._transaction.commit(_to_root=self.future)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 829, in commit
    self._prepare_impl()
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 808, in _prepare_impl
    self.session.flush()
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3383, in flush
    self._flush(objects)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3523, in _flush
    transaction.rollback(_capture_exception=True)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/util/langhelpers.py", line 70, in __exit__
    compat.raise_(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 208, in raise_
    raise exception
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 3483, in _flush
    flush_context.execute()
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 456, in execute
    rec.execute(self)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/orm/unitofwork.py", line 630, in execute
    util.preloaded.orm_persistence.save_obj(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/orm/persistence.py", line 245, in save_obj
    _emit_insert_statements(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/orm/persistence.py", line 1238, in _emit_insert_statements
    result = connection._execute_20(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1631, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 332, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1498, in _execute_clauseelement
    ret = self._execute_context(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1862, in _execute_context
    self._handle_dbapi_exception(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2043, in _handle_dbapi_exception
    util.raise_(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 208, in raise_
    raise exception
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.IntegrityError: (sqlite3.IntegrityError) UNIQUE constraint failed: chromosome.name
[SQL: INSERT INTO chromosome (name) VALUES (?)]
[parameters: ('chr1',)]
(Background on this error at: https://sqlalche.me/e/14/gkpj)
(base) yu@yu-virtual-machine:~/Downloads/CircMiMi-master/circmimi$ circmimi_tools checking -r refs/ -i circRNAs.gencode_format.tsv -o circRNAs.gencode_format. --dist 5000
2023-01-20 14:24:13,174 - NumExpr defaulting to 4 threads.
Traceback (most recent call last):
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
sqlite3.OperationalError: no such table: chromosome

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/yu/anaconda3/bin/circmimi_tools", line 8, in <module>
    sys.exit(cli())
  File "/home/yu/anaconda3/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/yu/anaconda3/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/circmimi/scripts/circmimi_tools.py", line 324, in default_checking
    ctx.invoke(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/circmimi/scripts/circmimi_tools.py", line 208, in check_annotation
    circ_events.check_annotation(anno_db)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/circmimi/circ.py", line 146, in check_annotation
    self._annotator = Annotator(anno_db_file)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/circmimi/annotation.py", line 116, in __init__
    self._db = Annotation(anno_db_file)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/circmimi/annotation.py", line 17, in __init__
    self.session.query(Chromosome.name, Chromosome.id).all()
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 2768, in all
    return self._iter().all()
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 2903, in _iter
    result = self.session.execute(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1712, in execute
    result = conn._execute_20(statement, params or {}, execution_options)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1631, in _execute_20
    return meth(self, args_10style, kwargs_10style, execution_options)
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 332, in _execute_on_connection
    return connection._execute_clauseelement(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1498, in _execute_clauseelement
    ret = self._execute_context(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1862, in _execute_context
    self._handle_dbapi_exception(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2043, in _handle_dbapi_exception
    util.raise_(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 208, in raise_
    raise exception
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1819, in _execute_context
    self.dialect.do_execute(
  File "/home/yu/anaconda3/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 732, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: chromosome
[SQL: SELECT chromosome.name AS chromosome_name, chromosome.id AS chromosome_id 
FROM chromosome]
(Background on this error at: https://sqlalche.me/e/14/e3q8)

chiangtw commented 1 year ago

Hi,

Please remove the file gencode.v34.annotation.db from your refs directory, and then run the genref command again.

tw
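
Note on why removing the file helps: the first genref run in the log above ended with "Killed", so it plausibly left a partially written gencode.v34.annotation.db behind, and the rerun then tried to insert chr1 into a table that already contained it. The following is a minimal sketch of that failure mode using only Python's standard sqlite3 module, not CircMiMi's actual code. The schema is an assumption (the traceback only shows a `chromosome` table with `id` and `name` columns and a UNIQUE constraint on `name`), and `build_annotation_db`, `rebuild_from_scratch`, and `demo` are hypothetical names introduced for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path


def build_annotation_db(db_path: Path, chromosomes: list[str]) -> None:
    """Hypothetical stand-in for genref's database-building step."""
    conn = sqlite3.connect(db_path)
    try:
        # Assumed schema: the traceback only reveals a `chromosome` table
        # with `id` and `name` columns and a UNIQUE constraint on `name`.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS chromosome ("
            "id INTEGER PRIMARY KEY, name TEXT UNIQUE)"
        )
        conn.executemany(
            "INSERT INTO chromosome (name) VALUES (?)",
            [(name,) for name in chromosomes],
        )
        conn.commit()
    finally:
        # Closing without a commit discards any half-finished insert.
        conn.close()


def rebuild_from_scratch(db_path: Path, chromosomes: list[str]) -> list[str]:
    """Delete any leftover database file, rebuild it, and return the stored names."""
    db_path.unlink(missing_ok=True)  # same effect as removing the file by hand
    build_annotation_db(db_path, chromosomes)
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute("SELECT name FROM chromosome ORDER BY id").fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


def demo() -> tuple:
    """Reproduce the rerun failure, then show that a clean rebuild succeeds."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = Path(tmp) / "gencode.v34.annotation.db"
        # Simulate the interrupted first run: only chr1 was written.
        build_annotation_db(db_path, ["chr1"])
        error = None
        try:
            # Simulate the rerun against the leftover file.
            build_annotation_db(db_path, ["chr1", "chr2"])
        except sqlite3.IntegrityError as exc:
            error = str(exc)
        names = rebuild_from_scratch(db_path, ["chr1", "chr2"])
    return error, names


if __name__ == "__main__":
    print(demo())
```

In the situation reported here, the equivalent manual step is to delete refs/gencode.v34.annotation.db and rerun the genref command from the original post. Because "Killed" often indicates that the operating system stopped the process (for example when a virtual machine runs out of memory), it may also be worth checking the memory available to the VM before rerunning; that cause is a guess and is not confirmed anywhere in this thread.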