qiyunlab / HGTector

HGTector2: Genome-wide prediction of horizontal gene transfer based on distribution of sequence homology patterns.
BSD 3-Clause "New" or "Revised" License

ValueError: diamond failed with error code 1. #114

Closed taotaoyuan closed 1 year ago

taotaoyuan commented 1 year ago

Hi Qiyun, I am using the HGTector2 pipeline to investigate HGT events in my genome (pep.fa). I set up my local DIAMOND database using:

hgtector database -o database2 --default

Everything seemed to be OK, but when I use the search function, it always ends with the traceback shown below:

/home/lx_sky6/software/HGTector/scripts/hgtector search -i Rjup_protein.faa -o ./Rjup_protein -d /home/lx_sky6/software/HGTector/database2/diamond/db -p 30 -t /home/lx_sky6/software/HGTector/database2/taxdump

Homology search started at 2023-04-03 14:11:52.743636.
Settings:
  Search method: diamond.
  Self-alignment method: native.
  Remote fetch enabled: no.
Reading input proteins...
  Rjup_protein: 39509 proteins.
Done. Read 39509 proteins from 1 samples.
Dropping sequences shorter than 30 aa... done.
Reading local taxonomy database... done.
  Read 52713 taxa.
Batch homology search of Rjup_protein started at 2023-04-03 14:11:53.170709.
Number of queries: 39509.
Traceback (most recent call last):
  File "/home/lx_sky6/software/HGTector/scripts/hgtector", line 96, in <module>
    main()
  File "/home/lx_sky6/software/HGTector/scripts/hgtector", line 35, in main
    module(args)
  File "/home/lx_sky6/software/miniconda3/envs/hgtector/lib/python3.11/site-packages/hgtector-2.0b3-py3.11.egg/hgtector/search.py", line 185, in __call__
    for id, score in self.selfaln_wf(seqs2a, res).items():
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lx_sky6/software/miniconda3/envs/hgtector/lib/python3.11/site-packages/hgtector-2.0b3-py3.11.egg/hgtector/search.py", line 941, in selfaln_wf
    res = self.diamond_selfaln(batch)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lx_sky6/software/miniconda3/envs/hgtector/lib/python3.11/site-packages/hgtector-2.0b3-py3.11.egg/hgtector/search.py", line 2115, in diamond_selfaln
    raise ValueError(f'diamond failed with error code {ec}.')
ValueError: diamond failed with error code 1.

What's wrong with this? Waiting for your reply. Thank you very much. Regards
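The ValueError above reports only the exit code of the external program, not the reason it failed. A minimal sketch of one way to surface that reason when rerunning an external command by hand: capture its stderr and include it in the error. `run_and_report` is a hypothetical helper and not part of HGTector; the demo uses a Python child process as a stand-in for the real command.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run a command; on failure, raise an error that includes its stderr."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        raise RuntimeError(
            f"{cmd[0]} failed with error code {proc.returncode}:\n"
            f"{proc.stderr.strip()}"
        )
    return proc.stdout


if __name__ == "__main__":
    # Stand-in for a failing external tool: a child process that writes
    # a message to stderr and exits with code 1.
    failing = [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write('example message\\n'); sys.exit(1)",
    ]
    try:
        run_and_report(failing)
    except RuntimeError as err:
        print(err)
```

Replacing the stand-in with the actual failing command line would show whatever message that program wrote before exiting.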

XiaomengWang0413 commented 1 year ago

I had the same problem. How does the author solve this problem?

qiyunzhu commented 1 year ago

@XiaomengWang0413 @taotaoyuan Thanks for reporting this. I have received several similar reports recently. I think there must be a compatibility issue between recent DIAMOND versions and HGTector2. I will look into this ASAP and get it fixed. Stay tuned!

XiaomengWang0413 commented 1 year ago

Hi Qiyun, when will it be ready? @qiyunzhu

Xinpeng021001 commented 1 year ago

I also encountered the same error.

neoLIZV commented 1 year ago

@Xinpeng021001 , @XiaomengWang0413 , @taotaoyuan , @qiyunzhu

Hello all, I have encountered exactly the same problem and I have found a solution, which has been posted on my repository: https://github.com/neoLIZV/neoHGT, a modified version of HGTector.

Basically, instead of tracking down the bug in search.py, I ran the search manually and used the pre-computed results (detailed instructions at https://github.com/neoLIZV/neoHGT/#search).

Cheers.

taotaoyuan commented 1 year ago

Hello, I still encountered this error when I used the new version you provided.

Analysis started at 2023-05-04 13:28:50.034604.
Reading local taxonomy database...
Done. Read 52713 taxa.
Reading homology search results...
Traceback (most recent call last):
  File "/home/lx_sky6/yt/soft/neoHGT/scripts/neoHGT", line 88, in <module>
    main()
  File "/home/lx_sky6/yt/soft/neoHGT/scripts/neoHGT", line 27, in main
    module(args)
  File "/home/lx_sky6/software/miniconda3/envs/hgtector/lib/python3.11/site-packages/neoHGT-2.0b3-py3.11.egg/neoHGT/analyze.py", line 143, in __call__
    self.read_input()
  File "/home/lx_sky6/software/miniconda3/envs/hgtector/lib/python3.11/site-packages/neoHGT-2.0b3-py3.11.egg/neoHGT/analyze.py", line 270, in read_input
    self.data[sid] = self.read_search_results(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lx_sky6/software/miniconda3/envs/hgtector/lib/python3.11/site-packages/neoHGT-2.0b3-py3.11.egg/neoHGT/analyze.py", line 316, in read_search_results
    data[-1]['hits'].append(line)
    ~~~~^^^^
IndexError: list index out of range
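The IndexError on `data[-1]` suggests that the parser met a hit line while its list of records was still empty, for example because the result file does not begin with the header the parser expects. The sketch below illustrates that failure mode only; it is not neoHGT's actual code, and the function name, the record layout and the `# ID:` header prefix are assumptions made for illustration.

```python
def read_records(lines, header_prefix="# ID:"):
    """Group lines into records; each record starts with a header line.

    The "# ID:" prefix is an assumption for illustration only.
    """
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith(header_prefix):
            # A header line opens a new record.
            records.append({"id": line[len(header_prefix):].strip(), "hits": []})
        elif line.startswith("#"):
            # Other comment lines are ignored in this sketch.
            continue
        elif not records:
            # This is the situation in which `records[-1]` would raise
            # IndexError: a data line arrived before any record was opened.
            raise ValueError(f"hit line found before any header: {line!r}")
        else:
            records[-1]["hits"].append(line)
    return records


if __name__ == "__main__":
    good = ["# ID: prot1", "hitA\t100", "hitB\t90"]
    print(read_records(good))
    try:
        read_records(["hitA\t100"])
    except ValueError as err:
        print(err)
```

Checking the first lines of the pre-computed result file against the format the tool expects may therefore be a reasonable first step.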

------------------ Original message ------------------
From: "qiyunlab/HGTector"; Sent: Friday, April 28, 2023, 7:06 AM; Subject: Re: [qiyunlab/HGTector] ValueError: diamond failed with error code 1. (Issue #114)

Known limitation: neoHGT can only read one input file at a time; please don't input a list of files.

This is because I encountered a bug where a KeyError was raised on "self.pcmap[sid]" in:
res = self.search_wf(batch, self.pcmap[sid] if self.method == 'precomp' else None)
so I changed the code to:
res = self.search_wf(batch, list(self.pcmap.values())[0] if self.method == 'precomp' else None)
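To see why this change leads to the single-file limitation, here is a minimal sketch with a plain dictionary. The dictionary contents and the two helper functions are hypothetical stand-ins; only the two lookup expressions come from the comment above.

```python
# Hypothetical stand-in for the mapping described above:
# sample ID -> pre-computed search result file.
pcmap = {"sample_A": "sample_A.tsv"}


def lookup_strict(pcmap, sid):
    """Original expression: raises KeyError if sid is not a key."""
    return pcmap[sid]


def lookup_first(pcmap):
    """Workaround expression: ignore the sample ID, take the first value."""
    return list(pcmap.values())[0]


if __name__ == "__main__":
    try:
        lookup_strict(pcmap, "sample_B")
    except KeyError as err:
        print("KeyError:", err)
    # Always returns the first entry, whatever sample is being processed,
    # which is why only one input file can be handled correctly.
    print(lookup_first(pcmap))
```

With more than one entry in the mapping, the workaround would hand the same first file to every sample.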

Xinpeng021001 commented 1 year ago

Hi,

I successfully ran neoHGT with pre-computed data. Maybe you could try uninstalling HGTector and re-downloading neoHGT.

taotaoyuan commented 1 year ago

Hello, I tried several times but could not connect to NCBI. What should I do?

ValueError: 😌 No worries. This error ("bacteria" is not a valid RefSeq genome category) occurs either when you typed a category with a wrong name (for example virus (invalid) instead of viral) or when the NCBI server has temporarily shuted-down the connection from you (in which case a cup of coffee ☕️ will help)

$ rsync --list-only --no-motd rsync://ftp.ncbi.nlm.nih.gov/genomes/refseq/
drwxr-sr-x 4096 2023/05/01 00:04:49 .
lrwxrwxrwx 16 2023/05/04 00:15:47 README.txt
lrwxrwxrwx 50 2023/05/04 00:15:59 assembly_summary_refseq.txt
lrwxrwxrwx 61 2023/05/04 00:15:59 assembly_summary_refseq_historical.txt
lrwxrwxrwx 37 2023/05/04 00:15:59 mitochondrion
lrwxrwxrwx 31 2023/05/04 00:15:59 plasmid
lrwxrwxrwx 31 2023/05/04 00:15:59 plastid
drwxr-sr-x 90112 2023/05/04 10:53:01 archaea
drwxr-sr-x 3612672 2023/05/04 05:05:11 bacteria
drwxr-sr-x 40960 2023/05/03 12:38:16 fungi
drwxr-sr-x 28672 2023/05/04 00:15:53 invertebrate
drwxr-sr-x 4096 2023/02/07 13:44:34 metagenomes
drwxr-sr-x 16384 2023/05/04 00:15:50 plant
drwxr-sr-x 8192 2023/05/03 12:37:13 protozoa
drwxr-sr-x 4096 2022/09/27 12:25:04 unknown
drwxr-sr-x 20480 2023/05/03 12:37:20 vertebrate_mammalian
drwxr-sr-x 28672 2023/05/03 12:37:13 vertebrate_other
drwxr-sr-x 1044480 2023/05/04 13:00:54 viral
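The listing above can be used to tell the two causes named in the error message apart: if the category appears as a directory, the name is valid and the connection is the likelier culprit. A minimal sketch that extracts directory names from `rsync --list-only` output; `list_categories` is a hypothetical helper, and the sample text is a shortened copy of the listing in this comment.

```python
def list_categories(listing):
    """Return directory names from `rsync --list-only` output."""
    names = []
    for line in listing.splitlines():
        fields = line.split()
        # Expected fields: permissions, size, date, time, name.
        if len(fields) < 5 or not fields[0].startswith("d"):
            continue
        name = fields[4]
        if name != ".":
            names.append(name)
    return names


SAMPLE = """\
drwxr-sr-x 4096 2023/05/01 00:04:49 .
lrwxrwxrwx 16 2023/05/04 00:15:47 README.txt
drwxr-sr-x 90112 2023/05/04 10:53:01 archaea
drwxr-sr-x 3612672 2023/05/04 05:05:11 bacteria
drwxr-sr-x 1044480 2023/05/04 13:00:54 viral
"""

if __name__ == "__main__":
    categories = list_categories(SAMPLE)
    print("bacteria" in categories)
```

Here `bacteria` is listed as a directory, so the category name itself was not mistyped.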

Xinpeng021001 commented 1 year ago

Hi Yuan, I believe the connection to NCBI is the cause. I recommend that you download the database directly from HGTector (Ver 23.1). It's not the newest, but it should be fine.

taotaoyuan commented 1 year ago

Hello, I downloaded all the faa.gz files, but I didn't end up with the db.faa file (a single multi-FASTA file containing all protein sequences). How can I use all the faa.gz files in the local faa folder to generate db.faa and then build the database with DIAMOND?

Thank you for your reply.

Xinpeng021001 commented 1 year ago

You could use "cat" to concatenate all the faa files after unzipping them: "cat *.faa > db.faa". Then you could use diamond makedb to build the DIAMOND database.

diamond makedb --in db.faa -d db

taotaoyuan commented 1 year ago

There are too many files (over 200,000) for cat to handle.

-bash: /usr/bin/cat: Argument list too long
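The "Argument list too long" error comes from the shell expanding the glob into one very long command line. A minimal sketch that avoids this by iterating over the directory and streaming each compressed file into the output, so no unzipping step or glob is needed. The function name and the `.faa.gz` suffix default are assumptions for illustration, and the sketch assumes each file ends with a newline.

```python
import gzip
import os
import shutil


def concat_faa_gz(src_dir, out_path, suffix=".faa.gz"):
    """Append the decompressed content of every matching file to out_path."""
    count = 0
    with open(out_path, "wb") as out:
        # os.scandir yields entries one at a time, so no long argument
        # list is ever built, unlike the shell glob in `cat *.faa`.
        with os.scandir(src_dir) as entries:
            for entry in entries:
                if entry.is_file() and entry.name.endswith(suffix):
                    with gzip.open(entry.path, "rb") as fh:
                        shutil.copyfileobj(fh, out)
                    count += 1
    return count
```

For example, `concat_faa_gz("faa", "db.faa")` would write one multi-FASTA file from the folder `faa` (a placeholder path), which could then be passed to diamond makedb.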

Xinpeng021001 commented 1 year ago

You could split them into several folders and process each one separately. For example:

for i in $(seq 1 100); do mkdir ../$i; mv `ls . | head -1000` ../$i; cat ../$i/* > ../$i/$i.db.faa; done

Then, from the parent directory, concatenate all the *.db.faa files:

cat `find . -name "*.db.faa"` > db.faa
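The same batching idea can be sketched without relying on shell expansion at all. This is only an illustration under stated assumptions: `split_into_batches` is a hypothetical helper, the batch size of 1000 mirrors the `head -1000` above, and the destination folder is assumed to lie outside the source folder.

```python
import os
import shutil


def split_into_batches(src_dir, dest_root, batch_size=1000):
    """Move files from src_dir into dest_root/1, dest_root/2, ... in batches."""
    # Collect the file names first so that moving files does not
    # interfere with the directory scan.
    with os.scandir(src_dir) as entries:
        names = sorted(e.name for e in entries if e.is_file())
    n_batches = 0
    for start in range(0, len(names), batch_size):
        n_batches += 1
        batch_dir = os.path.join(dest_root, str(n_batches))
        os.makedirs(batch_dir, exist_ok=True)
        for name in names[start:start + batch_size]:
            shutil.move(os.path.join(src_dir, name),
                        os.path.join(batch_dir, name))
    return n_batches
```

Each numbered folder then holds at most `batch_size` files, few enough for a per-folder `cat` to stay under the argument-length limit.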

taotaoyuan commented 1 year ago

Even after dividing them into 300 folders, it still does not work. Is there any other way?

for i in $(seq 1 300);  do ls . | head -1000 | mv * ../$i;done

-bash: /usr/bin/mv: Argument list too long

Xinpeng021001 commented 1 year ago

You could try more folders with fewer files in each, then repeat those steps.

taotaoyuan commented 1 year ago

I would like to know why the db.faa file is not produced by this step:

nohup neoHGT database -c bacteria,archaea,fungi,viral -o neoHGT_database &

------------------ Original message ------------------
From: "qiyunlab/HGTector"
Sent: Tuesday, May 16, 2023, 12:21 PM
Subject: Re: [qiyunlab/HGTector] ValueError: diamond failed with error code 1. (Issue #114)

Splitting the files into 300 folders still does not work. Is there any other way?

for i in $(seq 1 300); do ls . | head -1000 | mv ../$i; done
…
-bash: /usr/bin/mv: Argument list too long

------------------ Original message ------------------
Sent: Tuesday, May 16, 2023, 11:44 AM

There are too many files, over 200,000, for cat to handle.

…
-bash: /usr/bin/cat: Argument list too long

------------------ Original message ------------------
Sent: Tuesday, May 16, 2023, 11:29 AM

Hello, I downloaded all the faa.gz files, but I did not end up with the db.faa file (a single multi-FASTA file containing all protein sequences). How can I use all the faa.gz files in the local faa folder to generate db.faa and then build the database with DIAMOND? Thank you for your reply.

…

------------------ Original message ------------------
Sent: Thursday, May 4, 2023, 10:05 PM

Hello, I tried several times but could not connect to NCBI. What should I do?

ValueError: <U+1F60C> No worries. This error ("bacteria" is not a valid RefSeq genome category) occurs either when you typed a category with a wrong name (for example virus (invalid) instead of viral) or when the NCBI server has temporarily shuted-down the connection from you (in which case a cup of coffee ☕️ will help)

$ rsync --list-only --no-motd rsync://ftp.ncbi.nlm.nih.gov/genomes/refseq/
drwxr-sr-x 4096 2023/05/01 00:04:49 .
lrwxrwxrwx 16 2023/05/04 00:15:47 README.txt
lrwxrwxrwx 50 2023/05/04 00:15:59 assembly_summary_refseq.txt
lrwxrwxrwx 61 2023/05/04 00:15:59 assembly_summary_refseq_historical.txt
lrwxrwxrwx 37 2023/05/04 00:15:59 mitochondrion
lrwxrwxrwx 31 2023/05/04 00:15:59 plasmid
lrwxrwxrwx 31 2023/05/04 00:15:59 plastid
drwxr-sr-x 90112 2023/05/04 10:53:01 archaea
drwxr-sr-x 3612672 2023/05/04 05:05:11 bacteria
drwxr-sr-x 40960 2023/05/03 12:38:16 fungi
drwxr-sr-x 28672 2023/05/04 00:15:53 invertebrate
drwxr-sr-x 4096 2023/02/07 13:44:34 metagenomes
drwxr-sr-x 16384 2023/05/04 00:15:50 plant
drwxr-sr-x 8192 2023/05/03 12:37:13 protozoa
drwxr-sr-x 4096 2022/09/27 12:25:04 unknown
drwxr-sr-x 20480 2023/05/03 12:37:20 vertebrate_mammalian
drwxr-sr-x 28672 2023/05/03 12:37:13 vertebrate_other
drwxr-sr-x 1044480 2023/05/04 13:00:54 viral

…

------------------ Original message ------------------
Sent: Thursday, May 4, 2023, 1:57 PM

Hi, I successfully ran neoHGT with pre-computed data. Maybe you could try uninstalling HGTector and re-downloading neoHGT.

Hi Yuan, I believe the network connection to NCBI causes it. I recommend that you directly download the database from HGTector (Ver 23.1). It is not the newest but should be fine.

You could use cat to concatenate all the faa files after unzipping them: cat *.faa > db.faa. Later you could use diamond makedb to build the DIAMOND database: diamond makedb --in db.faa -d nr db.

You could split them into several folders and process each folder separately. For example:

for i in $(seq 1 100)
do
  mkdir ../$i
  mv `ls . | head -1000` ../$i
  cat ../$i/* > ../$i/$i.db.faa
done

And then concatenate all the .db.faa files:

cat `find . -name '*.db.faa'` > db.faa

You could try more folders with fewer files in each folder, then repeat those steps.

taotaoyuan commented 1 year ago

Thank you, I solved this problem using the following code:

$ ls faa/ > list
$ for id in $(cat list); do gunzip ./faa/$id; done
$ find /home/lx_sky6/software/HGTector/neoHGT_database/download/faa -name '*.faa' | xargs cat > db.faa
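For readers who prefer to avoid the shell's argument-list limit altogether, the same concatenation can be sketched in Python. This is only an illustration, not part of HGTector or neoHGT: the function name `concat_faa` and its arguments are hypothetical, and it assumes the downloaded files end in `.faa` or `.faa.gz`. Files are listed with `os.scandir` and streamed one at a time, so the number of files does not matter and no separate gunzip pass is needed.

```python
import gzip
import os


def concat_faa(faa_dir, out_path):
    """Concatenate every .faa / .faa.gz file in faa_dir into out_path.

    Returns the number of input files written. The directory is listed
    with os.scandir instead of a shell glob, so no long argument list
    is ever built.
    """
    out_abs = os.path.abspath(out_path)
    count = 0
    with open(out_path, "wb") as out:
        # Sort names so that the output order is reproducible.
        for name in sorted(e.name for e in os.scandir(faa_dir) if e.is_file()):
            path = os.path.join(faa_dir, name)
            if os.path.abspath(path) == out_abs:
                continue  # never read the output file back into itself
            if name.endswith(".faa.gz"):
                opener = gzip.open
            elif name.endswith(".faa"):
                opener = open
            else:
                continue
            last = b"\n"
            with opener(path, "rb") as src:
                # Stream in 1 MiB chunks to keep memory use flat.
                for chunk in iter(lambda: src.read(1 << 20), b""):
                    out.write(chunk)
                    last = chunk[-1:]
            if last != b"\n":
                out.write(b"\n")  # keep the next FASTA header on its own line
            count += 1
    return count
```

A call such as `concat_faa("faa", "db.faa")` would then produce the multi-FASTA file that `diamond makedb --in db.faa` expects.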

qiyunzhu commented 1 year ago

Hello all @XiaomengWang0413 @taotaoyuan @Xinpeng021001 @neoLIZV Sorry for the long wait. I finally had a chance to fix this bug (#119). It was a DIAMOND compatibility issue: newer versions of DIAMOND disabled a parameter that HGTector relied on. It is now fixed and should work smoothly. Please update HGTector with:

pip install --force-reinstall --no-cache-dir git+https://github.com/qiyunlab/HGTector.git

Please let me know if it works for you!
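Because the failure depends on which DIAMOND is installed, it can help to record the DIAMOND version next to the HGTector version when reporting results. The sketch below is only an illustration: the helper names are hypothetical, it assumes that `diamond version` prints a line such as `diamond version 2.1.8`, and the thread does not say which DIAMOND release introduced the change.

```python
import re
import shutil
import subprocess


def parse_diamond_version(text):
    """Extract (major, minor, patch) from text like 'diamond version 2.1.8'."""
    m = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    return tuple(int(x) for x in m.groups()) if m else None


def installed_diamond_version(exe="diamond"):
    """Return the version tuple of the DIAMOND found on PATH, or None."""
    if shutil.which(exe) is None:
        return None  # executable not installed or not on PATH
    res = subprocess.run([exe, "version"], capture_output=True, text=True)
    return parse_diamond_version(res.stdout + res.stderr)


if __name__ == "__main__":
    print("DIAMOND version:", installed_diamond_version())
```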

taotaoyuan commented 1 year ago

hello;

taotaoyuan commented 1 year ago

Hello,

I ran the software successfully, but got this result:  WARNING: No hit is assigned to distal group. Cannot predict HGTs.

Is something wrong with my run, or does my genome simply contain no transferred genes?

Thanks.

qiyunzhu commented 1 year ago

@taotaoyuan This is probably because the taxonomic groups automatically assigned by the program are not the best for your case. You will need to modify two parameters: --self-tax and --close-tax of the command hgtector analyze to get reasonable results.

Since this is a separate question, let me close the original issue.

taotaoyuan commented 1 year ago

Hello,

When I ran analyze, I got this error:

neoHGT analyze -i search_out/GCF_000390285.2_Agla_2.0_protein.tsv -t /home/lx_sky6/software/HGTector/neoHGT_database/download/taxdump --self-tax 1871911 --close-tax 242839  -o analyze_out

  File "/home/lx_sky6/software/miniconda3/envs/neoHGT/lib/python3.11/site-packages/neoHGT-2.0b3-py3.11.egg/neoHGT/util.py", line 429, in _get_taxon

    raise ValueError(f'TaxID {tid} is not found in taxonomy database.')

ValueError: TaxID 1871911 is not found in taxonomy database.

Here is the result of my query on NCBI:

Rhodiola juparensis Taxonomy ID: 1871911 (for references in articles please use NCBI:txid1871911)
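A TaxID can be valid at NCBI and still be missing from the local taxdump passed with -t. One possible cause, which is an assumption and not confirmed in this thread, is that the local taxdump only covers the taxa in a database built from bacteria, archaea, fungi and viral genomes, whereas Rhodiola juparensis is a plant. A minimal sketch for checking this before running analyze is shown below. The helper names `read_nodes` and `lineage` are hypothetical, and the code assumes the standard NCBI `nodes.dmp` layout (TaxID, parent TaxID and rank as the first three `|`-separated fields).

```python
import os


def read_nodes(taxdump_dir):
    """Map each TaxID in nodes.dmp to a (parent TaxID, rank) tuple."""
    nodes = {}
    with open(os.path.join(taxdump_dir, "nodes.dmp")) as f:
        for line in f:
            # Fields are separated by "\t|\t"; splitting on "|" and
            # stripping whitespace handles the trailing "\t|" as well.
            fields = [x.strip() for x in line.split("|")]
            if len(fields) >= 3 and fields[0]:
                nodes[fields[0]] = (fields[1], fields[2])
    return nodes


def lineage(tid, nodes):
    """Return [(taxid, rank), ...] from tid up to the root.

    Returns None when tid is absent from the local taxonomy.
    """
    tid = str(tid)
    if tid not in nodes:
        return None
    path, seen = [], set()
    while tid in nodes and tid not in seen:
        seen.add(tid)  # guards against cycles in a malformed file
        parent, rank = nodes[tid]
        path.append((tid, rank))
        if parent == tid:
            break  # the root node is its own parent
        tid = parent
    return path
```

If `lineage(1871911, read_nodes(taxdump_dir))` returns None, the TaxID is absent from that local taxdump. For a TaxID that is present, the returned ancestors and their ranks can help in choosing values for --self-tax and --close-tax.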
