mtisza1 / Cenote-Taker2

Cenote-Taker2: Discover and Annotate Divergent Viral Contigs (Please use Cenote-Taker 3 instead)
MIT License

Error: Argument list too long when using latest Cenote-Taker2.1.5 #37

Open hazel914 opened 1 year ago

hazel914 commented 1 year ago

Hi Mike,

Thank you for Cenote-Taker2! However, I am facing a problem when I run this command:

python ~/miniconda3/bin/Cenote-Taker2/run_cenote-taker2.py -c ~/hxj/depth_analysis/raw_data_2/contig_megahit/SRR9161502_9000000/final.contigs.fa -r SRR9161502_9000000 -p true -m 32 -t 32 2>&1 | tee output.log

The tail of the output.log file looks like this:

SRR9161502_900000074949.fasta has DTRs/circularity
SRR9161502_9000000541967.fasta has DTRs/circularity
SRR9161502_9000000137044.fasta has DTRs/circularity
no reads provided or reads not found
Circular fasta file(s) detected
Putting non-circular contigs in a separate directory
time update: running IRF for ITRs in non-circular contigs 10-12-22---00:21:13
/media/home/user11/miniconda3/bin/Cenote-Taker2/cenote-taker2.1.5.sh: line 464: /usr/bin/find: Argument list too long
time update: running prodigal on linear contigs 10-12-22---01:20:08

Could you help me solve this problem? Is it because my input file has too many sequences?

Regards! Hazel

Stevenleizheng commented 1 year ago

I have the same problem. I think it is caused by the large size of our input FASTA files. Mine is approximately 250 MB.

mtisza1 commented 1 year ago

Hi Hazel and Steven,

I agree that this issue is caused by the find command on line 464 of cenote-taker2.1.5 crashing because there are too many files in the folder. This has been a problem in the past and is a bit tricky for me to code around because some computers/nodes can handle a lot more files than others. I hope it's OK for you to split your input fasta and rerun. Here is a quick way to split the input fasta into chunks of up to 5000 contigs:

conda activate cenote-taker2_env
seqkit split input_contigs.fasta -s 5000

https://bioinf.shenwei.me/seqkit/usage/#split
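For anyone who wants to see what the chunking step does, or who cannot use seqkit, here is a minimal sketch of the same idea in plain shell and awk. It is a stand-in for illustration, not what seqkit does internally. The function name `split_fasta` and the `chunk_0001.fasta` naming are made up for this example, and it assumes every record starts with a `>` header line.

```shell
# Hypothetical helper: split a FASTA file into chunks of at most N records.
# Usage: split_fasta INPUT.fasta RECORDS_PER_CHUNK OUTPUT_DIR
split_fasta() {
  mkdir -p "$3"
  awk -v n="$2" -v dir="$3" '
    /^>/ {
      # Start a new chunk file every n header lines.
      if (count % n == 0) {
        if (out != "") close(out)
        out = sprintf("%s/chunk_%04d.fasta", dir, ++part)
      }
      count++
    }
    # Write header and sequence lines to the current chunk;
    # anything before the first header is skipped.
    out != "" { print > out }
  ' "$1"
}

# Example call (file and directory names are placeholders):
# split_fasta input_contigs.fasta 5000 input_contigs_chunks
```

Each chunk can then be passed to run_cenote-taker2.py with -c. Giving each run its own -r run title is probably the safest way to keep the outputs apart.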

I hope this works for you!
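Some background on why the limit differs between machines: "Argument list too long" is raised when the combined size of a command's arguments and environment exceeds the system's ARG_MAX limit, which varies by system. The sketch below assumes the failing call hands find a shell-expanded wildcard (one argument per file). The actual line 464 is not shown in this thread, so treat that as a guess. The directory and file names are made up for the demonstration.

```shell
# Upper bound, in bytes, on arguments plus environment for a single command.
getconf ARG_MAX

# Build a throwaway directory with a few hundred empty files.
demo_dir=$(mktemp -d)
i=1
while [ "$i" -le 300 ]; do
  : > "$demo_dir/contig_$i.fasta"
  i=$((i + 1))
done

# Fragile form: the shell expands the wildcard before find starts, so
# find receives one argument per file and can exceed ARG_MAX:
#   find "$demo_dir"/*.fasta -type f
#
# More robust form: quote the pattern so find does the matching itself
# and only ever receives a handful of arguments.
n_files=$(find "$demo_dir" -maxdepth 1 -type f -name '*.fasta' | wc -l)
echo "matched files: $n_files"

rm -rf "$demo_dir"
```

With only 300 files both forms work. The fragile form starts failing once the expanded file list outgrows ARG_MAX.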

Mike