abshah closed this issue 7 years ago
@abshah Could you please try the latest code with those FASTA data? Would it be possible to send me the data for debugging?
Hi @cchd0001
I just recompiled SOAPdenovo2 with the latest version from GitHub. I believe the issue is with loading multi-line FASTA files. The SRA-toolkit usually writes FASTA reads in multi-line format. You can download the data files using fastq-dump (from the SRA-toolkit):
fastq-dump --split-spot --split-files --clip --fasta SRR764591
If you convert the same file to 2-line FASTA format, the issue appears to go away.
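The conversion to 2-line FASTA mentioned above can be done with any tool that joins wrapped sequence lines. Below is a minimal Python sketch, assuming plain uncompressed FASTA input. The function name `unwrap_fasta` is hypothetical and is not part of SOAPdenovo2 or the SRA-toolkit.

```python
def unwrap_fasta(src, dst):
    """Copy FASTA records from src to dst with each sequence on one line.

    src and dst are text-mode file objects. Assumption: every record
    starts with a '>' header line; anything before the first header
    is ignored.
    """
    header = None
    seq_parts = []
    for raw in src:
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                dst.write(header + "\n" + "".join(seq_parts) + "\n")
            header = line
            seq_parts = []
        elif line and header is not None:
            # Wrapped sequence line: collect it for joining.
            seq_parts.append(line)
    # Flush the last record.
    if header is not None:
        dst.write(header + "\n" + "".join(seq_parts) + "\n")
```

It could be used as `unwrap_fasta(open("reads_1.fasta"), open("reads_1.2line.fasta", "w"))`, where both file names are placeholders. As far as I know, the `--fasta` option of fastq-dump also accepts an optional line width, with 0 meaning no wrapping, which may avoid the conversion step; please check `fastq-dump --help` for your version.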
Best, Abhijeet
Hi @abshah
Sorry for the delay.
Something is wrong with my network and I can't download your data. However, I made simulated multi-line FASTA data and triggered the bug.
It is true that SOAPdenovo2 doesn't support multi-line FASTA files. It assumes that a FASTA file has exactly 2 lines per read and a FASTQ file has exactly 4 lines per read.
However, instead of entering an infinite loop, the latest code (which you have already tested) detects such input and exits with a warning that looks like this:
Import reads from file:
MP4000_shuffled_origin.fa
readseqInLib return error! please make sure input file is correct fastq/fasta file
invalid data left in buffer:
CACCAATTCTAAGCATTAAGCTTtttctttattttctttctttcttttccttctttc
tttctctctctctctctttctttctttctctctctctctttctttctctttcttcct
tccttccttccttccttccttccttctctccttaattgtgggaaaatataaataaac
taaaactcatcattttcacctt
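Input can also be checked against the strict 2-line assumption before starting an assembly. The following is only a rough Python sketch, not the check SOAPdenovo2 itself performs. The function name `is_two_line_fasta` is hypothetical, and skipping blank lines is my own assumption.

```python
def is_two_line_fasta(lines):
    """Return True if every record is one '>' header plus one sequence line.

    lines is any iterable of strings, for example an open text file.
    Assumption: blank lines are ignored rather than treated as errors.
    """
    expect_header = True
    saw_record = False
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if expect_header:
            if not line.startswith(">"):
                # A second sequence line (wrapped read) or stray data.
                return False
            expect_header = False
        else:
            if line.startswith(">"):
                # Header with no sequence line after it.
                return False
            expect_header = True
            saw_record = True
    # The file must contain at least one record and must not end on a header.
    return saw_record and expect_header
```

A file like the wrapped read shown in the warning above would make this return False, because a second sequence line appears where a header is expected.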
There is no plan to support multi-line FASTA files. Hope this helps.
Best wishes, Lidong Guo
Hi @cchd0001 Thanks for adding this in.
Best Wishes, Abhijeet
Hi SOAPdenovo2 devs, I have just noticed a strange issue. Whenever I input FASTA files (using the f1,f2 flags in the configuration file), the program just goes into a loop of trying to process the reads. I am using version 2.40 (installed from bioconda) and my command was:
SOAPdenovo-63mer all -s /vol/assembly/config_Lmig_PE90_20K_insert.txt -K 43 -R -p 16 -o PE90_20K_ins 1> assembly_PE90.log 2> assembly_PE90.error
and my log file looks like this:........
However, when I switch to FASTQ input files, the problem disappears.
Best Wishes, Abhijeet