Closed novikk closed 5 years ago
Hey,
Segmentation faults seem to be pretty dataset dependent. I've never encountered one (that I did not fix) in all the experiments I've run (incl. human) so far.
A few questions so I can help you further:
1) Does CONSENT still correct a few long reads and then crash, or does it fail to perform any correction at all?
2) Have you tried CONSENT on any other dataset? Any small dataset (at least 10x coverage) from any bacterial genome would do. That would just tell us whether the error appears regardless of what you attempt to correct.
3) Is your data public? I found this link (https://trace.ddbj.nig.ac.jp/DRASearch/run?acc=SRR6058582) by Googling the accession ID of your LR file, but there is no FASTA file available for download. If it is public, and if you can provide me with a link to download it, I'd be glad to run CONSENT on your dataset and spot the segfault.
Cheers, Pierre
Yesterday, I encountered this problem, too. My data are full-length 16S nanopore sequencing reads. Since I already had a PAF file, I ran the CONSENT command directly:
$ CONSENT -a Alignments_32234.paf -s 4 -S 1000 -l 500 -k 9 -c 8 -A 2 -f 4 -m 50 -j 20 -r input.fa -M 150 >> output.fa
Segmentation fault (core dumped)
Hi,
I have the same 3 questions as just above, please.
Just knowing that there's a segfault somewhere doesn't help me so much if I can't reproduce it to see where it comes from.
Pierre
@morispi
Hope this helps!
@novikk
Great! Thanks for the answers and for the link to the dataset.
CONSENT was indeed designed for DNA reads, but I don't see any reason for it to crash on RNA if you switch U to T? Actually I tried it myself this morning on a tiny dataset containing Us, and all went well.
Downloading the data and investigating the issue later tonight. I'll keep you updated.
Cheers P
@novikk Just checked your data. The problem comes from the fact that your long read headers contain spaces. Changing the spaces to underscores does the trick for me.
@godkin1211 maybe that's the same thing for you? If the original reads file contains spaces, that'd explain the problem. As you already have the PAF file however, you should just plain trim everything that follows the first white space (sed 's/ .*//g') so that CONSENT can work.
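For anyone who prefers a script to the sed one-liner, here is a minimal Python sketch of the same idea. It cuts each FASTA header at its first whitespace and leaves sequence lines untouched. The function name `trim_fasta_headers` and the example reads are made up for illustration; this is not part of CONSENT.

```python
def trim_fasta_headers(lines):
    """Yield FASTA lines with each header cut at its first whitespace.

    Hypothetical helper for illustration, not part of CONSENT.
    """
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Keep only the read ID: everything before the first space or tab.
            parts = line[1:].split(None, 1)
            line = ">" + (parts[0] if parts else "")
        yield line


# Made-up reads, only to show the effect on headers.
example = [">read_1 runid=abc ch=12", "ACGT", ">read_2", "TTGA"]
trimmed = list(trim_fasta_headers(example))
for out in trimmed:
    print(out)
```

Unlike the sed command, this only touches lines starting with `>`, which should make no difference for a well-formed FASTA file since sequence lines do not contain spaces.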
@novikk However, it seems like your reads have a mean length of 152bp? You should check the CONSENT parameters and adapt them, as they are meant for much longer reads, which are divided into 500bp windows. Windows longer than the actual reads might cause further issues :p
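A quick way to check this before launching a run is to compare the mean read length with the window size you plan to use. The sketch below is only a rough sanity check, assuming a plain FASTA input; `read_lengths` and `window_fits` are hypothetical helper names, and 500 is simply the window length mentioned above.

```python
def read_lengths(lines):
    """Return the length of every sequence in FASTA-formatted lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def window_fits(lengths, window_size):
    """True if the mean read length is at least the window size."""
    if not lengths:
        return False
    return sum(lengths) / len(lengths) >= window_size


# Made-up reads: two records of 8 and 2 bases.
lengths = read_lengths([">a", "ACGT", "ACGT", ">b", "AC"])
mean_length = sum(lengths) / len(lengths)
print(lengths, mean_length, window_fits(lengths, 500))
```

If the check fails, the window size should probably be lowered to fit the reads.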
Cheers P
Thanks, @morispi! It works after replacing the spaces in the headers.
@godkin1211 Great!
@novikk
Did you manage to run CONSENT in the end? Did you also check my comment about the parameters above?
Waiting on your answer to close the issue. :)
Cheers Pierre
Hi @morispi, will check it ASAP, probably tomorrow!
Worked fine after renaming the headers of the FASTA and tuning the "windowSize" parameter.
Thanks!
I'm trying to correct a dataset of real ONT reads and I'm getting a segmentation fault error after the mapping with minimap2:
I've tried on two different datasets and I'm getting the same error.
OS info: