jonassibbesen / rpvg

Method for inferring path posterior probabilities and abundances from pangenome graph read alignments
MIT License

Crash: Assertion `best_align_score <= optimal_score' failed #34

Closed (lsoldini closed 2 years ago)

lsoldini commented 2 years ago

1. What were you trying to do?

Infer expression from .gamp file from vg mpmap.

I have a bunch of .fastq files (replicates and different treatments), and I ran them through vg mpmap as a job array. The process seemed to work properly (exit code 0).

I then ran some tests with rpvg using:

bin/rpvg -g splicepangenome.xg -p pantranscriptome.gbwt -f pantranscriptome.txt.gz -a xxx.gamp -o rpvg --inference-model haplotype-transcripts

With xxx being different samples.
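For running the same test over several samples, a small helper along these lines could build one command per sample. It is only a sketch: the flags are copied from the command above, while the function name `build_rpvg_command`, the sample names, and the per-sample output prefix (`rpvg_<sample>`, used here so that runs do not write to the same prefix) are assumptions for illustration.

```python
from typing import List


def build_rpvg_command(sample: str, rpvg_bin: str = "bin/rpvg") -> List[str]:
    """Build the rpvg argument list for one sample.

    Flags mirror the command quoted in this issue; the per-sample
    output prefix is a hypothetical naming choice.
    """
    return [
        rpvg_bin,
        "-g", "splicepangenome.xg",
        "-p", "pantranscriptome.gbwt",
        "-f", "pantranscriptome.txt.gz",
        "-a", f"{sample}.gamp",
        "-o", f"rpvg_{sample}",  # assumption: one output prefix per sample
        "--inference-model", "haplotype-transcripts",
    ]


# Hypothetical sample names; print the commands instead of running them.
for sample in ["sampleA", "sampleB"]:
    print(" ".join(build_rpvg_command(sample)))
```

The resulting list can be passed to `subprocess.run` to execute each sample and record its return code.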

2. What actually happened?

Strangely, it worked for all but one sample. I only tested 4 in total, because I wanted to try things out before running on the whole data set. I have already re-run vg mpmap on that sample, but still got the same error.

Here is the error message I get:

~/work/rpvg/bin/rpvg -g splicepangenome.xg -p pantranscriptome.gbwt -f pantranscriptome.txt.gz -a OvMM1_L6.gamp -o rpvg --inference-model haplotype-transcripts
Running rpvg (commit: 0380bdb172bbe255a18ed070935fa0013dc02548)
Random number generator seed: 1653473836
Fragment length distribution parameters found in alignment (mean: 321.48, standard deviation: 137.106)
Loaded graph, GBWT and r-index (0.217732 seconds, 0.361549 GB)
rpvg: /users/lsoldini/work/rpvg/src/alignment_path_finder.cpp:472: std::vector<AlignmentSearchPath> AlignmentPathFinder<AlignmentType>::extendAlignmentSearchPath(const AlignmentSearchPath&, const vg::MultipathAlignment&) const [with AlignmentType = vg::MultipathAlignment]: Assertion `best_align_score <= optimal_score' failed.
Aborted (core dumped)

lsoldini commented 2 years ago

Hi again rpvg-team,

As a follow-up, I also have some related questions that are not linked to the crash:

My apologies for asking so many questions at once, and thank you for taking the time to read them.

Best, Luca

lsoldini commented 2 years ago

I have now done several tests with rpvg, but it keeps throwing the same error:

Assertion `best_align_score <= optimal_score' failed

The exit code is 134.
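An exit code of 134 is consistent with the "Aborted (core dumped)" message: shells conventionally report 128 plus the signal number when a process is killed by a signal, and a failed assertion aborts the process with SIGABRT (signal 6 on Linux). A minimal sketch for decoding such codes, where the helper name `describe_exit_code` is an assumption for illustration:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe a shell-reported exit status.

    By convention, a status above 128 means the process was terminated
    by signal number (code - 128).
    """
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not known on this platform.
            name = f"signal {signum}"
        return f"terminated by {name}"
    return f"exited with status {code}"


print(describe_exit_code(134))
```

This points at the assertion itself rather than at a resource limit such as memory.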

It's weird because it works on the example data.

In particular, I have three .gamp files for each biological replicate (each file is one technical replicate, i.e. one sequencing lane). Individually, some technical replicates do not throw an error, but as soon as I merge the technical replicates, the same error is thrown for all samples.

I tried doing the cat step both before and after vg mpmap, but it made no difference: I got the same error.

It does not seem to be a memory issue either. Each merged .gamp file is about 7 GB, and I allocated up to 64 GB of RAM for a single .gamp file, but memory usage stays low anyway.

Do you have any suggestions?

jonassibbesen commented 2 years ago

Hi Luca,

Re the crash. What parameters did you use for mpmap? Did you use the default scores? This error could happen if a different set of scores was used for mpmap than the default. If you used the default then I would probably need to look at the data to find the issue. Would it be possible to share the input data and one of the gamp files that crashed? You can use this email: j.a.sibbesen@gmail.com
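To illustrate why a scoring mismatch could trip this kind of assertion, here is a minimal sketch. It assumes the check compares each alignment score against an upper bound computed from one fixed set of scoring parameters. The formula, the `Scoring` class, and all numeric values are hypothetical and are not taken from the rpvg or vg source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scoring:
    """Hypothetical alignment scoring parameters."""
    match: int
    full_length_bonus: int


def optimal_score(read_length: int, scoring: Scoring) -> int:
    # Assumed upper bound: every base matches, plus a bonus at both read ends.
    return read_length * scoring.match + 2 * scoring.full_length_bonus


def score_is_plausible(align_score: int, read_length: int, assumed: Scoring) -> bool:
    # Mirrors the shape of the failed check: best_align_score <= optimal_score.
    return align_score <= optimal_score(read_length, assumed)


assumed_by_downstream = Scoring(match=1, full_length_bonus=5)  # illustrative
used_by_mapper = Scoring(match=2, full_length_bonus=5)         # illustrative

# A perfectly matching 100 bp read, scored under each scheme.
read_length = 100
score_same = optimal_score(read_length, assumed_by_downstream)
score_other = optimal_score(read_length, used_by_mapper)

print(score_is_plausible(score_same, read_length, assumed_by_downstream))
print(score_is_plausible(score_other, read_length, assumed_by_downstream))
```

Under these assumptions, an alignment scored with a more generous scheme can exceed the bound computed from the assumed scheme, so the check fails even though the alignment itself is fine.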

Re the general questions. Could you create a separate issue (or multiple if you prefer) for those questions, since they are not related to the crash? That would make it easier for other users with similar questions to find the answers. Thanks!

Best,

Jonas

lsoldini commented 2 years ago

Hi Jonas,

That is exactly what you said! I used vg mpmap -e high, but I should actually have used -e low (the reads were trimmed so that Phred > 20, and most are > 30).

I realised this a few hours ago and have re-run vg mpmap and rpvg. The run has not finished yet (about 1/3 done), but it seems to be working fine so far. I'll close this issue and open new ones for the other questions.

Thanks!

Best, Luca