lh3 / psmc

Implementation of the Pairwise Sequentially Markovian Coalescent (PSMC) model
Other
146 stars 60 forks source link

"Illegal division by zero" when executig psmc.plot.pl - sample.psmc file with NaNs #39

Open AliceIob opened 2 years ago

AliceIob commented 2 years ago

Hi! I am trying to use psmc, and it seems to work fine (i.e. without rising errors) up to the plotting part. I always get the same error:

Use of uninitialized value in division (/) at psmc_plot.pl line 119, <> line 1107.
Illegal division by zero at psmc_plot.pl line 119, <> line 1107.

If I look at my sample.psmc file however, it is mostly 0s and NaNs. Any idea on how to fix this? I can't figure if this is a problem of the script or of my data (e.g. low coverage, few heterozygote positions).

Thanks!

plubbe commented 2 years ago

Typically this error is caused when there was a problem generating the file, either because the input is corrupted, or because it can't be found. Unfortunately the psmc command won't crash in these cases, but creates the wonky NaN's/0's in the output file. So double-check your input and your file paths to make sure everything is as it should.

On Thu, Oct 7, 2021, 20:56 AliceIob @.***> wrote:

Hi! I am trying to use psmc, and it seems to work fine (i.e. without rising errors) up to the plotting part. I always get the same error:

Use of uninitialized value in division (/) at psmc_plot.pl line 119, <> line 1107. Illegal division by zero at psmc_plot.pl line 119, <> line 1107.

If I look at my sample.psmc file however, it is mostly 0s and NaNs. Any idea on how to fix this? I can't figure if this is a problem of the script or of my data (e.g. low coverage, few heterozygote positions).

Thanks!

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/lh3/psmc/issues/39, or unsubscribe https://github.com/notifications/unsubscribe-auth/AOMIXZS3SLZMXWKI6RH6YJTUFVHBTANCNFSM5FQUDQZQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

AliceIob commented 2 years ago

Hello! I was not able to fix this issue and I would really appreciate some help here. My paths and files are ok. I do start my analysis from a single-sample vcf file, on which I apply bcftools consensus to extract the consensus sequence. Everything seems to be fine up to this point.
The conversion of my consensus fasta into psmcfa (/psmc/utils/fq2psmcfa) is performed but it seems that the resulting file has no variants (i.e. no Ks in the sequence) - I guess this is what causes the subsequent error. I found more than one script that makes the conversion from fasta to psmcfa, but what it's still not clear to me is the file format and how it should look like:

  1. consensus fasta: bcftools/vcfutils.pl vcf2fq has been substituted by bcftools consensus. I don't understand if the output of bcftools consensus is the same as the output of vcf2fq. So, how should the input for fq2psmcfa look like? Should it be one long sequence for each chromosome or many short sequences for each chromosome?
  2. how should psmcfa file look like? The info I could find says that Ts are for non-variant sites and Ks for variant sites, is that correct? I add one example of my files here. consensus_fasta consensus_psmcfa

Thank you in advance

elizeng commented 2 years ago

Hi @AliceIob

Not sure if you have solved your issue. I am still in the testing phase of running psmc. I initially had the same issue as you where I had 0s and NaNs in the psmc file.

And I realised that the input into psmc had to be minimally processed. The files that worked were reads that were mapped and put through samtools view/sort before used with psmc.

I had initially assumed that further filtering would be better, i.e. putting it through picard and GATK, but that led to what you have, and wondered if you did the same.

Cheers!

Buffan3369 commented 1 year ago

Hi @AliceIob ,

I was faced with a similar issue and was able to fix it without coming back to some reads processing issues!! I might precise was working with the output of psmc with bootstrapping.

So, quite naively, I referred to the indexes indicated in the error message and sought the corresponding lines in my .psmc input. Especially, I realised the first pointed line (which, in my case, also happened to be the 119th) belonged to a set of iterations that apparently didn't converge (with NaNs and zeros, and $\lambda_k$ column fixed to one across the iterations). What I did was therefore to get rid of that set of iterations by finding the line where the set was ending and taking the tail of my initial combined.psmc file starting at the total length of my file minus the line where the set problematic non-converging set of iterations was ending. Afterwards, repeating the plotting command with this combined.headed.psmc file worked.

I hope this explanation is not too messy... all in all, I would recommend you to have a look at the 119th line of your input, see if it refers to a non-converging set of iterations (i.e for all step $i$ and for all time interval $t$, $\lambda{i,t}=1$ (third column) and $\pi{i,k}=nan$ (fourth column)). If yes, try to subset your input file by eliminating this set of iterations and re-run the plotting script.

Hope it might help! All the best, Lucas

YstTatyana commented 1 year ago

Hi! I have the same error when I'm trying to psmc_plot.pl. Illegal division by zero. All files are ok, but I have 0 in the last colomn of .psmc file only in first iteration and one 0 in every RS 0 line . For exaple: RS 0 0.000000 0.543597 8678.992408 0.015763 0.115325 That's the reason why division by zero appears ?

Thanks!

Jiangjiangzhang6 commented 9 months ago

Hi! I have the same error when I'm trying to psmc_plot.pl. Illegal division by zero. All files are ok, but I have 0 in the last colomn of .psmc file only in first iteration and one 0 in every RS 0 line . For exaple: RS 0 0.000000 0.543597 8678.992408 0.015763 0.115325 That's the reason why division by zero appears ?

Thanks!

no, I with the same error for 100 bootstrap, but the main result diploid.psmc, its ok, I check the file, its also have the RS 0 0.000000, so may be the other reason.

Jiangjiangzhang6 commented 9 months ago

Hi! I have the same error when I'm trying to psmc_plot.pl. Illegal division by zero. All files are ok, but I have 0 in the last colomn of .psmc file only in first iteration and one 0 in every RS 0 line . For exaple: RS 0 0.000000 0.543597 8678.992408 0.015763 0.115325 That's the reason why division by zero appears ? Thanks!

no, I with the same error for 100 bootstrap, but the main result diploid.psmc, its ok, I check the file, its also have the RS 0 0.000000, so may be the other reason.

need to split the psmcfa, i check the last years notebook, hope help image

lpabonv commented 8 months ago

any of you had any trouble while producing the graph? for some of my data it puts just a straight line, with no changes in population size, even in all the bootstrap. You might have any idea what the issue might be?