diyabc / abcranger

ABC random forests for model choice and parameter estimation, pure C++ implementation
https://diyabc.github.io
MIT License
9 stars 4 forks source link

Issue with parameter estimation when more than 1 scenario present? #94

Closed mariaemilyd closed 1 year ago

mariaemilyd commented 1 year ago

Hi,

There seems to be an issue with parameter estimation when more than one scenarios are compared. Parameter estimates from a model specified as "scenario 2" in the header file always result in some strange estimates outside the range of the priors. If I put the same evolutionary model under "scenario 1" in the header file in a different DIYABC-RF run, it produces normal parameter estimates.

I am estimating population sizes and divergence times. The issue seems to be with the divergence times only.

N.B I am sharing one set of model output. I have different runs for slightly different models and datasets as well, where I had the same issue, and could upload those for comparison if needed.

I could not attach the reftable.bin file as it was not a supported file type - I can email if needed? The scenario pictures have crossed lines because of the way the image was compiled in my browser (too narrow).

Thank you very much for your help,

Maria

diyabc_seed_init_call.log first_records_of_the_reference_table_0.txt headerRF.txt r_tarandus_sval_model_4_genepop_new_2022_11_15.txt reftable.log statobsRF.txt abcranger_call.log diyabc_run_call.log historical_scenario1 historical_scenario2

fradav commented 1 year ago

Hi Maria,

Which divergence time in the second scenario are you trying to infer ?

mariaemilyd commented 1 year ago

Hi,

I'm trying to estimate all 4 in scenario 2; t1, t2, t3, t4. In this case, I got: t1 median = 936 ( 95% CI 347-1515.05)
t2 median = 5.76233e-07 (2.39614e-07-9.0605e-07) t3 median = 10.3592 (1.60856-19.2461) t4 median = 8035.76(3814.96-13584.1)

So t2 and t3 are strange.

In a different ABC run, I put the evo model (that is listed under scenario 2 in this DIYABC run) in scenario 1 in the header file. Then I got these parameter estimates: t1 median = 1129 ( 95% CI 414-1529.91)
t2 median = 3140 (2888-3379) t3 median = 6166.08 (3335-12122.9) t4 median = 9419 (4429.93-13870)

And in that DIYABC run, the other scenario (which is scenario 1 in the current model, scenario 2 in the other model) has strange estimates for t5: t5 median: 9.83701 (1.13867-19.1695) t6 median: 4184.66 (1267-11174.6) t7 median: 8821.6 (3124.2-13884.3)

Thanks

fradav commented 1 year ago

I am able to reproduce the bogus values for t2 in param. estim. Investigating…

mariaemilyd commented 1 year ago

Hello, any luck looking into the cause of this? Any more files I can provide? Thanks!

fradav commented 1 year ago

Yes it has been pinpointed to an error from my part in indexing the mutational parameters (instead of t2 you were estimating μseq_1) fixed release is coming.

mariaemilyd commented 1 year ago

Hi, that's great news! Thank you for looking into this so quickly. Do you have a rough idea of when the new release would come? Only because the models form part of my results for my PhD which is ending soon, and so rerunning the models with the new release would be much needed. Thanks!

fradav commented 1 year ago

This evening or tomorrow for the cli binaries, once I fix something stuck in the CI pipeline. For the windows gui release it will be a tad longer as I have to ask the diyabcGUI maintainer. What's your system?

mariaemilyd commented 1 year ago

Oh that's fast! I'm running windows gui unfortunately, but if it's within the timeframe of weeks and not months, it should be okay.

fradav commented 1 year ago

Latest release fixed (cli binary only). Thanks you anyway for reporting this bug and taking the time to get on us again, you were very constructive.

Btw, is your PhD in population genetics?

mariaemilyd commented 1 year ago

Thanks! When you find out when the GUI will be released, please let me know. Glad to hear it was helpful!

Yes, it's looking at the effect of postglacial climate changes on the population histories of two Arctic species, and how the genetic legacy from that may impact response to future climate changes.

mariaemilyd commented 1 year ago

Hi, just to let you know I have been using the command line diyabc RF and abcranger and they are working great, the time parameter estimation looks good. Quick question about estimating mutation rate parameters though - netiher MEANMU (as in the header file) or µseq_1 (as in the GUI) seem to be recongised as parameter names? Thanks!