trevorcousins / cobraa

cobraa (coalescent based reconstruction of ancient admixture): a hidden Markov model to infer ghost admixture using just a diploid sequence.
MIT License

PANMIXIA not defined #1

Closed by marqueda 2 months ago

marqueda commented 2 months ago

Dear Trevor,

I am currently testing cobraa - a super exciting method you developed there! - and got the following error message:

    Traceback (most recent call last):
      File "cobraa.py", line 353, in <module>
        if PANMIXIA:
           ^^^^^^^^
    NameError: name 'PANMIXIA' is not defined

I have been running the following command:

    python cobraa.py -in -D 32 -b 100 -ts 10 -te 20 -its 20 -o -mu_over_rho_ratio 0.1

I am running cobraa on Python 3.11.6, with package versions numba 0.59.1, joblib 1.4.2, matplotlib 3.9.0, pandas 2.2.2, psutil 5.9.8, scipy 1.13.1

I wonder whether that might be a bug somewhere in the code, or related to any options I am feeding the script?

Thank you for having a look at this, let me know if I should send you the log file.

Best, David

trevorcousins commented 2 months ago

Hi David,

I tried a command line similar to the one you gave, and it worked fine for me. I suspect it may be either a problem with the input mhs file or a version issue. Note that the versions I use are:

    python 3.10.5
    numba 0.55.2
    joblib 1.1.0
    matplotlib 3.5.2
    pandas 1.4.3
    psutil 5.9.1
    scipy 1.8.1

Can you send the log file? Can you also show the first few lines of the mhs file? Let's have a look at those, and if they don't reveal the problem you might have to try different versions.
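For a quick side-by-side of the installed versions against the ones listed above, something like the following could be run in the cluster environment. It is only a sketch: the function names are made up for illustration, and the reference versions are simply those quoted in this comment, not a statement of what cobraa requires.

```python
from importlib import metadata
import platform

# Versions reported as working in this thread (not an official requirement list).
REFERENCE_VERSIONS = {
    "numba": "0.55.2",
    "joblib": "1.1.0",
    "matplotlib": "3.5.2",
    "pandas": "1.4.3",
    "psutil": "5.9.1",
    "scipy": "1.8.1",
}


def installed_versions(packages):
    """Return {package: installed version, or None if it is not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def report(reference=REFERENCE_VERSIONS):
    """Build one line per component: installed version next to the reference one."""
    lines = [f"python {platform.python_version()} (reference: 3.10.5)"]
    for name, have in installed_versions(reference).items():
        lines.append(f"{name} {have or 'not installed'} (reference: {reference[name]})")
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

The printed lines can be pasted straight into the issue alongside the log file.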

marqueda commented 2 months ago

Dear Trevor,

Thank you very much for your reply! I am attaching the log file and one of the input files. Thank you for having a quick look at them. I am not sure I can install exactly the package versions you used in our cluster environment, so I hope I am doing something else wrong :-)

Best wishes, David

Attachments: cobraarun_57.log, PpunlikeVIC.chr1.phased.msmcin.txt

trevorcousins commented 2 months ago

Thanks for these, I think I've found the issue. According to the log file, the command line you gave was:

    Command line: python /cluster/work/gdc/shared/p703/bin/cobraa/cobraa.py -in <files> -D 32 -b 100 -ts 10 -te -its 20 -o results/PpunlikeVIC.ts10.te20 -mu_over_rho_ratio 1

You are missing the value of te, so I think the command should be:

    Command line: python /cluster/work/gdc/shared/p703/bin/cobraa/cobraa.py -in <files> -D 32 -b 100 -ts 10 -te 20 -its 20 -o results/PpunlikeVIC.ts10.te20 -mu_over_rho_ratio 1

I think that should work for you 😄

Best, Trevor

marqueda commented 2 months ago

Well, that is an embarrassing mistake! Thank you for spotting it, and sorry I did not figure that out myself before opening an issue... It must have happened in the submission script that loops through all value combinations for ts and te.

Thanks again and best wishes! David
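For anyone looping over ts and te in a similar way, a small sanity check on each generated command line can catch a dropped value before the job is submitted. The sketch below is hypothetical throughout: the helper names, the input file name and the sample name are placeholders, the flag list is just the set of flags that appear in this thread, and the fixed values for -D, -b, -its and -mu_over_rho_ratio are copied from the logged command.

```python
import itertools
import shlex

# Flags from the command in this thread that are expected to be followed by a value.
VALUE_FLAGS = ("-in", "-D", "-b", "-ts", "-te", "-its", "-o", "-mu_over_rho_ratio")


def build_command(infile, ts, te, sample, out_dir="results"):
    """Assemble one cobraa command as a list of string tokens."""
    out_prefix = f"{out_dir}/{sample}.ts{ts}.te{te}"
    args = ["python", "cobraa.py", "-in", infile, "-D", 32, "-b", 100,
            "-ts", ts, "-te", te, "-its", 20,
            "-o", out_prefix, "-mu_over_rho_ratio", 1]
    return [str(a) for a in args]


def check_command(argv):
    """Raise ValueError if a value-taking flag is followed by nothing or by another flag."""
    for i, token in enumerate(argv):
        if token not in VALUE_FLAGS:
            continue
        value = argv[i + 1] if i + 1 < len(argv) else ""
        if value == "" or value in VALUE_FLAGS:
            raise ValueError(f"flag {token} has no value in: {shlex.join(argv)}")
    return argv


def grid_commands(infile, ts_values, te_values, sample):
    """Yield a checked command for every (ts, te) combination."""
    for ts, te in itertools.product(ts_values, te_values):
        yield check_command(build_command(infile, ts, te, sample))


if __name__ == "__main__":
    # Placeholder file and sample names, for illustration only.
    for argv in grid_commands("input.mhs.txt", [10], [20], sample="sample"):
        print(shlex.join(argv))
```

Building the command as a list of tokens rather than one interpolated string means an empty loop variable shows up as an empty token, which the check rejects, instead of silently disappearing from the command line.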

trevorcousins commented 2 months ago

No problem, I should write a better error catch.
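One possible shape for such a check is sketched below. It is not cobraa's actual argument handling, which is not shown in this thread: the parser, the helper names and the default for -its are all hypothetical. The only behaviour taken from the thread is that -ts None -te None selects the panmictic (PSMC) run.

```python
import argparse


def parse_index(text):
    """Convert a command-line token to an int, or to None for the literal 'None'."""
    if text == "None":
        return None
    try:
        return int(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected an integer or 'None', got {text!r}")


def resolve_panmixia(ts, te):
    """Return True for a panmictic run and False for a structured run.

    Raising here means the flag is always defined by the time it is used.
    """
    if ts is None and te is None:
        return True
    if ts is None or te is None:
        raise ValueError("-ts and -te must both be integers or both be None")
    return False


def build_parser():
    # Hypothetical parser covering only the flags relevant to this issue.
    parser = argparse.ArgumentParser(prog="cobraa_sketch")
    parser.add_argument("-ts", type=parse_index, default=None)
    parser.add_argument("-te", type=parse_index, default=None)
    parser.add_argument("-its", type=int, default=20)  # default chosen for illustration
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    PANMIXIA = resolve_panmixia(args.ts, args.te)
    print(f"PANMIXIA = {PANMIXIA}")
```

With a parser set up this way, argparse itself rejects -te followed directly by -its, because -te expects exactly one value, and a run with only one of the two indices set fails with an explicit message rather than a NameError later on.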

As a side note, I notice that you are only running for 20 iterations. I strongly suspect that is not enough iterations for the algorithm to converge, and would recommend you set the thresh parameter to one and turn up iterations to 100 or something. And, of course, you will want to compare the fit of the structured model to the fit of the panmictic model (using -ts None -te None, i.e. running PSMC).

Best,

Trevor
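The comparison suggested above could be organised roughly as follows once all runs have finished. This sketch assumes that each run yields a single final log-likelihood and that these values have already been collected by hand or by a separate script; how cobraa writes them to its output files is not covered in this thread, so no file parsing is attempted, and the function names are made up.

```python
def best_structured_fit(structured_ll):
    """Return ((ts, te), log_likelihood) for the structured run with the highest value.

    structured_ll maps (ts, te) pairs to the final log-likelihood of that run.
    """
    return max(structured_ll.items(), key=lambda item: item[1])


def compare_to_panmictic(structured_ll, panmictic_ll):
    """Summarise the best structured run against the panmictic (-ts None -te None) run."""
    (ts, te), ll = best_structured_fit(structured_ll)
    return {
        "ts": ts,
        "te": te,
        "structured_ll": ll,
        "panmictic_ll": panmictic_ll,
        "delta_ll": ll - panmictic_ll,  # positive when the structured model fits better
    }
```

A positive delta_ll on its own only says the structured model has the higher likelihood; deciding whether the improvement is meaningful is a separate question that this sketch does not address.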

marqueda commented 2 months ago

Dear Trevor,

Thank you for the hint, I have increased iterations to 100 and set thresh to 1, and I am of course also running the PSMC version to compare with.

Best wishes, David