arzwa / wgd

Python package and CLI for whole-genome duplication related analyses. This package is deprecated in favor of https://github.com/heche-psb/wgd.
http://wgd.readthedocs.io/en/latest/
GNU General Public License v3.0

Hang up due to codeml and fix #72

Open mason-linscott opened 2 years ago

mason-linscott commented 2 years ago

Hello,

I am using wgd to examine ancient genome duplication in my focal group.

I ran wgd pre on two focal species to filter cds to those that would be accepted by wgd. All went well.

I then aligned the data with wgd dmd with no problem.

However, I ran into a problem when running wgd ksd. Here is my command and output:

python3 wgd/wgd_cli.py ksd -n 40 wgd_dmd/3h_reduced.fa.pre.good_cand_reduced.fa.pre.good.rbh 3h_reduced.fa.pre.good cand_reduced.fa.pre.good

...
100% (22670 of 22670) |######################################################################################################################################| Elapsed Time: 0:00:05 Time:  0:00:05
2021-12-01 11:27:57: WARNING    There were 4 warnings during translation
2021-12-01 11:27:57: INFO       Started whole paranome Ks analysis
2021-12-01 11:27:57: WARNING    Filtered out the 0 largest gene families because n*(n-1)/2 > `max_pairwise`
2021-12-01 11:27:57: WARNING    If you want to analyse these large families anyhow, please raise the `max_pairwise` parameter. 
2021-12-01 11:27:57: INFO       Started analysis in parallel (n_threads = 40)
2021-12-01 11:27:57: INFO       Performing analysis on gene family GF_000001
2021-12-01 11:27:57: INFO       Performing analysis on gene family GF_000002
...
2021-12-01 11:28:02: INFO       Performing analysis on gene family GF_000041
2021-12-01 11:28:02: INFO       Performing analysis on gene family GF_000042

And then it just hangs there for hours.

I noticed that after the first gene family, no other Ks files were generated. So I tried running codeml by hand on one of the tmp .cntrl files and got the following warning:

"4 columns are converted into ??? because of stop codons
Press Enter to continue"

Pressing Enter then produced the expected files. I guessed that all of my threads were stuck waiting for keyboard input, so I applied a fix from the PAML Google group that removes the prompt (https://groups.google.com/g/pamlsoftware/c/HNx4O_YMHVA). After recompiling PAML, everything ran smoothly.
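For anyone who cannot recompile PAML, a possible alternative is to make sure the child process never waits on a terminal. The snippet below is only a minimal sketch, not how wgd actually invokes codeml: `run_noninteractive` is a hypothetical helper, and it assumes the "Press Enter to continue" prompt reads from standard input, so that a few newlines piped in are enough to get past it. The timeout is there so that a stuck process raises `subprocess.TimeoutExpired` instead of hanging for hours.

```python
import subprocess


def run_noninteractive(cmd, cwd=None, timeout=600, n_newlines=10):
    """Run `cmd` with newlines piped to stdin so that a
    'Press Enter to continue' prompt cannot block forever.

    Hypothetical helper, not part of wgd. Raises
    subprocess.TimeoutExpired if the process exceeds `timeout` seconds.
    """
    return subprocess.run(
        cmd,
        cwd=cwd,
        input="\n" * n_newlines,  # answers up to n_newlines prompts
        capture_output=True,      # keep stdout/stderr for inspection
        text=True,
        timeout=timeout,
    )


# Example call (the control file name and directory are placeholders):
# result = run_noninteractive(["codeml", "family.ctrl"], cwd="tmp_dir")
# print(result.stdout)
```

Scanning the captured `result.stdout` for the "stop codons" message would also let a wrapper log which families triggered the warning.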

Just thought I should mention it here in case anyone else runs into the issue.

M.

arzwa commented 2 years ago

Thanks a lot for reporting this. When I find the time, I'll look into having the wgd pipeline detect when this will happen and report a warning. In the meantime, I hope that people who run into the same issue find your bug report and workaround here. Thanks again!
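As a rough illustration of what such a check could look like (this is not existing wgd code), the sketch below scans coding sequences for in-frame stop codons before they are handed to codeml. It assumes the standard genetic code and that sequences start in frame; `stop_codon_columns` and `flag_families` are hypothetical names, and the nested-dict input layout is chosen only for the example.

```python
# Stop codons of the standard genetic code (assumption: no alternative code).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_codon_columns(seq):
    """Return the 0-based codon indices of in-frame stop codons in `seq`.

    Trailing bases that do not fill a complete codon are ignored.
    """
    seq = seq.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    return [i // 3 for i in range(0, usable, 3) if seq[i:i + 3] in STOP_CODONS]


def flag_families(families):
    """Given {family_id: {gene_id: sequence}}, return only the families and
    genes that contain at least one in-frame stop codon."""
    flagged = {}
    for fam_id, seqs in families.items():
        hits = {}
        for gene_id, seq in seqs.items():
            cols = stop_codon_columns(seq)
            if cols:
                hits[gene_id] = cols
        if hits:
            flagged[fam_id] = hits
    return flagged
```

A pipeline could log a warning for every family returned by `flag_families`, so the user knows up front which families are likely to trigger the codeml prompt.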