StuntsPT / Structure_threader

A wrapper program to parallelize and automate runs of "Structure", "fastStructure" and "MavericK".
GNU General Public License v3.0

Numpy error during MavericK normalization #78

Closed: cathynewman closed this issue 6 years ago

cathynewman commented 6 years ago

I am running MavericK via Structure_threader on a Linux HPC server (RHEL 6), with GCC 6.4.0 and Python 3.5.2 (Anaconda), which includes NumPy 1.11.3. I did a test run on a data set with 11 individuals and 1,000 SNPs, running K = 1-3, with 3 replicates per K. I think all of the runs finished successfully, and the outputEvidence.csv file was generated. But I get the following error during what looks like the normalization step:

[EDITED to clarify: the error is during the normalization step of merging, after all of the runs have completed and the merged outputEvidence.csv file has been created.]

INFO: All 3 jobs finished successfully.
Traceback (most recent call last):
  File "/home/cenewman/.local/bin/structure_threader", line 11, in <module>
    sys.exit(main())
  File "/home/cenewman/.local/lib/python3.5/site-packages/structure_threader/structure_threader.py", line 332, in main
    full_run(arg)
  File "/home/cenewman/.local/lib/python3.5/site-packages/structure_threader/structure_threader.py", line 286, in full_run
    arg.notests)
  File "/home/cenewman/.local/lib/python3.5/site-packages/structure_threader/wrappers/maverick_wrapper.py", line 312, in maverick_merger
    bestk = _write_normalized_output(evidence, k_list, ti_in_use)
  File "/home/cenewman/.local/lib/python3.5/site-packages/structure_threader/wrappers/maverick_wrapper.py", line 258, in _write_normalized_output
    k_list))
  File "/home/cenewman/.local/lib/python3.5/site-packages/structure_threader/wrappers/maverick_wrapper.py", line 335, in maverick_normalization
    for _ in range(draws)])
  File "/home/cenewman/.local/lib/python3.5/site-packages/structure_threader/wrappers/maverick_wrapper.py", line 335, in <listcomp>
    for _ in range(draws)])
  File "mtrand.pyx", line 1902, in mtrand.RandomState.normal (numpy/random/mtrand/mtrand.c:19771)
ValueError: scale <= 0

Here is how I call Structure_threader:

export HOME_DIR=/home/cenewman
export WORK_DIR=/work/cenewman/popgen/maverick
cd $WORK_DIR

module load gcc/6.4.0
module load python/3.5.2-anaconda-tensorflow

structure_threader run -K 3 -i data.txt -o output --params parameters.txt -t 20 -mv $HOME_DIR/MavericK-1.0.5/MavericK

StuntsPT commented 6 years ago

Hi @cathynewman, thank you for reporting this issue. After looking at what you posted, I suspect this may be an issue with your version of numpy, since the earliest version we have tested with is 1.12.1. So, before anything else, can you reproduce the issue with a more recent version of numpy? If that turns out to be the problem, we should put up a warning about it and add a minimum version number to the requirements.

PS - If you want to test again without spending time re-running MavericK, edit the file /home/cenewman/.local/lib/python3.5/site-packages/structure_threader/structure_threader.py and comment out line 281. This should skip the runs. For this to work, you must keep the previous results (do not delete them) and run the same command again.
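
For readers wondering what a warning of the kind mentioned above might look like, here is a minimal sketch. It is not the fix that was eventually released; the names `MIN_TESTED_NUMPY`, `parse_version` and `numpy_is_untested` are hypothetical, and the only fact taken from this thread is that 1.12.1 is the earliest numpy version the maintainer reports having tested.

```python
import re

# Assumption: 1.12.1 is the earliest tested numpy version, per the comment above.
MIN_TESTED_NUMPY = (1, 12, 1)


def parse_version(text):
    """Turn a string such as '1.11.3' into (1, 11, 3).

    Only the first three dot-separated fields are used, and any
    non-numeric suffix (for example 'rc1') is ignored.
    """
    parts = []
    for chunk in text.split(".")[:3]:
        match = re.match(r"\d+", chunk)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def numpy_is_untested(installed):
    """Return True if the given numpy version string predates the tested minimum."""
    return parse_version(installed) < MIN_TESTED_NUMPY


if __name__ == "__main__":
    # The two versions mentioned in this thread.
    for version in ("1.11.3", "1.14.2"):
        status = "older than tested minimum" if numpy_is_untested(version) else "ok"
        print(version, status)
```

In a real program the version string would come from `numpy.__version__`; it is passed in as a plain string here so the sketch runs without numpy installed.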

cathynewman commented 6 years ago

It seems that was the problem! I was using Python 3 as a pre-installed module managed by the HPC system, so I'm not sure there was a way to update numpy. But instead I installed miniconda with Python 3, then grabbed the latest numpy through conda. I tried the exact same run again, this time using my new install of Python 3.6.5 with numpy 1.14.2. It threw the same error again. But then I reinstalled Structure_threader, and that fixed the issue. It seems to be running correctly now. Thanks!
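
The thread does not say why the reinstall was needed. One possible reading (an assumption, not confirmed here) is that the `structure_threader` launcher under `~/.local` was still tied to the old Python 3.5 environment. A small diagnostic like the sketch below can show which interpreter is running and where a package would be imported from; the helper name `locate` is hypothetical.

```python
import importlib.util
import sys


def locate(module_name):
    """Return the file a top-level module would be imported from, or None if it is not found."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # Which Python is actually executing this script?
    print("interpreter:", sys.executable)
    # Where would these packages come from in this environment?
    for name in ("numpy", "structure_threader"):
        print(name, "->", locate(name))
```

Running this with the same interpreter that the launcher script uses (see the first line of the launcher file) shows whether the new numpy is the one that would be picked up.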

StuntsPT commented 6 years ago

Awesome! I'm happy it's sorted. I'll leave the issue open until I add the version of numpy to the install script. Once again, thank you for taking the time to report the problem.

StuntsPT commented 6 years ago

OK, I just fixed this issue, and a new version of "Structure_threader" is now available on PyPI. This type of issue should not happen any more. Once again, thank you @cathynewman for reporting it.