popgenmethods / pyrho

Fast inference of fine-scale recombination rates based on fused-LASSO
MIT License

Crash during hyperparam #3

Closed szpiech closed 5 years ago

szpiech commented 5 years ago

Hello,

I'm getting crashes while running the hyperparam portion of the pipeline. Any thoughts on what might be causing this?

pyrho hyperparam -n 46 \
--mu 2.5e-8 \
--blockpenalty 50,100,150 \
--windowsize 50,75,100 \
--logfile . \
--tablefile high_n_46_N_60_lookuptable.hdf \
--num_sims 500 \
--smcpp_file /data/szpiech/macaque/smcpp/high_altitude/estimate-high-all-four-default.csv \
--outfile high_hyperparam_results.txt

runs for a while but eventually leads to

Traceback (most recent call last):
  File "/home/szpiech/.conda/envs/pyrho/bin/pyrho", line 8, in <module>
    sys.exit(main())
  File "/home/szpiech/.conda/envs/pyrho/lib/python3.6/site-packages/pyrho/frontend.py", line 67, in main
    func(args)
  File "/home/szpiech/.conda/envs/pyrho/lib/python3.6/site-packages/pyrho/hyperparameter_optimizer.py", line 260, in _main
    pool)
  File "/home/szpiech/.conda/envs/pyrho/lib/python3.6/site-packages/pyrho/hyperparameter_optimizer.py", line 127, in _score
    + [l2_norm, log_l2_norm])
  File "/home/szpiech/.conda/envs/pyrho/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
  File "/home/szpiech/.conda/envs/pyrho/lib/python3.6/multiprocessing/pool.py", line 424, in _handle_tasks
    put(task)
  File "/home/szpiech/.conda/envs/pyrho/lib/python3.6/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/szpiech/.conda/envs/pyrho/lib/python3.6/multiprocessing/connection.py", line 393, in _send_bytes
    header = struct.pack("!i", n)
struct.error: 'i' format requires -2147483648 <= number <= 2147483647

jeffspence commented 5 years ago

Thanks for this. I can reproduce it locally by setting --num_sims to a large value (e.g. 5000). It looks like a pickling issue that shows up when there are a large number of items to iterate over.
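For context on the error message: the last frame of the traceback packs the length of the pickled task with `struct.pack("!i", n)`, i.e. as a 4-byte signed integer, so any single message larger than 2147483647 bytes cannot be sent to a worker. The snippet below is only a minimal sketch of that limit; the helper names `pack_length_header` and `fits_in_header` are made up for illustration and are not part of pyrho or of `multiprocessing`.

```python
import struct

# Largest value a 4-byte signed integer can hold (about 2 GiB when read as a byte count).
INT32_MAX = 2**31 - 1


def pack_length_header(n_bytes):
    """Pack a payload length the same way the traceback does: struct.pack("!i", n).

    Hypothetical helper for illustration only.
    """
    return struct.pack("!i", n_bytes)


def fits_in_header(n_bytes):
    """Return True if a payload of n_bytes can be described by the 4-byte header."""
    try:
        pack_length_header(n_bytes)
        return True
    except struct.error:
        # Same exception type as in the traceback above.
        return False


if __name__ == "__main__":
    print(fits_in_header(INT32_MAX))      # True
    print(fits_in_header(INT32_MAX + 1))  # False
```

So the crash is a question of how many bytes end up in one pickled task, which is why a larger --num_sims makes it more likely.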

A temporary workaround is to set --num_sims to a smaller value, but I'll try to get around to a more satisfying fix, so I'll leave this open.

jeffspence commented 5 years ago

Fixed by commit 3cfe2388e3efd27301719f7f8b23ebab94aae165. Let me know if this doesn't fix it for you, but I have a suspicion that the cases where this would be an issue would also be too memory intensive to be practical.
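The thread does not say what that commit changed, so the following is not a description of the actual fix. It is a generic sketch of one common way to keep each message sent to a worker small: submit the work in batches and check the pickled size of each batch first. All names here (`chunked`, `pickled_size`, `map_in_chunks`) are hypothetical, and `map_fn` is assumed to be something like `pool.map` from a `multiprocessing.Pool`.

```python
import pickle

# Limit implied by the 4-byte signed length header in the traceback.
MAX_MESSAGE_BYTES = 2**31 - 1


def chunked(items, chunk_size):
    """Yield consecutive slices of items with at most chunk_size elements each."""
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def pickled_size(obj):
    """Number of bytes obj occupies once pickled."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


def map_in_chunks(map_fn, func, items, chunk_size, limit=MAX_MESSAGE_BYTES):
    """Apply func to items in batches, refusing any batch that pickles over limit.

    map_fn is any callable with the signature of the built-in map,
    for example pool.map of a multiprocessing.Pool (an assumption of this sketch).
    """
    results = []
    for chunk in chunked(items, chunk_size):
        size = pickled_size(chunk)
        if size > limit:
            raise ValueError(
                f"batch pickles to {size} bytes, over the {limit} byte limit; "
                "use a smaller chunk_size"
            )
        results.extend(map_fn(func, chunk))
    return results


if __name__ == "__main__":
    # The built-in map stands in for pool.map here.
    squares = map_in_chunks(map, lambda x: x * x, list(range(10)), chunk_size=3)
    print(squares)
```

Batching only helps when the total is large because there are many items. If a single item pickles to more than the limit on its own, it still cannot be sent this way.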

szpiech commented 5 years ago

Thanks, I ended up stepping down to --num_sims 50, and that worked fine. I suppose 500 was overkill to start with.