debbiemarkslab / plmc

Inference of couplings in proteins and RNAs from sequence variation
MIT License

memory consumption leads to seg faults -- suggestion #5

Closed smsaladi closed 5 years ago

smsaladi commented 7 years ago

I was testing a large alignment (20k sequences by ~2800 positions) that ended up being a bit gappy. When running plmc without a focus sequence, the calculation runs into a segfault. Looking more closely, the alignment size resulted in an overflow of nFij. The allocation itself was actually "fine" because malloc interprets its argument as unsigned.

However, of course, it would eventually segfault once it actually started calculating the second-order marginals here. I'm not exactly sure why it didn't happen when initializing.
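To illustrate the unsigned conversion described above, here is a minimal sketch in C. It is not taken from plmc: `bytes_requested` and `alloc_or_die` are hypothetical helpers, and the sketch only assumes that an `int` element count is multiplied by an element size before being handed to `malloc`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (not plmc code): the byte count malloc effectively
 * sees when an int element count is multiplied by an element size. The int
 * is converted to size_t first, so a negative (overflowed) count wraps
 * around to a very large unsigned value instead of being rejected. */
size_t bytes_requested(int n_elements, size_t element_size) {
    return (size_t) n_elements * element_size;
}

/* Hypothetical helper: allocate, and exit with a message instead of
 * returning NULL to a caller that might not check it. */
void *alloc_or_die(size_t n_bytes, const char *what) {
    assert(what != NULL);
    void *p = malloc(n_bytes);
    if (p == NULL && n_bytes != 0) {
        fprintf(stderr, "Error: could not allocate %zu bytes for %s\n",
                n_bytes, what);
        exit(EXIT_FAILURE);
    }
    return p;
}
```

With these helpers, `bytes_requested(-1, 1)` equals `SIZE_MAX`, so a wrapped count is not caught at the call site. Whether the allocation then fails or appears to succeed depends on the wrapped value and the platform, which may be why the problem only shows up later.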

You are probably very aware that this should be the case, and, in retrospect, it makes sense to me. It might be nice for other novice users, like me, if README.md or the plmc help string provided a formula to estimate memory consumption. Another option would be to exit the calculation with an error message saying that the alignment is too big, based on this calculation of nFij or something similar.
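As a rough sketch of what such an estimate could look like, the C code below computes the size of the pairwise marginals with overflow checks. The formula is my assumption, not confirmed from the plmc source: one stored value per pair of sites (i < j) and per pair of alphabet codes, i.e. L(L-1)/2 * q^2 values. The names `mul_checked` and `pair_marginal_bytes` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply a * b into *out; return false if the product would overflow. */
static bool mul_checked(uint64_t a, uint64_t b, uint64_t *out) {
    if (a != 0 && b > UINT64_MAX / a) return false;
    *out = a * b;
    return true;
}

/* Assumed layout (not confirmed from plmc): n_sites * (n_sites - 1) / 2
 * site pairs, each holding n_codes * n_codes values of value_size bytes.
 * Returns false if the byte count does not fit in 64 bits. */
bool pair_marginal_bytes(uint64_t n_sites, uint64_t n_codes,
                         uint64_t value_size, uint64_t *out_bytes) {
    assert(out_bytes != NULL);
    if (n_sites < 2) { *out_bytes = 0; return true; }

    /* Halve whichever of n_sites and n_sites - 1 is even, so the
     * division is exact and happens before any multiplication. */
    uint64_t a = (n_sites % 2 == 0) ? n_sites / 2 : n_sites;
    uint64_t b = (n_sites % 2 == 0) ? n_sites - 1 : (n_sites - 1) / 2;

    uint64_t n_pairs, n_values;
    return mul_checked(a, b, &n_pairs)
        && mul_checked(n_pairs, n_codes, &n_values)
        && mul_checked(n_values, n_codes, &n_values)
        && mul_checked(n_values, value_size, out_bytes);
}
```

Under these assumptions, 2800 sites with a 21-letter alphabet and 4-byte values give 6,912,410,400 bytes (about 6.9 GB) for this one array. A caller could compare the result against available memory, or check that the element count fits in the integer type used for indexing, and exit with a clear error before allocating.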

smsaladi commented 7 years ago

Oh, whoops, I think I see this on the to-do list here: debbiemarkslab/EVcouplings#28