MobleyLab / alchemical-analysis

An open tool implementing some recommended practices for analyzing alchemical free energy calculations
MIT License

suspected memory allocation problem #90

Closed · raziel81 closed this issue 8 years ago

raziel81 commented 8 years ago

Hi there, I've been trying over the past few weeks to run alchemical_analysis on a set of 20 x 2 us coarse-grained Martini simulations, in which each file occupies 1.9 GB on the drive. After a few minutes the run crashes on my workstation with the following message:

```
Traceback (most recent call last):
  File "./alchemical_analysis.py", line 1276, in <module>
    main()
  File "./alchemical_analysis.py", line 1247, in main
    dhdl, N_k, u_kln = uncorrelate(sta=numpy.zeros(K, int), fin=nsnapshots, do_dhdl=True)
  File "./alchemical_analysis.py", line 164, in uncorrelate
    u_kln = numpy.zeros([K,K,max(fin-sta)], numpy.float64)  # u_kln[k,m,n] is the reduced potential energy of uncorrelated sample index n from state k evaluated at state m
MemoryError
```


The machine itself has 32 GB of RAM, which I'd imagine is more than enough, but then again I might be wrong. Generally speaking, is there a well-defined, hardware-related upper limit, in terms of memory, on how much GROMACS data alchemical-analysis can handle?

Thanks in advance for your attention in this matter, Yoav

davidlmobley commented 8 years ago

These are GROMACS simulations? How many snapshots are you dealing with across how many lambda values?

The short answer is that alchemical-analysis doesn't do its own memory allocation per se - we're using NumPy arrays. I expect this means that (a) the allocation itself is very thoroughly tested, and (b) you really are running into the memory capacity of your machine. You can probably work this out for yourself by multiplying out how many snapshots you're attempting to load into memory (as 64-bit floats), determining what the minimum memory usage for that would be, and comparing it to your RAM.
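
For concreteness, a minimal back-of-the-envelope sketch of that arithmetic (the snapshot and lambda counts are placeholders drawn from the numbers quoted in this thread; adjust them to your own setup):

```python
# Lower bound on the raw data size: every stored energy value is a
# 64-bit float, i.e. 8 bytes. The counts below are placeholders.
snapshots_per_lambda = 10000000   # e.g. 2 us of output, one row every 0.2 ps
n_lambdas = 20
raw_bytes = snapshots_per_lambda * n_lambdas * 8
print("raw dH/dl data: %.1f GB" % (raw_bytes / 1e9))   # ~1.6 GB
# The analysis arrays built on top of this (discussed further down the
# thread) can be substantially larger than the raw data itself.
```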

One option would be pre-processing to remove extra snapshots - i.e. if you have multi-microsecond simulations and you saved snapshots every picosecond or so, then you certainly have snapshots far more often than you need. You could filter them so you only retain one in every 10, or one in every 100, etc.
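
A minimal sketch of that kind of thinning for a GROMACS dhdl .xvg file (the filenames and the stride of 100 are only placeholders; comment and legend lines are passed through untouched):

```python
# Keep every 100th data line of a dhdl file; '#' (comment) and '@' (legend)
# lines are copied unchanged. Filenames are examples only.
stride = 100
with open("dhdl.0.xvg") as src, open("dhdl.0.thinned.xvg", "w") as dst:
    kept = 0
    for line in src:
        if line.startswith(("#", "@")):
            dst.write(line)
        else:
            if kept % stride == 0:
                dst.write(line)
            kept += 1
```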

I hope this helps.

raziel81 commented 8 years ago

Yes, the simulations are indeed run with GROMACS, and it's a 2 us simulation per lambda. I'm using a time step of 0.02 ps with nstdhdl = 10, so that should be about 10000000 * 20 = 200000000 snapshots. I'm not that programming savvy, but if I understand you correctly, shouldn't this be around 200 MB in total? Admittedly, that doesn't account for the actual memory allocation of the aforementioned NumPy array, so that could be an issue in its own right. I can reduce the output files, but my original intention was driven by my concern about under-sampling the target molecule, which is a coarse-grained fatty acid, so I'm not entirely sure how much I could reduce it without that becoming a problem as well. Thank you very much for your prompt reply and help in this matter.

davidlmobley commented 8 years ago

I just created two NumPy arrays, one of one element and one of five elements; each element uses eight bytes (http://stackoverflow.com/questions/11784329/python-memory-usage-of-numpy-arrays). Multiplying this out, it still looks to me like you should have enough memory, though perhaps I'm missing something.

BUT, I'm also confident you have vastly more snapshots than you need, and attempting to analyze this many is going to lead to other problems besides memory (i.e. if you load all of them, it is going to take an enormous amount of time to process). I have to run off, but you should look up the idea of correlation time in molecular simulations; analyzing a huge number of snapshots which are all highly correlated gives you no additional information while adding a great deal of computational cost.

Presumably the reason you're running microsecond-length simulations is that the correlation times are fairly long, so there is no reason to be analyzing 200000000 snapshots...
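
As a rough illustration of the subsampling idea, here is a minimal NumPy sketch (dhdl_t is a placeholder for one column of dH/dlambda values from one lambda window; pymbar's timeseries module does this estimate far more carefully, and this is only meant to show the principle):

```python
import numpy

def statistical_inefficiency(a):
    """Crude estimate of g = 1 + 2*tau (in units of the sampling interval)
    for a 1-D time series: sum the normalized autocorrelation function
    until it first drops to zero."""
    a = numpy.asarray(a, dtype=numpy.float64)
    n = a.size
    da = a - a.mean()
    var = da.var()
    if var == 0.0:
        return 1.0
    g = 1.0
    for t in range(1, n - 1):
        c = numpy.dot(da[:n - t], da[t:]) / ((n - t) * var)
        if c <= 0.0:
            break
        g += 2.0 * c * (1.0 - float(t) / n)
    return max(g, 1.0)

# dhdl_t: one column of dH/dlambda values (placeholder).
# g = statistical_inefficiency(dhdl_t)
# keep = dhdl_t[::int(numpy.ceil(g))]   # roughly independent samples
```

Judging from the traceback above, uncorrelate() allocates the full-size u_kln array before doing its own decorrelation, so this kind of thinning has to happen before the data ever reaches the script.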

Best, David

halx commented 8 years ago

Your traceback suggests to me that you are trying to do an MBAR analysis, which means creating a K x K x N matrix, where K is the number of lambda states and N the number of snapshots per state.
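
With the numbers quoted earlier in this thread (20 lambdas, and roughly 10^7 rows per lambda from a 2 us run with dt = 0.02 ps and nstdhdl = 10), a quick check shows that this matrix by itself already accounts for essentially all of the machine's 32 GB:

```python
K = 20          # lambda states in this setup
N = 10000000    # rows per lambda: 2 us at one row every 0.2 ps
print(K * K * N * 8 / 1e9)   # ~32 GB of float64 -- the machine's entire RAM
```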

raziel81 commented 8 years ago

Yes, that's true.

raziel81 commented 8 years ago

OK David, thanks. What I think I'll do is keep just the initial 1 us and reduce it as much as necessary until I manage to push the calculation through. Like Hannes mentioned, I am indeed using MBAR, and that probably has a lot to do with the problems I'm having. I'll also do an autocorrelation analysis with GROMACS and see if I can keep only every Nth row based on the value of tau. Any other suggestions?
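
For reference, a minimal sketch of turning a measured tau into that row stride (the 0.2 ps row spacing follows from dt = 0.02 ps with nstdhdl = 10; the tau value itself is only a placeholder for whatever the GROMACS autocorrelation analysis reports):

```python
import math

dt_output = 0.02 * 10   # ps between dhdl rows (dt = 0.02 ps, nstdhdl = 10)
tau = 50.0              # ps -- placeholder; use the measured correlation time
g = 1.0 + 2.0 * tau / dt_output   # statistical inefficiency, in rows
stride = int(math.ceil(g))        # keep every stride-th row
print(stride)
```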

davidlmobley commented 8 years ago

i'll also do an autocorrelation analysis with GROMACS and see if i can just %N the number of rows based on the value of tau

That sounds like a good idea to me.

raziel81 commented 8 years ago

Hey Prof. Mobley,

Just wanted to tell you that after expanding the swap partition on my workstation and removing the dhdl output of the second microsecond from each trajectory, I managed to get the calculations working properly. Thanks a lot for your assistance in this matter; it's much appreciated.

All the best, Yoav
