Acellera / htmd

HTMD: Programming Environment for Molecular Discovery
https://software.acellera.com/docs/latest/htmd/index.html

transfer of code #1001

Closed sanusi340 closed 3 years ago

sanusi340 commented 3 years ago

I am new to HTMD. I have this code that works in htmd2 but doesn't seem to work in htmd3, and I would like to get it working:

metr = Metric(fsims)
data = metr.project()

stefdoerr commented 3 years ago

You need to specify a metric first like:

metr = Metric(fsims)
metr.set(MetricSelfDistance("protein and name CA"))
data = metr.project()

sanusi340 commented 3 years ago

Yes, I did. Sorry, I forgot to include that. Please find it below:

metr = Metric(fsims)
metr.set(MetricSelfDistance('protein and name CA', metric='distances'))
data = metr.project()

stefdoerr commented 3 years ago

And what is the error?

sanusi340 commented 3 years ago

I keep getting NaN in the output:

moleculekit.readers - WARNING - Element So doesn't exist in the periodic table. Assuming it was meant to be element Na and renaming it.
Projecting trajectories: 100%|██████████| 622/622 [01:29<00:00, 6.98it/s]
2021-05-10 08:31:34,604 - htmd.projections.metric - WARNING - Multiple framesteps [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan nan] ns were read from the simulations. Taking the statistical mode: nanns. If it looks wrong, you can modify it by manually setting the MetricData.fstep property.

stefdoerr commented 3 years ago

That's just the frame step of the simulation not being read. The projection works fine. You can set it to the correct value with data.fstep = XX after calling project(), where XX is the sampling rate of your simulation in nanoseconds (the time between saved frames).
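
A minimal sketch of that fix, written against a stand-in object so it runs without HTMD installed. `set_fstep_if_missing` is a hypothetical helper, not part of HTMD; the only thing assumed about the real object returned by project() is the `fstep` attribute (in nanoseconds) described above.

```python
import math
from types import SimpleNamespace


def set_fstep_if_missing(data, fstep_ns):
    """Set data.fstep (in ns) only when it could not be read (NaN or 0)."""
    if fstep_ns <= 0:
        raise ValueError("fstep_ns must be a positive number of nanoseconds")
    current = float(data.fstep)
    if math.isnan(current) or current == 0:
        data.fstep = fstep_ns
    return data


# Stand-in for the object returned by metr.project(); for illustration only.
data = SimpleNamespace(fstep=float("nan"))
set_fstep_if_missing(data, 0.1)
print(data.fstep)
```

With real data, the same call would go right after data = metr.project(). A frame step that was read correctly is left untouched.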

giadefa commented 3 years ago

Maybe we can add a better error message if the timestep is 0 or nan?

sanusi340 commented 3 years ago

metr = Metric(fsims)
metr.set(MetricSelfDistance('protein and name CA', metric='distances'))
data = metr.project()
data.fstep = 1200

I added the new line and I still got the same error as I described earlier.

sanusi340 commented 3 years ago

Projecting trajectories: 100%|██████████| 1200/1200 [05:14<00:00, 3.82it/s]
2021-05-10 10:00:02,267 - htmd.projections.metric - WARNING - Multiple framesteps [nan, nan, nan, nan, nan, nan] ns were read from the simulations. Taking the statistical mode: nanns. If it looks wrong, you can modify it by manually setting the MetricData.fstep property.

stefdoerr commented 3 years ago

It's a warning, not an error. It will still be printed, but you can ignore it once you have set fstep. I don't think that value is correct, though: fstep is in nanoseconds, so 1200 would mean one saved frame every 1.2 microseconds, and I cannot imagine that is the sampling rate of your simulations.

sanusi340 commented 3 years ago

I did ignore the warning, but I ran into an error while calculating TICA:

tica = TICA(data, 2, units='ns')
dataTica = tica.project(2)

data.cluster(MiniBatchKMeans(n_clusters=1000))
model = Model(data)
model.plotTimescales()

RuntimeError Traceback (most recent call last)
in
----> 1 tica = TICA(data, 2, units='ns')
      2 dataTica = tica.project(2)
      3
      4 data.cluster(MiniBatchKMeans(n_clusters=1000))
      5 model = Model(data)

/software/software/HTMD/1.23.5/lib/python3.6/site-packages/htmd/projections/tica.py in __init__(self, data, lag, units, dimensions, njobs)
     86 lag = unitconvert(units, 'frames', lag, data.fstep)
     87 if lag == 0:
---> 88 raise RuntimeError('Lag time conversion resulted in 0 frames. Please use a larger lag-time for TICA.')
     89
     90 self.tic = TICApyemma(lag)

RuntimeError: Lag time conversion resulted in 0 frames. Please use a larger lag-time for TICA.

stefdoerr commented 3 years ago

You need to set the fstep as I mentioned. Also, an fstep of 1200 is almost certainly wrong. Can you show me the whole code up to the TICA call?
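
A rough illustration of why the traceback reports 0 frames. The traceback shows the lag being converted from nanoseconds to frames using data.fstep; `lag_to_frames` below is a hypothetical stand-in for that conversion, assuming it divides the lag by the frame step and rounds to a whole number of frames. HTMD's actual unitconvert may round differently.

```python
def lag_to_frames(lag_ns, fstep_ns):
    """Convert a lag time in ns to a whole number of frames (assumed behaviour)."""
    return int(round(lag_ns / fstep_ns))


# Same 2 ns lag as in TICA(data, 2, units='ns'), with two different frame steps.
for fstep_ns in (1200, 0.1):
    frames = lag_to_frames(2, fstep_ns)
    print(f"fstep = {fstep_ns} ns -> lag of 2 ns = {frames} frames")
```

Under that assumption, a 2 ns lag with fstep = 1200 is a small fraction of one frame and rounds to 0, which is the condition that raises the RuntimeError. With fstep = 0.1 the same lag is 20 frames.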

sanusi340 commented 3 years ago

from htmd.ui import *
fsims = simlist(glob('filtered//'), glob('filtered/filtered.pdb'), glob('filtered//'))

metr = Metric(fsims)
metr.set(MetricSelfDistance('protein and name CA', metric='distances'))
data = metr.project()
data.fstep = 1200

tica = TICA(data, 2, units='ns')
dataTica = tica.project(2)

data.cluster(MiniBatchKMeans(n_clusters=1000))
model = Model(data)
model.plotTimescales()

stefdoerr commented 3 years ago

Yes, the issue is that 1200 is wrong. fstep needs to be the sampling rate of your simulation in nanoseconds. For example, in ACEMD3 we usually save a simulation frame every 0.1 ns. You need to check your simulation inputs and see what your sampling rate is.
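
One way to work the value out from the simulation inputs is to multiply the integration timestep by the number of steps between saved frames. The sketch below assumes those two numbers are known; `frame_step_ns` is a hypothetical helper, and the 4 fs timestep and 25000-step output period are illustrative values only, so substitute whatever your own input file specifies.

```python
def frame_step_ns(timestep_fs, steps_between_frames):
    """Time between saved frames in nanoseconds (1 ns = 1,000,000 fs)."""
    return timestep_fs * steps_between_frames / 1_000_000


# Illustrative values, not read from any real input file.
fstep = frame_step_ns(timestep_fs=4, steps_between_frames=25000)
print(f"{fstep} ns between frames")
```

For these example values the result is 0.1 ns, which is the kind of number fstep expects.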

stefdoerr commented 3 years ago

I have now added a better message for NaN fsteps in HTMD to make it less confusing.

sanusi340 commented 3 years ago

I checked my simulation inputs and the frame step is actually 0.1 ns, so I set that in the code. The warning is still there, but I can proceed with the analysis:

metr = Metric(fsims)
metr.set(MetricSelfDistance('protein and name CA', metric='distances'))
data = metr.project()
data.fstep = 0.1

stefdoerr commented 3 years ago

Great. We can close the issue then :) The warning will stay, although I have put a slightly more meaningful one for future releases.