MDAnalysis / mdanalysis

MDAnalysis is a Python library to analyze molecular dynamics simulations.
https://mdanalysis.org

Support for reading VASP trajectories? #3351

Open francescalb opened 3 years ago

francescalb commented 3 years ago

Is your feature request related to a problem?

I am doing molecular dynamics simulations in VASP and would like to use MDAnalysis to analyze the trajectories.

Describe the solution you'd like

A module that reads XDATCAR would be great.

Describe alternatives you've considered

I have tried loading the trajectory into ASE and writing it out as a NetCDF trajectory, but the resulting file is not compatible with the reader in MDAnalysis. Pymatgen can read XDATCAR (and vasprun.xml). Another option is to load XDATCAR into VMD, build a topology, and write the trajectory out in another format from VMD. However, I would like to keep it all in Python.

Additional context

I am happy to contribute, but I am completely new to MDAnalysis. From what I have read, it seems like a perfect tool for my needs if I can get it to work with the VASP output.

IAlibay commented 3 years ago

@francescalb If possible, would you happen to have a reference for the XDATCAR file format specification (couldn't find anything from a quick google search)?

Any contributions from someone that is familiar with the file format would be greatly appreciated here. This would greatly reduce the time to implementation.

francescalb commented 3 years ago

Hi, that is a very good point. I could not find any official descriptions of it either.

I copy the top of one here:

    unknown system
               1
        11.108875   -0.014682   -0.006484
         0.000000   11.106563    0.001404
         0.000000    0.000000   11.106293
       C    H    O
      32   64   24
    Direct configuration=    10
      0.35157094  0.22360895  0.40462503
      0.50632805  0.07070679  0.68686311
      0.56531728  0.28355504  0.67202291
      0.72437750  0.06771067  0.62193312
      0.45327393  0.66702060  0.79267676
      0.19968824  0.74235361  0.56426464
      0.35931092  0.60242174  0.47982937
      0.09008319  0.06187967  0.65942471
      0.07418150  0.71178395  0.60612875
      0.29900567  0.27726531  0.82503723
      0.11692040  0.18859112  0.71015165
      0.09923059  0.41085042  0.12426164
      0.07662911  0.18133891  0.26263686
      0.15103257  0.43905662  0.79378168
      0.18685493  0.21035733  0.06298142
      0.05716790  0.07192059  0.34710470

Basically, it says that the first 32 coordinate lines are the positions of the individual C atoms, the next 64 those of the H atoms, and so on.

The first line is just a comment line, followed by the scaling factor and the unit cell specification. This is then repeated for each (printed) step.
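To make the layout concrete, here is a minimal sketch of a parser in plain Python. It is not the pymatgen or ASE implementation, and `parse_xdatcar` is a hypothetical name. It assumes a positive scaling factor, a fixed cell (the header is read only once, and any repeated header lines are skipped), and `Direct configuration=` marker lines as in the excerpt.

```python
def parse_xdatcar(text):
    """Parse XDATCAR-style text into a header dict and a list of frames.

    Assumptions (not taken from an official specification): positive
    scaling factor, cell read once from the first header, fractional
    ("Direct") coordinates.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]

    comment = lines[0].strip()                 # line 1: free-form comment
    scale = float(lines[1])                    # line 2: scaling factor
    cell = [[scale * float(x) for x in lines[i].split()] for i in (2, 3, 4)]
    species = lines[5].split()                 # e.g. ["C", "H", "O"]
    counts = [int(n) for n in lines[6].split()]  # e.g. [32, 64, 24]
    n_atoms = sum(counts)

    # One element name per atom, in file order (all C first, then H, ...).
    names = [sp for sp, n in zip(species, counts) for _ in range(n)]

    frames = []
    i = 7
    while i < len(lines):
        if lines[i].lstrip().startswith("Direct configuration="):
            step = int(lines[i].split("=")[1])
            block = lines[i + 1 : i + 1 + n_atoms]
            if len(block) < n_atoms:
                raise ValueError(f"frame {step} is truncated")
            coords = [tuple(float(x) for x in ln.split()[:3]) for ln in block]
            frames.append((step, coords))
            i += 1 + n_atoms
        else:
            i += 1  # skip anything else, e.g. a repeated header

    header = {"comment": comment, "cell": cell, "species": species,
              "counts": counts, "names": names}
    return header, frames
```

Calling `parse_xdatcar(open("XDATCAR").read())` would return the header and a list of `(step, coordinates)` pairs. A run with a changing cell would need the repeated headers to be parsed as well, which this sketch does not do.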

This is the code that reads it in Pymatgen (I see it is more complicated than I expected): https://github.com/materialsproject/pymatgen/blob/v2022.0.8/pymatgen/io/vasp/outputs.py#L4268-L4491

This is how ASE does it (simpler; it might not cover all the cases that the pymatgen version does): https://wiki.fysik.dtu.dk/ase/dev/_modules/ase/io/vasp.html#read_vasp_xdatcar

Francesca


orbeckst commented 3 years ago

@francescalb ,

  1. Are pymatgen/ASE limited in how they can read VASP files, e.g., do they not represent them as trajectories, or are there any other capabilities that you need?
  2. Do you also use pymatgen or ASE in your work? Or are you looking for a way to avoid installing any of these other tools?

I am asking because for MDAnalysis we could also try to write a converter that would, say, take a pymatgen object as input and then read the data through the pymatgen code.
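Whichever route is taken, the fractional ("Direct") coordinates and the cell vectors would have to end up as Cartesian positions and a box in the `[a, b, c, alpha, beta, gamma]` form that MDAnalysis uses for `dimensions`. The following is only a rough sketch of that step in plain Python, not existing MDAnalysis or pymatgen code; `frac_to_cart` and `cell_to_dimensions` are hypothetical names, and the cell is assumed to be given as three row vectors in Å.

```python
import math


def frac_to_cart(frac, cell):
    """Convert one fractional position to Cartesian: r = fa*a + fb*b + fc*c.

    `cell` holds the lattice vectors a, b, c as rows.
    """
    return tuple(sum(frac[k] * cell[k][j] for k in range(3)) for j in range(3))


def cell_to_dimensions(cell):
    """Turn three lattice vectors into [a, b, c, alpha, beta, gamma].

    Lengths are in the units of `cell`; angles are in degrees, with
    alpha between b and c, beta between a and c, gamma between a and b.
    """
    a, b, c = cell

    def norm(v):
        return math.sqrt(sum(x * x for x in v))

    def angle(u, v):
        cos_uv = sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v))
        # Clamp to guard against rounding slightly outside [-1, 1].
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_uv))))

    return [norm(a), norm(b), norm(c), angle(b, c), angle(a, c), angle(a, b)]
```

For the nearly cubic cell posted above, `cell_to_dimensions` gives edge lengths of about 11.1 Å and angles close to 90 degrees.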

orbeckst commented 3 years ago

@francescalb did you get a chance to think about my questions in https://github.com/MDAnalysis/mdanalysis/issues/3351#issuecomment-857856274 ? Having answers to these questions would help us to judge what the best way forward would be.

hmacdope commented 1 year ago

This could be covered by the proposed ASE converter; see #3827.