XSF file from MD lacks ANIMSTEPS

ajjackson commented 4 years ago

Hi, I'm excited to see this public release of CONQUEST and try it out!

I tried building the code and working through the basic tutorials from the ReadTheDocs pages. After the methane MD example (https://conquest.readthedocs.io/en/latest/examples.html#simple-molecular-dynamics) I tried to open the output XSF file with ASE and found that it would only open the first step of the trajectory.

This is because the ASE XSF reader looks for an ANIMSTEPS entry on the first line of the file, as specified in the XCrySDen documentation (http://www.xcrysden.org/doc/XSF.html#__toc__6). When I run the methane example I get a trajectory.xsf file with a blank first line.

Is this intended behaviour for CONQUEST?

Paraquat commented 4 years ago

I added the .xsf output to visualise the trajectory on VMD, which apparently doesn't require the ANIMSTEPS keyword. The problem is that trajectory steps are appended to the .xsf file as the simulation runs, so the ANIMSTEP line must be added at the end (assuming the simulation terminated gracefully). I'm not if it's possible to write prepend a line to a file in Fortran without copying the file, but I don't see why we couldn't add that --- unless somebody knows a better way of doing it. Any idea @davidbowler @tsuyoshi38 ?

This may be the kind of thing that's easier to do with a python file handler.

davidbowler commented 4 years ago

We could add the number of expected MD steps from the input, which would seem the easiest thing, or add something like ANIMSTEPS XXX so that a single search-and-replace after the run would be trivial - I think that either of those would perhaps be best.

ajjackson commented 4 years ago

The former approach doesn't work for trajectories from geometry optimisation, which (on a good day!) will terminate in fewer steps than the maximum.

davidbowler commented 4 years ago

I looked into whether F2003 stream access would allow us to do this, but sadly not. I can see three ways forward:

(Simplest) Add the ANIMSTEPS XXX line and ask users to replace
Keep the trajectory record in memory and write once at the end (I don't like this - it's vulnerable to crashes)
Do option 1 but at the end of a run, on ionode only, read the trajectory file line-by-line and write out to a new file which starts with the appropriate ANIMSTEPS value

Any opinions?

davidbowler commented 4 years ago

@tsuyoshi38 @lionelalexandre Any ideas or comments here?

Paraquat commented 4 years ago

My preference would be 3. The only downside is that the .xsf file can be rather large (we're talking several GB), but otherwise we're technically breaking the .xsf specification, which is something we definitely shouldn't do.

davidbowler commented 4 years ago

If the .xsf file is as large as 3GB then I think that using Conquest to do a job that can be done with sed or a stand-alone utility makes very little sense (particularly if we are using a large amount of an HPC facility). It would make more sense to me to have a script or stand-alone utility to replace the XXX with the appropriate number post hoc.

tsuyoshi38 commented 4 years ago

I agree with Dave. It sounds much better to have a stand-alone utility which modifies the XSF file after the CONQUEST job. (using a front-end machine or our own PC.)

lionelalexandre commented 4 years ago

Dear All,

well, I think post-proc on top of the Conquest MD/geom. opt. is the best solution, assuming that some dump files still exists (names CoordForce.dat for some time maybe this has changed...). Concerning post-proc, there is already Zamaan's python tool for this (I think ; maybe this conversion to xsf from Conquest MD output files can implemented into). There exists also a post-proc Fortran code which can do the job 'cq_dynpro' but this need some work to includes xsf conversion.

Lionel

Le mar. 11 févr. 2020 à 14:53, David Bowler notifications@github.com a écrit :

I looked into whether F2003 stream access would allow us to do this, but sadly not. I can see three ways forward:

(Simplest) Add the ANIMSTEPS XXX line and ask users to replace

Keep the trajectory record in memory and write once at the end (I don't like this - it's vulnerable to crashes)

Do option 1 but at the end of a run, on ionode only, read the trajectory file line-by-line and write out to a new file which starts with the appropriate ANIMSTEPS value

Any opinions?

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/OrderN/CONQUEST-release/issues/39?email_source=notifications&email_token=ADIMB77OLMSZY6RQCYIBOATRCKUWBA5CNFSM4KROVEZ2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOELMPOVI#issuecomment-584644437, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADIMB72VXDS2UAAHECWW67TRCKUWBANCNFSM4KROVEZQ .

--

Dr. Lionel A. Truflandier

The PCCP Master is opening in Bordeaux! http://masterpccpbordeaux.wix.com/pccp

Theoretical Chemistry & Modeling Group Institut des Sciences Moléculaires Université Bordeaux 351 Cours de la Libération 33405 Talence, France

+33(0)5 4000 2794 l.truflandier@ism.u-bordeaux1.fr lionel.truflandier@gmail.com http://www.lionelat.com

ajjackson commented 4 years ago

It seems worth raising option 4 (for discussion, at least): instead of XSF, use a trajectory format that supports appending, such as extended XYZ. If post-processing is to be used to reach a compliant XSF file, it might as well operate on something that is 'correct' for another standard - provide a conversion tool from something else to XSF.

Option 5 (a bit far-fetched): lobby for this value to be made optional in the XSF specification. It isn't technically necessary and makes the format less useful. Unfortunately this requires a number of projects (not VMD, but at minimum XCrysDen and ASE, probably more) to update their parsers. The format still gets fragmented between old and new.

The problem with a post-processing step is that incomplete runs will miss it (or VMD users will skip it) and then the ecosystem will be filled with non-compliant XSF files. Parsers like ASE's will still need to add special logic for XSF files from Conquest, even if that logic just leads to a user warning to run the tool.

I'm not familiar with the details of Conquest's development and implementation, so perhaps there is already too much inertia in this format. But from an outside perspective, with an interest in the wider atomistic ecosystem, option 4 is the best. I appreciate that it is also the most demanding in terms of developer time.

Paraquat commented 4 years ago

There is no inertia in .xsf, it's something I unilaterally implemented because

you can define the lattice vectors
you can store a trajectory
it works with VMD

For some reason I can't access the VMD website right now, so I can't confirm whether this extended xyz format can be loaded, but here is what I propose: get rid of the .xsf output from Conquest, and I will write a script that will convert Conquest's internal format to .xyz in a postprocessing step. I would also like to implement a more universal format that accepts appending, and will look into this exyz format.

ajjackson commented 4 years ago

It is frustrating that XYZ is itself fragmented, as you can't tell from the extension whether you are looking at XYZ or extended XYZ. Tinker also has a variation on XYZ. Still, a few projects are using a semi-standardised ext-xyz: this page may be useful:https://libatoms.github.io/QUIP/io.html#extendedxyz I think jmol also handles it quite well.

It looks like VMD's XYZ support is a bit wobbly as it doesn't tolerate the comment lines used to add extended data. XSF is still probably the best bet there! https://www.ks.uiuc.edu/Research/vmd/plugins/molfile/xyzplugin.html

It does seem that we are lacking a good community standard format that is convenient, general and well-defined. As these goals conflict, there is some sense in focusing on "convenient" while writing output and then converting to "well-defined".

davidbowler commented 1 year ago

Extended XYZ format was implemented (following ASE) in #102 and subsequent (small) corrections. It can be set with AtomMove.WriteExtXYZ T

OrderN / CONQUEST-release

XSF file from MD lacks ANIMSTEPS #39