lab-cosmo / i-pi-dev_archive

Development version of i-PI
21 stars 12 forks source link

Removing units from `<file>` tag #172

Closed grhawk closed 6 years ago

grhawk commented 7 years ago

Following #162, the idea is to remove units from the file tag to avoid confusion. The units specification can only be done into the xyz or pdb files.

OndrejMarsalek commented 7 years ago

I would like to respectfully but strongly oppose the use of mutated versions of standard formats like PDB. With the recent (or "recent", I should have got involved earlier, sorry) changes, we are in a situation where a correct to-specification PDB file will be interpreted incorrectly by i-PI, to confusing and potentially harmful results. Units are strictly a part of the format. What is needed instead is a file in a new custom format that is derived from PDB, is not PDB, but is still called PDB. I think that is very dangerous and should be avoided at all costs.

For XYZ, the situation is more complicated because there isn't a formal specification, so both units and box information unfortunately have some freedom. But even for that, there is for example the extended XYZ format used by several programs, see for example here or here.

I know that historically this has been treated relatively freely in i-PI, but I think especially if one wants to have users who are not directly related to the development of the program, respecting standard formats is important. I would consider using existing code from elsewhere or even whole libraries (perhaps optionally) and would be very happy to help us get there.

ceriottm commented 7 years ago

I think that now (as an exception to the general rule) PDB is interpreted in Angstrom. Can't remember if we still allow one to override units using the comment line, but I am generally in favor of dropping that option for PDB. My fear though is that every time we have touched I/O or units, we generated lots of headaches. So Ondrej, if you're willing to take responsibility for a units/I/O rehaul, I'd be very happy to support that and to discuss with you the general phylosophy. I like the libatom "extended xyz" format quite a bit. However what I don't want is that yet another change to the I/O is started and left in an unusable state in a branch, or even worse merged to master before every piece of code is compatible with it.

On 17 April 2017 at 11:09, Ondrej Marsalek notifications@github.com wrote:

I would like to respectfully but strongly oppose the use of mutated versions of standard formats like PDB. With the recent (or "recent", I should have got involved earlier, sorry) changes, we are in a situation where a correct to-specification PDB file will be interpreted incorrectly by i-PI, to confusing and potentially harmful results. Units are strictly a part of the format. What is needed instead is a file in a new custom format that is derived from PDB, is not PDB, but is still called PDB. I think that is very dangerous and should be avoided at all costs.

For XYZ, the situation is more complicated because there isn't a formal specification, so both units and box information unfortunately have some freedom. But even for that, there is for example the extended XYZ format used by several programs, see for example here https://camtools.cam.ac.uk/wiki/site/5b59f819-0806-4a4d-0046-bcad6b9ac70f/Extendedxyz.html or here https://gitlab.com/ase/ase/blob/master/ase/io/extxyz.py.

I know that historically this has been treated relatively freely in i-PI, but I think especially if one wants to have users who are not directly related to the development of the program, respecting standard formats is important. I would consider using existing code from elsewhere or even whole libraries (perhaps optionally) and would be very happy to help us get there.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/cosmo-epfl/i-pi-dev/issues/172#issuecomment-294436955, or mute the thread https://github.com/notifications/unsubscribe-auth/ABESZ2lK7ktS_Q2gjENXNLKNelU13ainks5rwyxLgaJpZM4MbXvo .

OndrejMarsalek commented 7 years ago

As far as I can tell, currently, a vanilla PDB will be read so that the numbers in it are interpreted as Bohr.

I agree that I/O changes have been painful, but I think that is still preferable to a painful experience every time a user tries to bring in non i-PI files. I suggest that we go over what the outstanding issues are and address as many of the smaller non-breaking fixes as possible before doing anything major. @grhawk collect many (most?, all?) of outstanding I/O-related issues in #151.

grhawk commented 7 years ago

@OndrejMarsalek A vanilla pseudo-pdb (only compatible fields) should be treated in the right way. If it is treated assuming bohr as units, that is a bug. In that case, could you send me input and output of your example?

About the #151, I think it contains, at least, the major things that should be done. It is pending, mostly because the idea was doing a major revision of the IO and just fixing thinks in a dirty way could be a waste of time while rewriting the IO stuff is anyway really time demanding.

OndrejMarsalek commented 7 years ago

Reported the issue in #179.

Let's center all the effort to clean it up, eventually, around #151, then.

ceriottm commented 7 years ago

Hi @OndrejMarsalek I think this is connected with the final fix of the units conundrum. I think in the new branch things are quite clear(er) - you can specify units in a PDB file, and/or in the xml field, but the default - specific to PDB - is that we are reading angstrom. Also, a warning (or error) is raised if one tries to use PDB for anything but the positions. happy with this?

venkatkapil24 commented 6 years ago

I think this can be closed since we decided to keep the units in the file tag and the input file. If they are incompatible an error is raised.