siesta-project / aiida_siesta_plugin

Source code for the AiiDA-Siesta package (plugin and workflows). See wiki
Other
6 stars 11 forks source link

Ion files produced by aiida_siesta incompatible with SIESTA when using CDF.save true #126

Open ahkole opened 10 months ago

ahkole commented 10 months ago

I discovered a rather obscure bug. If I try to re-use the basis set from a previous calculation by using the ion files, i.e.

builder.ions = calculation.output.ion_files

SIESTA crashes if you also have turned on the writing of the NetCDF output file (CDF.save true). After doing some digging my hypothesis is that this is caused by the way that aiida_siesta is writing the ion files. As an example, see here a snippet of an ion file produced by SIESTA and the corresponding one created by aiida_siesta:

SIESTA:

</preamble>
Nb                            # Symbol
Nb                            # Label
   41                         # Atomic number
     13.0000000000                             # Valence charge
     92.9100000000        # Mass 
     210.068680882        # Self energy 

aiida_siesta:

</preamble>
 Nb 
 Nb 
                        41 
      13.0000000000 
      92.9100000000 
      210.068680882 

The differences are very subtle and primarily in the spacing. I suspect that the problematic part is the difference in spacing in the writing of the atomic label. In the aiida_siesta ion file there is a leading space which causes the system label variable internally inside SIESTA to also have a leading space (i.e. species_label = ' Nb' inside SIESTA). When trying to create a NetCDF file the code then crashes because this species label is used to create a NetCDF variable but the names of these variables cannot start with leading whitespace. The reason for the space in the species label is probably because there are also spaces around the species label in the ion.xml file (which aiida_siesta uses to parse the ion data if I understand correctly). Two possible ways of fixing this would be:

  1. Inside aiida_siesta make sure to strip() the atomic labels before writing them to the ion file.
  2. Inside SIESTA make sure to also strip leading white space from labels when reading them from an ion file by using adjustl in addition to trim (i.e. trim(adjustl(species_label))).

I don't know which method is better/preferred. What do you think?

bosonie commented 10 months ago

Thanks for reporting the problem @ahkole. Let me start by saying that it is quite possible that the file written by aiida_siesta might have some small problems. This ion part is a feature that I did not use much. Said that, I have a preliminary comment. The only reason why we need to write that file is because, at the time of the implementation, Siesta could only read the ion file in ascii format and not in xml. Note that Siesta CAN WRITE ascii and xml, but could only read asii. Is it still the case? I'm not up to date with the recent Siesta improvements. Once we understand that, we can plan the next step, I'm sure it will be an easy fix.

ahkole commented 9 months ago

Thanks for the quick reply @bosonie. Looking into the latest version of the code it does not seem that siesta can read the xml file. It can read the NetCDF version of the file (ion.nc) though if it's available (but the documentation mentions that ghost atoms are not yet treated properly if you use the ion.nc files). I don't know if there's any plans on adding this in the future but I could post a question in the siesta GitLab if that would be useful?

Have you ever considered also adding support for taking the basis set info from the NetCDF file if they are available btw? As long as there are no ghost atoms this might have some benefits since there is no loss of precision because the data is stored in binary form. It does make things a bit more complicated when trying to re-use the basis set for a future calculation because you would have to have some way of determining if the code being run has NetCDF support when deciding if it's okay to copy the ion.nc files instead of .ion files.

ahkole commented 9 months ago

It does make things a bit more complicated when trying to re-use the basis set for a future calculation because you would have to have some way of determining if the code being run has NetCDF support when deciding if it's okay to copy the ion.nc files instead of .ion files.

I inquired with the AiiDA people (see https://aiida.discourse.group/t/how-to-store-which-features-a-code-was-compiled-with/197/5) and one way of doing it would be to store in the extras attribute of a Code if it has support for netCDF. The SiestaCalculation could then look for this property and if it exists and is true default to using netCDF files. But this is only relevant if you think it would be a good idea for adding support for using netCDF files for this.