nomad-coe / nomad

NOMAD lets you manage and share your materials science data in a way that makes it truly useful to you, your group, and the community.
https://nomad-lab.eu
Apache License 2.0

output .cif files and XRD calculation #28

Closed Pepe-Marquez closed 11 months ago

Pepe-Marquez commented 2 years ago

It would be great to be able to output the atomic structure information as .cif files. This is the crystallography standard and allows the structures to be used in many other applications. A direct download option in the UI would be very nice to have.

https://www.iucr.org/resources/cif/spec/version1.1/cifsyntax#charset

pymatgen has some converters written for the MP structures, just in case it helps: https://pymatgen.org/pymatgen.io.cif.html
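To make the request concrete, here is a minimal sketch of what a CIF export involves, using only the Python standard library. The function name `to_cif` and its input layout are assumptions made for illustration; they are not part of NOMAD or pymatgen. The sketch writes every atom explicitly in space group P 1 and ignores symmetry, occupancies, and the quoting rules for unusual names, all of which a real converter such as pymatgen's would handle.

```python
def to_cif(name, cell, sites):
    """Serialize a structure as a minimal P 1 CIF string.

    name  -- data block name (hypothetical input; must not contain whitespace)
    cell  -- (a, b, c, alpha, beta, gamma) in angstrom and degrees
    sites -- list of (element, x, y, z) with fractional coordinates
    """
    a, b, c, alpha, beta, gamma = cell
    lines = [
        f"data_{name}",
        # No symmetry reduction: all atoms are listed in the trivial space group.
        "_symmetry_space_group_name_H-M   'P 1'",
        "_symmetry_Int_Tables_number      1",
        f"_cell_length_a    {a:.6f}",
        f"_cell_length_b    {b:.6f}",
        f"_cell_length_c    {c:.6f}",
        f"_cell_angle_alpha {alpha:.6f}",
        f"_cell_angle_beta  {beta:.6f}",
        f"_cell_angle_gamma {gamma:.6f}",
        "loop_",
        "_atom_site_label",
        "_atom_site_type_symbol",
        "_atom_site_fract_x",
        "_atom_site_fract_y",
        "_atom_site_fract_z",
    ]
    # One row per atom; labels are the element symbol plus a running index.
    for index, (element, x, y, z) in enumerate(sites, start=1):
        lines.append(f"{element}{index} {element} {x:.6f} {y:.6f} {z:.6f}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy two-atom cubic cell, only to show the call.
    print(to_cif("example", (4.0, 4.0, 4.0, 90, 90, 90),
                 [("Na", 0, 0, 0), ("Cl", 0.5, 0.5, 0.5)]))
```
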

markus1978 commented 2 years ago

Thanks for your suggestion. We will consider it, as soon as we have the resources.

markus1978 commented 2 years ago

@lauri-codes Can you briefly comment on this? Did we have this in the Encyclopedia already? How would you implement this: server-side vs. client-side, on-the-fly vs. during processing?

@Pepe-Marquez Were there other ideas on the table? I remember something about "diffraction" data generated from crystals that can be calculated "easily".

lauri-codes commented 2 years ago

Hi @Pepe-Marquez, @markus1978,

I think this is a very good idea and would have plenty of use cases. I don't remember us having a structure file download option in the Encyclopedia.

If we were talking only about CIF, an on-the-fly client-side implementation would be a solution. But I can imagine that we will very quickly need to support several other formats (.xyz, .pdb, POSCAR, etc.), and, as Markus mentioned, the same principle could also be applied to properties other than atomic structures (diffraction, descriptors, etc.). In that case, a client-side implementation is no longer a realistic possibility.

There are very good Python libraries (e.g. pymatgen, as Pepe mentioned) for calculating these properties, so it would be natural to do this server-side. Since this is data that can be fairly big, is not too expensive to calculate, and is relatively infrequently accessed, to me the natural solution would be to calculate it on-the-fly behind a dedicated API endpoint. I could imagine GUI elements on our overview page and in the ArchiveBrowser with a dropdown menu for downloading the files. E.g. in the material card, we would have a menu item for a CIF file.

Implementation-wise this should be relatively easy. There are just a few design questions that would need to be settled first: how should the API routes be structured, and how do we consistently decide which properties are stored in the Archive and which are calculated on-the-fly?
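The server-side, on-the-fly idea could be sketched roughly as follows. Everything here is hypothetical: the `EXPORTERS` registry, the `exporter` decorator, the `export_structure` function, and the structure dictionary layout are illustrative names, not existing NOMAD API. The point is only that each format is a small function registered under its name, so one route can serve any supported format on request and new formats do not require new routes.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry: format name -> function turning a structure dict into text.
EXPORTERS: Dict[str, Callable[[dict], str]] = {}


def exporter(fmt: str):
    """Decorator that registers a serializer under the given format name."""
    def register(func: Callable[[dict], str]) -> Callable[[dict], str]:
        EXPORTERS[fmt] = func
        return func
    return register


@exporter("xyz")
def to_xyz(structure: dict) -> str:
    """Write the XYZ format: atom count, comment line, then one atom per line."""
    species = structure["species"]
    positions = structure["positions"]  # Cartesian coordinates in angstrom
    lines = [str(len(species)), structure.get("name", "")]
    for element, (x, y, z) in zip(species, positions):
        lines.append(f"{element} {x:.6f} {y:.6f} {z:.6f}")
    return "\n".join(lines) + "\n"


def export_structure(structure: dict, fmt: str) -> Tuple[str, str]:
    """Return (filename, file body), computed at request time and never stored.

    This is what the handler of an assumed route with a `format` query
    parameter could call; the route itself is not part of this sketch.
    """
    if fmt not in EXPORTERS:
        raise ValueError(f"unsupported format {fmt!r}; choose from {sorted(EXPORTERS)}")
    filename = f"{structure.get('name', 'structure')}.{fmt}"
    return filename, EXPORTERS[fmt](structure)
```

A dropdown entry in the GUI would then only need to pass the chosen format name to the endpoint, and an unknown format maps naturally onto a client error response.
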

Pepe-Marquez commented 2 years ago

XRD (X-ray diffraction) patterns can be calculated from the structure. The typical way to do this seems to be the Computational Crystallography Toolbox (cctbx): https://cctbx.github.io/.

An example of how to do this, taking a .cif file as input, can be found here: https://github.com/maffettone/xca/blob/main/xca/data_synthesis/cctbx.py

pymatgen has also implemented something similar that works from its own structures: https://pymatgen.org/pymatgen.analysis.diffraction.html
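As a rough illustration of why peak positions follow "easily" from the structure, the sketch below applies Bragg's law, λ = 2d sin θ, to a primitive cubic cell, where d = a / sqrt(h² + k² + l²). It is not how cctbx or pymatgen work internally: those also compute intensities from structure factors and handle arbitrary cells and systematic absences. The function name `cubic_peak_positions` is an assumption, and the default wavelength is the Cu Kα1 line (1.5406 Å).

```python
import math


def cubic_peak_positions(a, wavelength=1.5406, max_index=3):
    """Return (two_theta_degrees, d_spacing) pairs for a primitive cubic cell.

    a          -- lattice parameter in angstrom
    wavelength -- X-ray wavelength in angstrom (default: Cu K-alpha1)
    max_index  -- largest Miller index considered

    Only peak positions are computed; intensities are not.
    """
    peaks = {}
    # Restricting to h >= k >= l >= 0 skips symmetry-equivalent reflections.
    for h in range(max_index + 1):
        for k in range(h + 1):
            for l in range(k + 1):
                n = h * h + k * k + l * l
                if n == 0:
                    continue  # (000) is not a reflection
                d = a / math.sqrt(n)
                sin_theta = wavelength / (2 * d)  # Bragg's law
                if sin_theta <= 1:  # otherwise unreachable at this wavelength
                    # Keyed by n so reflections sharing a d-spacing appear once.
                    peaks[n] = (2 * math.degrees(math.asin(sin_theta)), d)
    # Increasing n means decreasing d, hence increasing 2-theta.
    return [peaks[n] for n in sorted(peaks)]


if __name__ == "__main__":
    for two_theta, d in cubic_peak_positions(4.0):
        print(f"2theta = {two_theta:6.2f} deg   d = {d:.4f} A")
```
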

I think it would be very cool to envision AI applications that do automatic phase assessment of experimental data using the NOMAD theoretical database for new compounds. https://www.nature.com/articles/s43588-021-00059-2

@lauri-codes, I think support for .cif should be more than enough, since there are plenty of converters available in other applications. For example, the typical output format for the ICSD and MP is .cif.

lauri-codes commented 11 months ago

Structure file download is now possible.