FAIRmat-NFDI / nomad-ML-generate-smiles-reader

This is a reader to create a NOMAD archive JSON file from the Generic JSON Writer in the 3DS BIOVIA Pipeline Pilot software.
Apache License 2.0

How to define a custom type in the schema? #4

Open tmusho opened 3 months ago

tmusho commented 3 months ago

@JosePizarro3 can you help me figure out how to define a custom type in the schema? The data looks like this: 0 0.000000 0.000000 0.000000 1 2 3.069030 403.971311 0.000000 3

Here is my attempt. For now I have fallen back to just reading the data in as a string. https://github.com/FAIRmat-NFDI/nomad-ML-generate-smiles-reader/blob/ad6205a127a918a44983a40feea85706e24635a5/src/nomad_ML_smiles_reader/schema.py#L224

JosePizarro3 commented 3 months ago

Sure, I will check 🙂 I am just leaving a comment for you, let me know if something is not clear.

At first sight, I would break each column into its own Quantity. In the end, Quantities are just the fields you would have in a JSON file (or the datasets, if you were using HDF5 as another example). Furthermore, you could group all the column data by defining a section that holds the columns as Quantities and attaching it as a SubSection; something like:

import numpy as np
from nomad.datamodel.data import ArchiveSection
from nomad.datamodel.metainfo.basesections import Entity  # assumed import path for Entity
from nomad.metainfo import Quantity, SubSection


class UVVisData(ArchiveSection):
    """
    A base section used to define the UV-Vis data quantities.
    """

    excited_state_number = Quantity(
        type=int,
        description="""
        The excited state number as an integer...
        """,
    )

    frequency = Quantity(
        type=np.float64,
        unit='THz',  # check whether these units are correct :-)
        description="""
        The frequency of the excited state...
        """,
    )

    # Other columns as quantities defined here.


class ModelData(Entity):
    """
    A base section used to specify the system solver information used for simulations.
    """
    ...

    uv_vis_data = SubSection(sub_section=UVVisData.m_def, repeats=False)

Note the repeats argument of the SubSection: if this is a list of UVVisData sections, repeats should be changed to True.
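
To make the "one field per column" idea concrete without depending on NOMAD, here is a minimal plain-Python sketch of splitting a whitespace-separated row into typed fields. It is only an illustration under stated assumptions: the four-column layout, the class UVVisRow, the helpers parse_row and parse_rows, and every field name other than excited_state_number are hypothetical, since the thread does not say what the columns mean. The sample row is a fragment of the data quoted in the question.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UVVisRow:
    # One typed field per column. Apart from excited_state_number, the
    # names below are guesses made for illustration only.
    excited_state_number: int
    energy: float
    wavelength: float
    oscillator_strength: float


def parse_row(line: str) -> UVVisRow:
    """Split one whitespace-separated row into typed fields."""
    tokens = line.split()
    if len(tokens) != 4:  # assumed layout: exactly four columns per row
        raise ValueError(f"expected 4 columns, got {len(tokens)}: {line!r}")
    return UVVisRow(
        excited_state_number=int(tokens[0]),
        energy=float(tokens[1]),
        wavelength=float(tokens[2]),
        oscillator_strength=float(tokens[3]),
    )


def parse_rows(text: str) -> List[UVVisRow]:
    """Parse every non-empty line; a list of rows is the repeats=True case."""
    return [parse_row(line) for line in text.splitlines() if line.strip()]


rows = parse_rows("2 3.069030 403.971311 0.000000\n")
print(rows[0].excited_state_number, rows[0].wavelength)
```

Each parsed field would then map onto one Quantity of the section, and a list of rows corresponds to a SubSection declared with repeats=True.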