LayTec EpiTT plugin - GitHub issues

budschi commented 10 months ago

At the moment the LayTec EpiTT schema is developed for the MOVPE IKZ Ga2O3 Experiment. However, it can be used for any exported LayTec EpiTT data in *.dat format.

Next steps:

Goal: Harmonizing the MOVPE IKZ Ga2O3 Experiment schema and the epiTT schema, which would enable us to:

[x] (Automatic) referencing from epiTT measurement to the MOVPE process --> the ids in the epiTT *.dat have to match the process ID in NOMAD --> see #64
[x] (Automatic) referencing from epiTT measurement to the MOVPE sample --> that is a bit more complicated since the sample id is not in the *.dat file but should be retrievable via the process entry --> wafer number, modify MOVPE experiment accordingly --> see #64
[ ] develop a LayTec EpiTT Experiment schema to handle multiple in-situ measurements of a MOVPE process:
[x] Make the epiTT schema a parser --> creates entries just by dropping the *.dat files in NOMAD
[x] separate out file reader from schema.py (like in XRD plugin)
[ ] create an epiTT instrument entry + schema
[x] fix plots when plotly is fully in NOMAD
[ ] sample: later also reference to specific position on sample(wafer), e.g. center, edge --> from runtype settings
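The two referencing items above (measurement to process by matching IDs, then measurement to sample via the process entry) could be sketched roughly as follows. This is only an illustration with plain Python objects: `ProcessEntry`, `resolve_references`, and the in-memory `processes` lookup are hypothetical stand-ins, not NOMAD classes or the plugin's actual implementation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcessEntry:
    """Hypothetical stand-in for a MOVPE process entry."""

    process_id: str
    sample_id: str


def resolve_references(
    measurement_process_id: str,
    processes: dict[str, ProcessEntry],
) -> tuple[Optional[ProcessEntry], Optional[str]]:
    """Return (process, sample_id) for the ID found in the *.dat file.

    The sample ID is not in the *.dat file, so it is taken from the
    matched process entry. Returns (None, None) if no process matches.
    """
    process = processes.get(measurement_process_id.strip())
    if process is None:
        return None, None
    return process, process.sample_id
```

The lookup only works if the ID written in the *.dat file is identical to the process ID, which is the constraint mentioned in the first item.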

@aalbino2 @ThomasTSC

aalbino2 commented 7 months ago

we have some room for improvement:

[x] update plotting with plotly library
[x] separate Temperature and Reflectance plots
[x] separate each reflectance curve plot, make also a plot showing all together (or allow for interactive switching of curves in the plot if possible) check https://github.com/FAIRmat-NFDI/AreaA-data_modeling_and_schemas/blob/main/IKZ_plugin/src/ds_IKZ/schema.py https://github.com/FAIRmat-NFDI/AreaA-data_modeling_and_schemas/blob/main/IKZ_plugin/src/movpe_IKZ/schema.py
[x] from the raw file, parse the metadata field ##WAFER_LABEL and include it into filename and entry name
[x] harmonize classes names removing underscores and making them properly pascal case
[ ] add the post processing to clean from the noise (pick @ThomasTSC code)
[ ] add the post processing to simulate the curve from a phenomenological optical model (to be done as soon as @ThomasTSC will refine his code)
[x] modify the parser regarding file matching. At the moment it tries to parse all *.dat files into the EpiTT schema
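As a rough sketch of the ##WAFER_LABEL and file-matching points above: one option is to read the metadata lines of a *.dat file and only treat the file as EpiTT data if an expected key is present. The exact header layout is an assumption here (lines of the form `##KEY=value`); only the `##WAFER_LABEL` field name comes from this thread, and the function names are hypothetical.

```python
from typing import Dict, Iterable


def read_header(lines: Iterable[str]) -> Dict[str, str]:
    """Collect metadata from lines assumed to look like '##KEY=value'."""
    header: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if not line.startswith("##"):
            continue  # not a metadata line
        key, sep, value = line[2:].partition("=")
        if sep:  # ignore '##' lines without a key/value separator
            header[key.strip()] = value.strip()
    return header


def looks_like_epitt(lines: Iterable[str], required_key: str = "WAFER_LABEL") -> bool:
    """Content-based match: accept the file only if the expected key exists."""
    return required_key in read_header(lines)


def entry_name(header: Dict[str, str], fallback: str = "unknown") -> str:
    """Build an entry name that includes the wafer label."""
    return f"{header.get('WAFER_LABEL', fallback)}_epitt"
```

Matching on file content rather than on the *.dat extension alone is what keeps unrelated *.dat files out of the EpiTT schema.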

@budschi we can start with these points, together with yours:

[x] link the sample in the measurement
[x] please add here further points

aalbino2 commented 6 months ago

Hi @hampusnasstrom, while fixing something I came across basesections.py, line 1333:

        archive.workflow2.outputs = [
            Link(name=result.name, section=result) for result in self.results
        ]

I got an AttributeError because my class was an ArchiveSection instead of a MeasurementResult and was hence missing the name quantity.

I propose to change it as follows, i.e., performing a very general check of the class before doing anything with it.

If you have an MR open in GitLab, would you agree to add it? Otherwise I can open a new MR.

        for result in self.results:
            if isinstance(result, MeasurementResult):
                archive.workflow2.outputs.append(Link(name=result.name, section=result))
            else:
                logger.error(f"{result} is not a MeasurementResult inheriting class.")
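To make the failure mode and the proposed guard concrete, here is a self-contained sketch. `Link`, `ArchiveSection`, and `MeasurementResult` are minimal stand-ins defined only for this example, not the real NOMAD classes, and `build_outputs` is a hypothetical helper. It builds a fresh list instead of appending to an existing one, so running it twice would not duplicate links.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


class ArchiveSection:
    """Stand-in base class without a `name` attribute."""


class MeasurementResult(ArchiveSection):
    """Stand-in result class that does carry a `name`."""

    def __init__(self, name: str) -> None:
        self.name = name


@dataclass
class Link:
    name: str
    section: object


def build_outputs(results):
    """Create one Link per valid result; log and skip anything else."""
    outputs = []
    for result in results:
        if isinstance(result, MeasurementResult):
            outputs.append(Link(name=result.name, section=result))
        else:
            logger.error("%r is not a MeasurementResult subclass instance.", result)
    return outputs
```

Without the `isinstance` check, accessing `result.name` on a plain `ArchiveSection` raises the AttributeError described above.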

hampusnasstrom commented 6 months ago

Hrmm, shouldn't result always be MeasurementResult here? How did you manage to change it into something else?

    results = SubSection(
        section_def=MeasurementResult,
        description='''
        The result of the measurement.
        ''',
        repeats=True,
    )

aalbino2 commented 6 months ago

In this case, the code was written before the class existed. I also propose this as a way to get better error messages when developing new code.

hampusnasstrom commented 6 months ago

I see. The problem is then that we would have to check all quantities and subsections to make sure they are of the specified type. I'm not sure if that is really good Python practice, or what do you think?

aalbino2 commented 6 months ago

I don't know what the very best practice is. I just thought we want to be sure in this case that the class is the required one, because we are going to fetch more quantities from it for further processing, which may not raise errors when they are missing, or attach more normalizers to that base section, and we want to be sure they are run.

hampusnasstrom commented 6 months ago

Sure, I'm just saying that this is only one out of very many times in basesections.py that we assume that a self.something is of the type we define. If we perform the check this time we should also add this check to all those other instances which might not be scalable.

aalbino2 commented 6 months ago

As we go further with inheritance, which we progressively do, it becomes tedious to walk the whole inheritance tree to find where a missing attribute originates, especially when importing from NOMAD or from other plugins. So it is convenient during development to raise an error (or warning) that says what is wrong. There are not that many normalizers, so one could either:

go and see where such class check is needed
just leave the others like that and only fix it when some error occurs

I think a class check is not harmful, is it?

aalbino2 commented 6 months ago

@TLCFEM what do you think? Should we leave back the class check as a general strategy of coding or not?

TLCFEM commented 6 months ago

@TLCFEM what do you think? Should we leave back the class check as a general strategy of coding or not?

It can be done during assignment. I am not sure if it has been done or not (likely not).

At your end, I do not think you need to implement this. It shall be handled in the general, area-independent metainfo.py.

TLCFEM commented 6 months ago

If I remember correctly, we had a related discussion before. However, I cannot recall whether the absence of this check is a feature or a bug. It could be intended to provide extra flexibility, as in certain cases one may want to append a different section to the list.

So it may be risky to force such a check at the global level as it may break things. But I guess it is possible to introduce a flag in the definition such that

my_subsection = SubSection(section=target_section.m_def, validate_section=True)

When populating data, the check can be performed if validate_section is turned on. This way, at least in your area, you only need to add a flag to all related definitions; there is no need to manually check all usages, which is neither scalable nor visually appealing.
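The opt-in idea can be illustrated outside of the metainfo with a small list subclass that checks the type on assignment only when a flag is set. This is a sketch of the concept under stated assumptions: `ValidatedList` and the stand-in `MeasurementResult` are hypothetical, `validate_section` is the proposed flag name from this comment and not an existing SubSection argument, and only `append` is guarded here (`extend`, `insert`, and item assignment are left unchecked).

```python
class MeasurementResult:
    """Stand-in for the expected section class."""


class ValidatedList(list):
    """List that optionally enforces the type of appended items."""

    def __init__(self, section_cls: type, validate_section: bool = False) -> None:
        super().__init__()
        self.section_cls = section_cls
        self.validate_section = validate_section

    def append(self, item) -> None:
        # The check runs at assignment time, and only when opted in.
        if self.validate_section and not isinstance(item, self.section_cls):
            raise TypeError(
                f"expected {self.section_cls.__name__}, got {type(item).__name__}"
            )
        super().append(item)
```

With the flag off, the list keeps the current flexible behaviour; with it on, a wrong section is rejected where it is added rather than later inside a normalizer.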

aalbino2 commented 6 months ago

Thanks for your comments!

aalbino2 commented 6 months ago

I summarize the points left and I will open a new issue for them:

[ ] develop a LayTec EpiTT Experiment schema to handle multiple in-situ measurements of a MOVPE process
[ ] sample: later also reference to specific position on sample(wafer), e.g. center, edge --> from runtype settings
[ ] add the post processing to clean from the noise (pick @ThomasTSC code)
[ ] add the post processing to simulate the curve from a phenomenological optical model (to be done as soon as @ThomasTSC will refine his code)

FAIRmat-NFDI / AreaA-data_modeling_and_schemas

LayTec EpiTT plugin #65