Test changes for loading xlsx NECC

nomad-hzb / nomad-chemical-energy

Apache License 2.0

Test changes for loading xlsx NECC #10

Open RoteKekse opened 1 month ago

RoteKekse commented 1 month ago

I cut the loading time by a factor of 3. Can you have a look: https://github.com/nomad-hzb/nomad-chemical-energy/blob/eddff07ac525be5a4535f27e80f99fec397e6d7e/ce_necc/schema.py#L106

I also moved the parsing into the plugin; I will do this with the others too.

RoteKekse commented 1 month ago

We should definitely test more thoroughly how long parsing takes, including when only saving some notes, e.g. in the description.

RoteKekse commented 1 month ago

Maybe have a checkbox that is checked after parsing?
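
A minimal sketch of what such a flag could look like, assuming the entry re-runs its normalization on every save. The `Measurement` class, its `parsed` field and the `_parse` placeholder are hypothetical and are not taken from the plugin's schema; the point is only that a saved flag lets a later save (for example after editing the description) skip the expensive parsing step.

```python
from dataclasses import dataclass, field


@dataclass
class Measurement:
    """Hypothetical stand-in for a schema entry, not the plugin's real class."""

    data_file: str
    description: str = ""
    parsed: bool = False  # the "checkbox": set once the file has been parsed
    rows: list = field(default_factory=list)
    parse_calls: int = 0  # bookkeeping for this demo only

    def _parse(self) -> None:
        # Placeholder for the expensive xlsx parsing step (assumption).
        self.parse_calls += 1
        self.rows = [1, 2, 3]

    def normalize(self) -> None:
        # Runs on every save; parse only if the flag is not set yet.
        if not self.parsed:
            self._parse()
            self.parsed = True


entry = Measurement(data_file="example.xlsx")
entry.normalize()                  # first save: the file is parsed
entry.description = "some notes"   # edit only the description
entry.normalize()                  # second save: parsing is skipped
```

Unchecking the flag by hand would then force a re-parse, which could be useful when the data file itself changes.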

carla-terboven commented 1 month ago

I cut the loading time by a factor of 3. Can you have a look:

https://github.com/nomad-hzb/nomad-chemical-energy/blob/eddff07ac525be5a4535f27e80f99fec397e6d7e/ce_necc/schema.py#L106

I also moved the parsing into the plugin; I will do this with the others too.

It is nice that it works, but I don't understand why. Don't we need different header row numbers for e.g. potentiometry and thermocouple, since their header lines are defined in different rows of the Excel file?

RoteKekse commented 1 month ago

https://github.com/nomad-hzb/nomad-chemical-energy/blob/7ee8dba16082311cb93eed845714dcb7bf53f6d4/ce_necc/schema.py#L138

There I reset the header, and then in the call I don't pass the full data frame.

So the 3rd row is the 2nd row in the parsed file, and then data.iloc[2:] selects the data.
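
One way to read this, as a minimal sketch rather than the actual code in `schema.py`: the sheet is read once without a header, and each table is then cut out by passing only the rows from its own header line onward and promoting that line to column names. The sheet layout (a names row followed by a units row), the `extract_table` helper and the `header_row` parameter are assumptions made for illustration.

```python
import pandas as pd

# Stand-in for a sheet that was read once with header=None (assumed layout):
# two rows of metadata, then column names, then units, then the data.
raw = pd.DataFrame(
    [
        ["Experiment", None],
        ["comment", None],
        ["time", "voltage"],
        ["s", "V"],
        [0, 1.5],
        [1, 1.6],
    ]
)


def extract_table(raw: pd.DataFrame, header_row: int) -> pd.DataFrame:
    """Cut one table out of the raw sheet (hypothetical helper)."""
    # Pass on only the rows from the header line onward and renumber them,
    # so the header line becomes row 0 of the partial frame.
    block = raw.iloc[header_row:].reset_index(drop=True)
    # Reset the header: row 0 now provides the column names.
    block.columns = block.iloc[0]
    # Row 0 holds the names and row 1 the units, so the data starts at row 2.
    return block.iloc[2:].reset_index(drop=True)


table = extract_table(raw, header_row=2)
print(table)
```

Because `header_row` is passed per table, tables whose header lines sit in different rows can share the same raw frame, so the file only has to be read once.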