octue / octue-sdk-python

The Python SDK for @Octue services and digital twins.
https://octue.com

Speed up manifest, dataset, and datafile instantiation/validation #561

Open cortadocodes opened 1 year ago

cortadocodes commented 1 year ago

If metadata is required for datafiles, datasets, and manifests, instantiating datasets with many files can take quite a while. This is leading users to change their data schemas to avoid file validation via the twine.

Here are the results, provided by @time-trader, of profiling the instantiation of 5000 datafiles: [image]
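For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of how the instantiation could be profiled with the standard library's `cProfile`. It does not use the SDK's real classes: `StandInDatafile`, its `_validate` method, and `profile_instantiation` are hypothetical names made up for illustration, and the validation here is a trivial placeholder, so the timings say nothing about the actual bottleneck.

```python
import cProfile
import io
import pstats


class StandInDatafile:
    """Hypothetical stand-in for a datafile that validates its metadata on instantiation."""

    def __init__(self, path, tags=None):
        self.path = path
        self.tags = tags or {}
        self._validate()

    def _validate(self):
        # Placeholder check; the real metadata validation is assumed to be slower.
        if not isinstance(self.path, str) or not self.path:
            raise ValueError("path must be a non-empty string")


def profile_instantiation(n_files=5000, top=10):
    """Profile creating `n_files` stand-in datafiles; return them and the report text."""
    profiler = cProfile.Profile()
    profiler.enable()
    datafiles = [StandInDatafile(path=f"data/file_{i}.csv") for i in range(n_files)]
    profiler.disable()

    # Write the `top` entries, sorted by cumulative time, into a string.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return datafiles, stream.getvalue()


if __name__ == "__main__":
    _, report = profile_instantiation()
    print(report)
```

Swapping the stand-in for the real datafile class should show how much of the cumulative time is spent in validation compared with the rest of instantiation.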

I have several ideas for speeding this up:

cortadocodes commented 1 year ago

This profiling graph was also provided: [image]