ICB-DCM / pyPESTO

python Parameter EStimation TOolbox
https://pypesto.readthedocs.io
BSD 3-Clause "New" or "Revised" License
217 stars 47 forks source link

Added a lazy result object. #1421

Closed PaulJonasJost closed 2 months ago

PaulJonasJost commented 3 months ago

Added a result object, that reads the results only if needed. Connected to #1405

Things to consider still

codecov-commenter commented 3 months ago

:warning: Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 73.20261% with 41 lines in your changes missing coverage. Please review.

Project coverage is 83.24%. Comparing base (e746456) to head (6f76f46).

Files Patch % Lines
pypesto/result/optimize.py 71.72% 41 Missing :warning:

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## develop #1421 +/- ## =========================================== - Coverage 83.36% 83.24% -0.13% =========================================== Files 161 161 Lines 13247 13396 +149 =========================================== + Hits 11043 11151 +108 - Misses 2204 2245 +41 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

dweindl commented 3 months ago
* [ ]  Do we want a casing limit, if yes how would that look like?

Not sure what you mean.

* [ ]  Should the lazy result actually be a subclass of the optimizer result?

I'd say so.

Something to be discussed: Should changes to the result just affect the OptimizerResult instance or should they be written to file?

* [ ]  Connected with the above, when printing the lazy result, all attributes are shown as `None`

Why aren't they loaded on demand then?

PaulJonasJost commented 3 months ago

Not sure what you mean.

Sorry meant caching. If we load very big hessians, we might want to throw them out later again? 🤔

Something to be discussed: Should changes to the result just affect the OptimizerResult instance or should they be written to file?

That's an interesting thought. I would say per default no, as it would not be a result object created by optimization but changed thereafter. But a function would probably be good to propagate changes to the file?

Why aren't they loaded on demand then?

Not entirely sure, it might be because in the OptimizerResult they are attributes while here they are properties? And so since LazyResult is a subclass, it will get initiated with those attributes as well 🤔 When calling optimizer_result.id it gives back the appropriate value, just not in the __repr__/print

dilpath commented 3 months ago

I have been using hdfdict [1] with pydantic objects, no issues so far in my basic use cases (including numpy arrays), which has lazy loading on by default. Hopefully the lazy loading persists when a pydantic data object is reconstructed from HDF5 via hdfdict, but I haven't tested this yet.

So one alternative to this PR would be to refactor our results to be pydantic classes. I guess this might anyway occur when PEtab Result is supported in pyPESTO.

[1] https://github.com/SiggiGue/hdfdict/blob/master/hdfdict/hdfdict.py

PaulJonasJost commented 2 months ago

@dilpath as we talked about, I think it makes sense to merge it in and when petabResult is out, we might anyways make it obsolete? In that case i think this as an intermediate step should be fine.

dilpath commented 2 months ago

Yes, fine for me :+1: