Dictionary of edatas instead of list?

ICB-DCM / pyPESTO

python Parameter EStimation TOolbox

https://pypesto.readthedocs.io

BSD 3-Clause "New" or "Revised" License

211 stars 47 forks source link

Dictionary of edatas instead of list? #171

Open yannikschaelte opened 5 years ago

yannikschaelte commented 5 years ago

For reverse engineering and keeping track of which data points belong to which simulation conditions, it might make sense to replace the edatas array created in https://github.com/ICB-DCM/pyPESTO/blob/master/pypesto/objective/petab_import.py#L241 by a dictionary. Would also get rid of funny indices. Any opinions?

dweindl commented 5 years ago

No strong opinion on that. What would be the key then?

yannikschaelte commented 5 years ago

the identifier of the condition (i.e. simulation + preequilibration condition)?

dweindl commented 5 years ago

Works for me. But this has to be unique then. The latter may cause problems with https://github.com/ICB-DCM/PEtab/issues/125

paulstapor commented 5 years ago

Well, after pickling this stuff together for a while, it seems to be possible to reverse engineer these things right now already... Need to double-check it... Wouldn't it be possible to add such a dict to petab.importer-object? As a very simple dataframe, which has the same indexing as the edatas?

FFroehlich commented 5 years ago

it seems to be possible to reverse engineer these things right now already... Need to double-check it...

That sounds like a simpler implementation might be easier to maintain.

yannikschaelte commented 5 years ago

definitely possible, not sure what's the best design. don't necessarily want to rely on keeping in sync several objects if one is updated.

dweindl commented 5 years ago

For reverse engineering and keeping track of which data points belong to which simulation conditions [...]

Can you shortly explain the rationale for doing that?

yannikschaelte commented 5 years ago

For one, it could come in handy when writing back the rdatas to the petab measurement file format. There, we currently repeat the steps of grouping data as in the edata generation process, to ensure we have the same structure. Not sure though if a dict would completely resolve that though.

For other uses: I think @paulstapor had some trouble mapping indices to simulation conditions for (mini-batch) loads of experimental conditions, to identify sources of error?

dweindl commented 5 years ago

Ok, for that (preequilibration_condition_id, simulation_condition_id) sounds reasonable. But see comment above, we need to think of a better way to handle https://github.com/ICB-DCM/PEtab/issues/125 then.