Closed FFroehlich closed 5 years ago
One side-goal could also be to avoid repeated forward integration in case of multiple replicates (same dynamical parameters) by passing multiple ones at the same time to AMICI. This would accelerate the adjoint approach. Since we usually have few repetitions, the impact of this would however be limited.
Regarding the issue of repeated forward integration, this could probably be implemented by simply storing the final x and sx [if available] for every condition vector. If we store this in a hashtable (hashed from the ExpData object), we could dynamically check whether the respective x and sx have already been computed and/or whether presimulation has to be performed. The question is where this hashtable should be stored. Although it does not 100% belong there, the Model class is probably the best place, as it persists across multiple calls to runAmiciSimulation (this would work before we actually implement vectorization of ExpData).
If we cache that in Model, it would have to be invalidated on any change to the Solver. That looks quite difficult (and undesirable) to me. I would only reuse results within a single call to runAmiciSimulation.
I agree, let's not put this in Model but create it in amici.cpp.
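To make the caching idea concrete, here is a minimal stand-alone sketch. SimulationCache, CachedState, and the key scheme are hypothetical names for illustration, not actual AMICI API: final states are stored per condition in a hashtable (here keyed by a string built from the fixed parameters, standing in for a hash of the ExpData object) and invalidated wholesale when solver settings change.

```cpp
#include <string>
#include <sstream>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sketch, not AMICI code: cache the final state x (and sx,
// if computed) per simulation condition.
struct CachedState {
    std::vector<double> x;   // final state
    std::vector<double> sx;  // final state sensitivities (may be empty)
};

class SimulationCache {
  public:
    // Build a key from a condition's fixed parameters
    // (stand-in for hashing an ExpData object).
    static std::string key(const std::vector<double>& fixedParameters) {
        std::ostringstream oss;
        for (double p : fixedParameters)
            oss << p << ';';
        return oss.str();
    }

    bool has(const std::string& k) const { return cache_.count(k) > 0; }

    const CachedState& get(const std::string& k) const { return cache_.at(k); }

    void put(const std::string& k, CachedState s) { cache_[k] = std::move(s); }

    // Invalidate everything, e.g. when Solver settings change.
    void clear() { cache_.clear(); }

  private:
    std::unordered_map<std::string, CachedState> cache_;
};
```

With the scoping suggested above, such a cache would live only for the duration of one runAmiciSimulation call, which sidesteps the Solver-invalidation problem entirely.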
Are there already any ideas/efforts for allowing replicate measurements? (Same timepoint, same observable, different value) I think this could be quite helpful.
> One side-goal could also be to avoid repeated forward integration in case of multiple replicates (same dynamical parameters) by passing multiple ones at the same time to AMICI.
@yannikschaelte You are suggesting to handle that via arrays of ExpData?
Actually, replicate measurements, i.e., multiple datapoints for the same timepoint, are already allowed and should work just fine :)
:flushed:
But the loop is done in Python, right?
I meant C++, without resimulation. Is this already possible?
Because with the loop in C++, one could also speed things up by doing just one forward simulation with adjoints.
I think so, simply have the same timepoint in edata.ts twice. For the second occurrence, nextTimepoint > model->t0() will be false and handleDataPoint(it) will be called without an additional simulation.
I also just figured this out a couple of days ago when specifying an according edata by accident 😅
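The control flow described above can be illustrated with a small stand-alone toy (not actual AMICI code, just the loop logic): the solver only advances when the next timepoint lies strictly ahead of the current time, so a duplicated entry in edata.ts triggers a second data-handling call without re-integration.

```cpp
#include <vector>

// Toy illustration of the timepoint loop: count integration steps vs.
// data-handling calls for a (sorted, possibly non-unique) timepoint vector.
struct LoopStats {
    int integrations = 0;
    int dataHandled = 0;
};

LoopStats processTimepoints(const std::vector<double>& ts, double t0) {
    LoopStats stats;
    double t = t0;
    for (double nextTimepoint : ts) {
        if (nextTimepoint > t) {   // only integrate to strictly later timepoints
            ++stats.integrations;  // stand-in for the solver run
            t = nextTimepoint;
        }
        ++stats.dataHandled;       // stand-in for handleDataPoint(it)
    }
    return stats;
}
```

For example, ts = {1.0, 1.0, 2.0} yields two integration steps but three data-handling calls, which is exactly the replicate behavior discussed here.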
> You are suggesting to handle that via arrays of ExpData?

That would be easiest.
I did not check correctness of adjoint sensitivities yet, but even if it does not work yet, it should be easy to fix.
Ah, since ExpData::timepoints overrides Model::timepoints. Okay. Good to know :) :+1:
I think having everything in a single ExpData instance is way easier than weaving together multiple ReturnData results from an array of ExpData (although we might have to do that at some point anyways).
Maybe worth adding a note to ExpData that timepoints have to be sorted but don't have to be unique.
> I think having everything in a single ExpData instance is way easier than weaving together multiple ReturnData results from an array of ExpData.
Totally agreed. Question would have been if we want something like we apparently already have, or if we want to add a replicate-dimension. The current solution is sparser.
Formally, this would afaik involve adding an additional replicate dimension to each (return-)data array, i.e. (nt, nr, ny). You suggest just flattening this (data and simulations alike) to (nt*nr, ny)?
Yes, where nr can currently even be timepoint-specific.
Even more flexible :) Sounds good. So it looks like on the input side one just has to flatten/re-arrange the data array and, as @FFroehlich pointed out, make sure the sensitivities are correct.
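To make the flattening concrete, here is a small sketch of the index arithmetic under the simplifying assumption of a uniform replicate count nr (as noted above, nr may actually vary per timepoint, which this sketch does not cover):

```cpp
#include <cstddef>

// Sketch of the flattening discussed above (row-major layout is an
// assumption): a dense (nt, nr, ny) data array is stored as (nt*nr, ny),
// i.e. each replicate becomes its own "timepoint row" in ExpData.
std::size_t flatIndex(std::size_t it, std::size_t ir, std::size_t iy,
                      std::size_t nr, std::size_t ny) {
    // Row index in the flattened (nt*nr, ny) matrix: replicates of the
    // same timepoint occupy adjacent rows, which keeps the timepoint
    // vector sorted (with duplicates) as required.
    std::size_t row = it * nr + ir;
    return row * ny + iy;
}
```

For the timepoint-specific nr mentioned above, the fixed stride it*nr would be replaced by a cumulative offset per timepoint; the flattened layout itself stays the same.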
To implement hierarchical optimization, this format then demands some nasty index shuffling, but it should not be too difficult with the approach from canpathpro?
Very good point. That won't work. Firstly, so far we assume the same offset/scaling/sigma parameters for all timepoints (which would be straightforward to extend). Secondly, we assume all conditions have the same number of timepoints (so the sparsity would be lost here). Need to think about how best to manage that.
I think for specifying the data, the most convenient would be a dense (nt, ny, nr) matrix, but for large models it would be good if it were stored in a sparser manner.
We want to extend the ExpData class