ModellingWebLab / project_issues

An issues-only repository for issues that cut across multiple repositories

Epic 4: Develop fitting spec & implement it using FC+PINTS #60

Open jonc125 opened 5 years ago

jonc125 commented 5 years ago

(Updated) to-do list

  1. #74 Design a Python/PINTS based "fitting spec" that will work for a sine wave fit, i.e. some kind of interface through which a Python script can:

    • Specify required protocol outputs
      • [ ] could be ontology term + units,
      • [ ] but what about post-processed "columns", e.g. tau vs V
    • [ ] #63 obtain data (CSV columns) loaded by FC for the specified outputs
    • [ ] obtain a runnable "protocol" object that provides the specified outputs
    • [ ] specify required model parameters / adjustables
    • Create boundaries or priors
      • [ ] on the parameters
      • [ ] on other model variables (e.g. rates)
    • Tweak simulation properties
      • [ ] solver tolerance
      • [ ] random seed?
    • [ ] define an ErrorMeasure or LogPDF in PINTS
    • Run an optimisation or inference problem and store the results
      • [ ] Create a Controller, passing in the required method
      • [ ] Tweak the method, if required! E.g. controller.sampler().set_special_setting(123)
      • [ ] Run and somehow get results that WL can interpret again
  2. Create something to run the fits
  3. Update WL front-end for fitting

Things that need to be captured by a fitting spec are:

  1. Fitting method, priors, noise model
    • Give (distribution+) bounds on e.g. "a rate parameter" "for" "Ikr", "b rate parameter" "for" "Ikr"
    • The "for" "Ikr" bit will be done by filtering using the dependency tree, rather than annotating all parameters by which current they're for. So you annotate all "a rate parameter" variables, etc., and the Ikr variable, then use the extended dependencies for Ikr to filter all "a" variables to just the ones of interest.
    • Need to check which bqbiol predicate to use for “a rate parameter”: is? isVersionOf? hasProperty? isProperty? Something else?
    • Probably safer to have 4 terms: "forward rate a parameter", "forward rate b parameter", "backward rate a parameter", "backward rate b parameter"
  2. Mapping from prediction output to dataset column - possibly automatic if prediction outputs are annotated? (i.e. add oxmeta annotations for protocol output specifications)
  3. Boundaries / constraints
    • E.g. say all "forward rate" "for" "Ikr" should be in [k_min, k_max]
      • So need “forward rate” and “backward rate” terms
  4. Optional RNG seed?
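
The dependency-tree filtering described in item 1 can be sketched as below. The variable names, the `annotations` and `dependencies` dictionaries, and the values of `k_min` and `k_max` are all made up for illustration; in practice this information would come from the annotated model rather than hand-written dictionaries.

```python
# Hypothetical annotations: variable name -> ontology terms attached to it.
annotations = {
    "ikr.i": {"Ikr"},
    "ikr.p1": {"forward rate a parameter"},
    "ikr.p2": {"forward rate b parameter"},
    "ikr.p3": {"backward rate a parameter"},
    "ina.i": {"INa"},
    "ina.p1": {"forward rate a parameter"},
}

# Hypothetical dependency tree: variable -> variables it depends on directly.
dependencies = {
    "ikr.i": ["ikr.open", "membrane.V"],
    "ikr.open": ["ikr.k_forward", "ikr.k_backward"],
    "ikr.k_forward": ["ikr.p1", "ikr.p2", "membrane.V"],
    "ikr.k_backward": ["ikr.p3", "membrane.V"],
    "ina.i": ["ina.m", "membrane.V"],
    "ina.m": ["ina.p1", "membrane.V"],
}


def extended_dependencies(variable):
    """Return every variable that `variable` depends on, directly or not."""
    seen = set()
    stack = [variable]
    while stack:
        for dependency in dependencies.get(stack.pop(), []):
            if dependency not in seen:
                seen.add(dependency)
                stack.append(dependency)
    return seen


def variables_with_term(term):
    """Return the names of all variables annotated with `term`."""
    return {name for name, terms in annotations.items() if term in terms}


def parameters_for(current_term, parameter_term):
    """Filter the variables annotated with `parameter_term` down to those
    that the variable annotated with `current_term` depends on."""
    (current,) = variables_with_term(current_term)
    matches = variables_with_term(parameter_term) & extended_dependencies(current)
    return sorted(matches)


# All "forward rate a parameter" variables "for" "Ikr".
ikr_forward_a = parameters_for("Ikr", "forward rate a parameter")

# Apply the same bounds to every selected parameter (placeholder values).
k_min, k_max = 1e-7, 1e3
bounds = {name: (k_min, k_max) for name in ikr_forward_a}
```

With these toy dictionaries, `ina.p1` carries the same annotation as `ikr.p1` but is excluded because the Ikr variable does not depend on it.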

See also #74.

MichaelClerx commented 4 years ago

~old to-do list was here~

MichaelClerx commented 4 years ago

Tests:

jonc125 commented 4 years ago

To discuss: how much we want a spec 'language' for fitting (perhaps essentially just a config file, but one that supports comments, unlike JSON!), and how much we just allow PINTS code. Cf. the sandboxing discussion in #61.

mirams commented 4 years ago

I think I am in favour of just PINTS code (I thought that was what we concluded in the last face-to-face meeting). Please feel free to edit this list of pros and cons:

Pros:

Cons:

jonc125 commented 4 years ago

I've added some thoughts on what a fitting spec needs to capture in the issue description - please edit!

MichaelClerx commented 4 years ago

Let's have another chat about this via Skype then? I think I'm also in favour of fitting scripts. It gets complex really fast.

MichaelClerx commented 4 years ago

@jonc125 I've updated the top post above with a tentative to-do. Now wondering about the approach for running fits:

FC as a simulation engine?

FC as an everything engine

Something like this? Neither?

jonc125 commented 4 years ago

Either approach should work, depending on how much you want separate libraries. The fitting runner would need to know a fair bit about FC, so it might make more sense to combine them.

There will be some improvements to fc-runner in any case. It will use the new weblab-fc protocol parser as a library to extract the protocol interface (which now needs to include outputs as well, so this bit would overlap with a fitting use case) and send it to the Web Lab front-end when new protocols are uploaded. Currently it uses a rather hacky bit of code that partially parses the protocol! It will also then need to use cellmlmanip instead of pycml to determine model/protocol compatibility, again overlapping with the processing needed for fitting. So these common features would sit in either cellmlmanip or weblab-fc, to be used by other components.
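
As a rough illustration of the shared compatibility check, assuming the extracted protocol interface boils down to lists of ontology terms: the dictionary layout (`required`, `optional`, `outputs`), the term names, and the `missing_terms` helper are all hypothetical, and this is not the cellmlmanip or weblab-fc API.

```python
# Hypothetical protocol interface, as it might be extracted by a parser.
protocol_interface = {
    "required": ["membrane_voltage", "time"],
    "optional": ["membrane_rapid_delayed_rectifier_potassium_current_conductance"],
    "outputs": ["membrane_rapid_delayed_rectifier_potassium_current"],
}

# Hypothetical set of ontology terms that a model's variables are annotated with.
model_terms = {
    "time",
    "membrane_voltage",
    "membrane_rapid_delayed_rectifier_potassium_current",
}


def missing_terms(interface, available_terms):
    """Return the required terms that the model does not provide."""
    return sorted(set(interface["required"]) - set(available_terms))


def is_compatible(interface, available_terms):
    """A model is compatible if it provides every required term."""
    return not missing_terms(interface, available_terms)


compatible = is_compatible(protocol_interface, model_terms)
```

A fitting runner could reuse the same interface dictionary to look up the `outputs` it needs to match against dataset columns.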

MichaelClerx commented 4 years ago

Or should we define some common language? Some kind of manifest file that everyone can read that tells you what an entity is, what it needs, etc.? Would open the door to future additions!

MichaelClerx commented 4 years ago

(where by "language" I mean one or two standardised fields in an xml or json file)
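
A sketch of how small such a manifest could be, assuming JSON and two made-up fields, `type` and `requires`. The field names, the allowed type values, and the `read_manifest` helper are illustrative only and have not been agreed anywhere.

```python
import json

# Hypothetical manifest for a fitting spec entity.
manifest_text = """
{
  "type": "fitting-spec",
  "requires": ["model", "protocol", "dataset"]
}
"""

# Entity types a reader would recognise (assumed list).
KNOWN_TYPES = {"model", "protocol", "dataset", "fitting-spec"}


def read_manifest(text):
    """Return (entity type, list of required entity types) from a manifest."""
    manifest = json.loads(text)
    entity_type = manifest["type"]
    if entity_type not in KNOWN_TYPES:
        raise ValueError("Unknown entity type: " + entity_type)
    return entity_type, list(manifest.get("requires", []))


entity_type, requires = read_manifest(manifest_text)
```

Any component could read these two fields without knowing how to parse the entity itself.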

jonc125 commented 4 years ago

Well, we could, but for models & protocols this information is already defined internally to those documents (or associated RDF files for models in due course). So you'd still need some software to extract it from there to a new common format.

MichaelClerx commented 4 years ago

For models it's going to end up separately though, presumably in a COMBINE archive. So it'd make sense for every component (model, protocol, data, fitting spec) to be a COMBINE archive too?

jonc125 commented 4 years ago

They already are COMBINE archives. Doesn't change the fact that for e.g. protocols the canonical list of outputs is in the protocol file itself, and would need to be copied into another file in the archive.

MichaelClerx commented 4 years ago

Yeah. I suppose parsing just the "most interfacy" bits of the model interface section isn't hard though, so might be good to have this outside of FC. I really like the idea of having it modular (even if just in principle), so that we could theoretically support other FC implementations (e.g. other domains, other simulation types)