ModellingWebLab / project_issues

An issues-only repository for issues that cut across multiple repositories

Fitting spec design #74

Open MichaelClerx opened 4 years ago

MichaelClerx commented 4 years ago

Latest thinking at https://docs.google.com/document/d/1B1TLdeOMD44hwb7Cb4d09GOfHRQmElZJS5e3s6N2ccM


Hi all,

Below is my idea of what a fitting spec (script) might look like.

For context, an example of fitting with PINTS is found here, with lots more examples here.

PINTS interacts with models through the ForwardModel class, and with model+data pairs through the SingleOutputProblem and MultiOutputProblem classes.

Base class

The user implements this class:

class FittingSpec(object):
    """
    Abstract base class for all fitting specifications.
    """

    @staticmethod
    def output():
        """
        Returns the variable to fit to, as an oxmeta annotation.

        For example ``membrane_rapid_delayed_rectifier_potassium_current``.
        """
        raise NotImplementedError

    @staticmethod
    def parameters():
        """
        Returns a list where each entry specifies a type of parameter for the
        :meth:`output()` variable, as an annotation.

        For example::

            return [
                'conductance',
                'forward_a',
                'forward_b',
                'backward_a',
                'backward_b',
            ]
        """
        raise NotImplementedError

    @staticmethod
    def fit(problem, parameter_counts):
        """
        Performs a fit on the given ``problem`` and returns a vector of
        obtained parameters.

        The problem to optimise is given by ``problem``, which is either a
        ``pints.SingleOutputProblem`` or a ``pints.MultiOutputProblem``.
        It contains both the model to fit and the data to compare to.

        A list of ``parameter_counts`` is provided that lists the number of
        parameters found in the CellML model of each type given in
        :meth:`parameters()`.
        The list has the same ordering as the list returned by 
        :meth:`parameters()`, so that its i-th entry provides the number of
        parameters of type ``parameters()[i]``.

        Returns a list (or any other sequence type) of obtained parameter
        values.
        """
        raise NotImplementedError
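
To make the interface concrete, here is a minimal sketch of what a user-written spec could look like. The class name `IKrFittingSpec`, the starting point `x0`, and the choice of a sum-of-squares error with the XNES optimiser are all assumptions for illustration; nothing in the proposal fixes them. The base class is repeated in shortened form only so that the sketch runs on its own.

```python
class FittingSpec(object):
    """Stand-in for the base class above, repeated so this sketch runs alone."""

    @staticmethod
    def output():
        raise NotImplementedError

    @staticmethod
    def parameters():
        raise NotImplementedError

    @staticmethod
    def fit(problem, parameter_counts):
        raise NotImplementedError


class IKrFittingSpec(FittingSpec):
    """Hypothetical spec that fits IKr using a sum-of-squares error."""

    @staticmethod
    def output():
        return 'membrane_rapid_delayed_rectifier_potassium_current'

    @staticmethod
    def parameters():
        return [
            'conductance',
            'forward_a',
            'forward_b',
            'backward_a',
            'backward_b',
        ]

    @staticmethod
    def fit(problem, parameter_counts):
        # Imported here so output() and parameters() work without PINTS.
        import pints

        # Total number of parameters across all requested types.
        n_parameters = sum(parameter_counts)

        # Arbitrary starting point (an assumption, not part of the proposal).
        x0 = [0.1] * n_parameters

        error = pints.SumOfSquaresError(problem)
        found, _ = pints.optimise(error, x0, method=pints.XNES)
        return list(found)
```

Because `output()` and `parameters()` are static and do not touch PINTS, the front-end could query them without constructing a problem.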

Weblab implementation

The weblab backend then implements some bit of Python code that loads FC (Functional Curation), scans a CellML model for parameters matching the annotations, wraps a call to FC in a pints.ForwardModel, constructs a Problem, and calls fit(). All of this is then run in some kind of sandboxed environment.
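
As an illustration of the "scan for matching parameters" step, the sketch below builds the `parameter_counts` list that `fit()` receives. It assumes the annotated model has already been reduced to a plain dict from variable name to parameter-type annotation; the dict contents and the helper name `count_parameters` are hypothetical, and reading real CellML annotations is not shown.

```python
from collections import Counter


def count_parameters(model_annotations, parameter_types):
    """
    Count the model parameters of each requested type.

    The result has the same ordering as ``parameter_types``, so its i-th
    entry is the number of parameters annotated as ``parameter_types[i]``.
    """
    counts = Counter(model_annotations.values())
    # Counter returns 0 for types that do not occur in the model.
    return [counts[ptype] for ptype in parameter_types]


# Hypothetical annotated model: variable name -> parameter-type annotation.
model_annotations = {
    'ikr_g': 'conductance',
    'ikr_p1': 'forward_a',
    'ikr_p2': 'forward_b',
    'ikr_p3': 'backward_a',
    'ikr_p4': 'backward_b',
    'ikr_p5': 'forward_a',
    'ikr_p6': 'forward_b',
    'cell_cm': 'capacitance',  # not requested by the spec, so ignored
}

parameter_types = [
    'conductance',
    'forward_a',
    'forward_b',
    'backward_a',
    'backward_b',
]

parameter_counts = count_parameters(model_annotations, parameter_types)
```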

The weblab frontend has some stand-alone bit of Python code that basically does this:

To analyse a spec it runs the above code in a subprocess and reads (and deletes) the output file. (Doing this in a subprocess stops syntax errors in fitting specs from breaking the front-end.)
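
A minimal sketch of how such a subprocess-based analysis step might be written, using only the standard library. It assumes that each spec file defines a class called `Spec` (a hypothetical convention) and that the interface is passed back as JSON; the helper name `analyse_spec` and the 30 second timeout are also assumptions.

```python
import json
import os
import subprocess
import sys
import tempfile

# Script run in the child process: import the spec file, query its
# interface, and write the result to the output file as JSON.
# Assumption: the spec file defines a class named ``Spec``.
ANALYSE = """
import importlib.util, json, sys
spec_path, out_path = sys.argv[1], sys.argv[2]
module_spec = importlib.util.spec_from_file_location('fitting_spec', spec_path)
module = importlib.util.module_from_spec(module_spec)
module_spec.loader.exec_module(module)
info = {
    'output': module.Spec.output(),
    'parameters': module.Spec.parameters(),
}
with open(out_path, 'w') as f:
    json.dump(info, f)
"""


def analyse_spec(spec_path):
    """Return the spec's interface as a dict, or None if the spec fails."""
    fd, out_path = tempfile.mkstemp(suffix='.json')
    os.close(fd)
    try:
        result = subprocess.run(
            [sys.executable, '-c', ANALYSE, spec_path, out_path],
            capture_output=True, text=True, timeout=30)
        if result.returncode != 0:
            # Syntax errors and exceptions in the spec end up here; they
            # never propagate into the calling (front-end) process.
            return None
        with open(out_path) as f:
            return json.load(f)
    finally:
        # Read-and-delete: the output file never outlives the call.
        os.remove(out_path)
```

A spec with a syntax error makes the child exit with a non-zero status, so `analyse_spec` returns `None` instead of raising in the front-end.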

mirams commented 4 years ago

Looks a good start. I was thinking that the "what to fit to" probably wants to be anything that is in the "outputs" section of a protocol. This could be an outputted metadata-tagged current through time (which will work for Use Case 1), but more generically could be something that doesn't have a metadata tag in the CellML, like a post-processed summary IV curve, or normalised current, or something like that?

MichaelClerx commented 4 years ago

Hadn't thought of that! But yes that's a good point

MichaelClerx commented 4 years ago

Although the parameters can't be inputs because then you'd need to list them all explicitly again?

jonc125 commented 4 years ago

I like the idea that the front-end can call methods to figure out the ontology-level interface for the spec. That's neat.

I'd assume 'output' would be the time series (possibly more than one?) that you're going to compare between data & prediction. The dataset columns will be annotated, and we'll also need to annotate protocol output specs, so just giving one (or more) ontology term would allow matching these up - and the script can decide what metric it wants to use to compare series.

For inputs, we probably want it a bit more structured, so you can restrict to specific current(s) etc? So they wouldn't (necessarily) be single ontology terms, but might be composites to specify 'forward_a' for 'IKr' - which could just be represented as a pair?
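
One way to picture the "pair" idea: each requested input is a (variable, parameter type) tuple, and the backend keeps only model parameters annotated with a matching pair. This is a sketch under assumptions; the model parameter names, the `membrane_fast_sodium_current` entry, and the helper `select_parameters` are made up for illustration.

```python
# Each requested input is a (variable, parameter type) pair of ontology terms.
IKR = 'membrane_rapid_delayed_rectifier_potassium_current'
requested = [
    (IKR, 'conductance'),
    (IKR, 'forward_a'),
]

# Hypothetical annotated model: parameter name -> (variable, parameter type).
model_parameters = {
    'ikr_g': (IKR, 'conductance'),
    'ikr_p1': (IKR, 'forward_a'),
    'ikr_p3': (IKR, 'backward_a'),
    'ina_g': ('membrane_fast_sodium_current', 'conductance'),
}


def select_parameters(model_parameters, requested):
    """Return the names of model parameters matching a requested pair."""
    wanted = set(requested)
    return sorted(
        name for name, pair in model_parameters.items() if pair in wanted)


selected = select_parameters(model_parameters, requested)
```

With pairs, asking for 'conductance' on IKr no longer picks up the sodium conductance, which a single ontology term could not express.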

mirams commented 4 years ago

Since the fitting spec is linked to one particular protocol, we probably don't need to bother with metadata on the outputs, but can just ask for the variables of interest by name (e.g. "APD" and "DI"). A manifest lists them and their units etc. too, so you just need to be able to get hold of the dataset in the relevant units to fit against, e.g. https://scrambler.cs.ox.ac.uk/experiments/2404/versions/2519/outputs-contents.csv/displayContent

If you want that bit of linking to dataset to be automatic, I guess something else needs doing. But we already link a dataset to a protocol. So maybe we should be linking a dataset to a protocol output when we upload the data? How do we decide what bit of protocol to plot data against at the moment?

jonc125 commented 4 years ago

Aidan's prototype basically did use the output names, with a map in the fitting spec between protocol output names and dataset column names. I wanted to extend this for WL2 so it uses the same mechanism as for model-protocol linking, i.e. ontology annotations on both.
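
For reference, a name-based map of this kind could be as simple as the sketch below. It is not Aidan's actual code: the column names, the example values, and the helper `pair_series` are assumptions, with "APD" and "DI" borrowed from the comment above.

```python
# Map in the fitting spec: protocol output name -> dataset column name.
# The column names are hypothetical.
output_to_column = {
    'APD': 'apd_ms',
    'DI': 'di_ms',
}


def pair_series(protocol_outputs, dataset, output_to_column):
    """
    Pair each simulated protocol output with its dataset column.

    Returns a dict: output name -> (simulated values, measured values).
    """
    pairs = {}
    for output_name, column in output_to_column.items():
        if output_name not in protocol_outputs:
            raise KeyError('Protocol has no output named ' + output_name)
        if column not in dataset:
            raise KeyError('Dataset has no column named ' + column)
        pairs[output_name] = (protocol_outputs[output_name], dataset[column])
    return pairs


# Made-up numbers, purely to show the shape of the data.
protocol_outputs = {'APD': [300.0, 280.0], 'DI': [700.0, 220.0]}
dataset = {'apd_ms': [305.0, 276.0], 'di_ms': [695.0, 224.0]}

pairs = pair_series(protocol_outputs, dataset, output_to_column)
```

The script is then free to choose whatever metric it wants to compare each pair of series.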

mirams commented 4 years ago

My hunch is that it's easier to align the dataset with a protocol output when you upload, as these aren't always going to be "things" with names in the same way as bits of model usually are. For instance you'll have people having to make up metadata terms on the fly for "voltage under 2Hz pacing with Ko=4mM, Nao=100mM" and suchlike.

jonc125 commented 4 years ago

That's an idea actually. So the 'annotation' step for a dataset would actually be a lot simpler than for models (and use entirely different code): just select which of the associated protocol's outputs corresponds to each column, and the units. And then whole-dataset annotations for species etc?

MichaelClerx commented 4 years ago

Initially we'll ignore sandboxing / security, and only give ourselves permission to add fitting specs.

jonc125 commented 4 years ago

See also https://docs.google.com/document/d/1B1TLdeOMD44hwb7Cb4d09GOfHRQmElZJS5e3s6N2ccM/edit#