Closed · Jonas231 closed this 4 years ago
Dear Robbert, I found your package a few days ago. Great work!! How would you use MOT to do Nelder-Mead fits of a few thousand curves simultaneously (each takes about 1 s)? I am asking because that is exactly my problem and I do not have a clue where to begin with your code, which would certainly be a big help if I could apply it to my fits. Let's say I have 50000 datasets (each one a noise curve consisting of 12 values). I want to minimize the quadratic deviations between this data and a fit model using Nelder-Mead. The fit model would have no more than 6 fit parameters. I would be very grateful for your help! Kind regards, Jonas
Dear Jonas,
Thank you for the compliment, appreciated. Out of curiosity, in what (scientific) domain are you planning on using MOT?
Your use case indeed seems to be a perfect match for MOT. Although all functions and classes are documented, the manual is a bit sparse at the moment; I am working on it. For now, perhaps you can get started using the examples (https://github.com/cbclab/MOT/tree/master/examples)?
To get you started a bit, I can give a short introduction to MOT. Since MOT is meant for high performance computing, it requires you to write some Python code and some C (OpenCL dialect of C) code. Using Python, you start by creating a new "model", representing your data and computations:
from mot.model_interfaces import OptimizeModelInterface

class MyModel(OptimizeModelInterface):
    ...
This model interface has a few methods you must implement for MOT to be able to use your model (see https://github.com/cbclab/MOT/blob/master/mot/model_interfaces.py ). The most important one of these is get_objective_per_observation_function():
from mot.model_interfaces import OptimizeModelInterface
from mot.utils import NameFunctionTuple  # exact import path may differ per MOT version

class MyModel(OptimizeModelInterface):

    def get_objective_per_observation_function(self):
        func = '''
            mot_float_type my_function(mot_data_struct* data, const mot_float_type* const x,
                                       uint observation_index){
                mot_float_type estimate = ...
                mot_float_type observation = data->observations[observation_index];
                return pown(observation - estimate, 2);
            }
        '''
        return NameFunctionTuple('my_function', func)
This is where you add your model in OpenCL C. This C function receives as arguments a data structure containing additional data (which you will have to specify yourself using get_kernel_data()), the current parameter vector and the observation index. The function should return the (quadratic) residual for exactly one observation (given by the observation index). MOT will sum all these return values in the background.
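In plain NumPy terms (just an illustration, not MOT code, with a toy straight-line model standing in for yours), the value that ends up being minimized per problem instance is:

import numpy as np

def objective(x, observations, timepoints):
    # x is the current parameter vector, here of a toy straight-line model
    a, b = x
    # one model estimate per observation; this is the 'estimate' from my_function above
    estimates = a * timepoints + b
    # my_function returns pown(observation - estimate, 2) per observation index;
    # MOT sums these squared residuals for you
    return np.sum((observations - estimates) ** 2)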
When you are done implementing the model, you create an instance of it, and give it to one of the optimization routines:
from mot.cl_routines.optimizing.nmsimplex import NMSimplex  # exact import path may differ per MOT version

model = MyModel()
optimizer = NMSimplex()
starting_points = ...
opt_output = optimizer.minimize(model, starting_points)
What I would suggest you do is start with one of the examples and adapt it to your purposes. You can use the printf() function in C to print some debug information while executing the kernels.
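For example, a printf() call inside the objective function could look like this (a variation of the function above; the 'estimate' line is a placeholder for your own model evaluation):

mot_float_type my_function(mot_data_struct* data, const mot_float_type* const x,
                           uint observation_index){
    mot_float_type estimate = 0;  // placeholder: compute your model prediction here
    mot_float_type observation = data->observations[observation_index];

    // print the inputs and this observation's index while debugging
    printf("observation %u: observed=%f, estimate=%f\n",
           observation_index, observation, estimate);

    return pown(observation - estimate, 2);
}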
Do let me know how you fare,
Best,
Robbert
Hi @robbert-harms! I'm also a little bit stuck using your package with my own model, which looks like this:
def model(X, V, p):
    a, b, c, d = p
    t = a * X + b * V ** 2 + c * V + d
    return t
where X, V and t are measured time-series data and a, b, c, d are the parameters I want to estimate.
Your introduction above seems to be out of date because mot.model_interfaces is not available.
How do I pass my observations? Do I have to define my model as a class, as mentioned above?
Thanks in advance! Bjoern
Hi Bjoern,
Yes, you are right, the previous answer has been superseded by a newer API.
In the newer version of MOT you no longer need the model interface and all those extra model objects: objective functions are now passed in as plain OpenCL functions, and the data is loaded using special data containers.
Still, you will have to code your objective function in OpenCL for it to work. I took the liberty of writing some example code based on your model function:
import numpy as np
from mot.optimize import minimize
from mot.lib.cl_function import SimpleCLFunction
from mot.lib.kernel_data import Array, Struct

# How many unique data instances we have
nmr_problems = 10000

# length of the timeseries (per unique data instance)
nmr_datapoints = 16

# create the objective function
objective_function = SimpleCLFunction.from_string('''
    double my_model(local const mot_float_type* const x,
                    void* data,
                    local mot_float_type* objective_list){

        mot_float_type a, b, c, d;
        a = x[0];
        b = x[1];
        c = x[2];
        d = x[3];

        global float* X = ((model_data*)data)->X;
        global float* V = ((model_data*)data)->V;

        double sum = 0;
        double eval;
        for(uint i = 0; i < ''' + str(nmr_datapoints) + '''; i++){
            eval = a * X[i] + b * pown(V[i], 2) + c * V[i] + d;
            sum += eval * eval;

            if(objective_list){
                objective_list[i] = eval;
            }
        }
        return sum;
    }
''')

# The optimization starting points
x0 = np.ones((nmr_problems, 4))

# Generate some random timeseries
X = np.random.rand(nmr_problems, nmr_datapoints)
V = np.random.rand(nmr_problems, nmr_datapoints)

# Prepare the data for use on the GPUs
kernel_data = Struct({'X': Array(X, 'float'),
                      'V': Array(V, 'float')}, 'model_data')

# Minimize the parameters of the model given the starting points.
opt_output = minimize(objective_function, x0, data=kernel_data)

# Print the output
print(opt_output['x'])
This probably does not completely do what you want to do, but it serves the purpose of showing you what goes where.
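For instance, to turn it into an actual fit against your measured t (an untested sketch, assuming t has the same shape as X and V), you could pass t as an extra Array and compute the residual inside the loop:

import numpy as np
from mot.optimize import minimize
from mot.lib.cl_function import SimpleCLFunction
from mot.lib.kernel_data import Array, Struct

nmr_problems = 10000
nmr_datapoints = 16

# Objective: sum of squared differences between the measured t and the model prediction.
objective_function = SimpleCLFunction.from_string('''
    double my_model(local const mot_float_type* const x,
                    void* data,
                    local mot_float_type* objective_list){

        mot_float_type a = x[0];
        mot_float_type b = x[1];
        mot_float_type c = x[2];
        mot_float_type d = x[3];

        global float* X = ((model_data*)data)->X;
        global float* V = ((model_data*)data)->V;
        global float* t = ((model_data*)data)->t;

        double sum = 0;
        double residual;
        for(uint i = 0; i < ''' + str(nmr_datapoints) + '''; i++){
            // residual = observation - model prediction
            residual = t[i] - (a * X[i] + b * pown(V[i], 2) + c * V[i] + d);
            sum += residual * residual;

            if(objective_list){
                objective_list[i] = residual;
            }
        }
        return sum;
    }
''')

# Replace these random arrays with your measured timeseries.
X = np.random.rand(nmr_problems, nmr_datapoints)
V = np.random.rand(nmr_problems, nmr_datapoints)
t = np.random.rand(nmr_problems, nmr_datapoints)

kernel_data = Struct({'X': Array(X, 'float'),
                      'V': Array(V, 'float'),
                      't': Array(t, 'float')}, 'model_data')

# Optimize, starting every problem instance at all ones.
x0 = np.ones((nmr_problems, 4))
opt_output = minimize(objective_function, x0, data=kernel_data)
print(opt_output['x'])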
There are several things you need to be aware of when using MOT:
Let me know if this is useful to you or not, or if you need more information.
Best wishes,
Robbert
Hi Robbert, thank you very much for your extensive answer. This is much more than I expected and it will help me a lot. Unfortunately, I have some issues installing and running MOT on an NVIDIA GPU. It seems that MOT needs OPENCL_2.1, which is not supported on NVIDIA cards :(
Hi Bjoern,
Can you give me a copy of the error message?
MOT only requires OpenCL 1.2, it does not need OpenCL 2.x. It also works for me on NVIDIA systems, although it does need the latest drivers.
It is possible that the error you got is from PyOpenCL, the intermediary package I use to communicate with the OpenCL drivers. Some installations of the PyOpenCL package require OpenCL 2.1, which the NVIDIA drivers do not yet provide.
What I would suggest is to try installing the very latest NVIDIA drivers. Let me know if that works.
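If you want to check what your installation reports, this small PyOpenCL snippet (plain PyOpenCL, independent of MOT) lists the OpenCL version of every platform and device:

import pyopencl as cl

# Print every OpenCL platform and device together with the version string it reports.
for platform in cl.get_platforms():
    print(platform.name, '-', platform.version)
    for device in platform.get_devices():
        print('   ', device.name, '-', device.version)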
Best,
Robbert
Yes, you are right, the error comes from PyOpenCL:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3/dist-packages/mot/__init__.py", line 3, in <module>
    from .optimize import minimize, get_minimizer_options
  File "/usr/lib/python3/dist-packages/mot/optimize/__init__.py", line 1, in <module>
    from mot.lib.cl_function import SimpleCLFunction
  File "/usr/lib/python3/dist-packages/mot/lib/cl_function.py", line 7, in <module>
    import pyopencl as cl
  File "/usr/lib/python3/dist-packages/pyopencl/__init__.py", line 37, in <module>
    import pyopencl.cffi_cl as _cl
  File "/usr/lib/python3/dist-packages/pyopencl/cffi_cl.py", line 39, in <module>
    from pyopencl._cffi import ffi as _ffi
ImportError: /usr/local/cuda/lib64/libOpenCL.so.1: version `OPENCL_2.1' not found (required by /usr/lib/python3/dist-packages/pyopencl/_cffi.abi3.so)
Good morning Robbert,
I know this is months later, but I've tried the example you shared with Bjoern and it mostly works. I have an issue where the first half of opt_output['x'] is all 1s, while the second half appears to be working (actually optimizing the parameters). For example (shrinking to nmr_problems=10 for readability):
[[ 1.0000000e+00  1.0000000e+00  1.0000000e+00  1.0000000e+00]
 [ 1.0000000e+00  1.0000000e+00  1.0000000e+00  1.0000000e+00]
 [ 1.0000000e+00  1.0000000e+00  1.0000000e+00  1.0000000e+00]
 [ 1.0000000e+00  1.0000000e+00  1.0000000e+00  1.0000000e+00]
 [ 1.0000000e+00  1.0000000e+00  1.0000000e+00  1.0000000e+00]
 [-1.8272636e-03  1.8661604e-03 -3.5489511e-03  1.4117775e-03]
 [ 9.1041613e-04  1.6904876e-03 -2.8636113e-03 -2.6547226e-05]
 [-1.2122344e-03 -1.6787346e-03  1.5248144e-03 -2.7464307e-04]
 [-9.7660790e-04  1.1239090e-03  3.1461622e-04 -2.2447099e-04]
 [ 6.2167324e-04 -5.6695510e-03  5.7750382e-03 -1.0727227e-03]]
I'm currently trying this on a laptop with Intel 620 UHD Graphics (and an Nvidia MX130, but I don't think it's being used in calculations). The same half/half behavior happens with another custom objective function I've written. Thank you for any help, and I apologize if this is the wrong place to post this.
Edit: This behavior doesn't happen on another PC with a single GTX 1080 (and no integrated graphics), so it's likely the dual-GPU setup on the laptop that is causing the issue.
Hi mps01060,
I see you edited your post. I was just about to reply that on some laptops the Intel drivers do not work satisfactorily. My suggestion would be to always specify exactly which devices MOT should use.
To get a list of devices available, use:
from mot import smart_device_selection
devices = smart_device_selection()
To use a specific device for the computation, construct a CLRuntimeInfo object and pass it to the minimization routine:
from mot.configuration import CLRuntimeInfo
minimize(..., cl_runtime_info=CLRuntimeInfo([devices[0]]))
Combined:
from mot import smart_device_selection
from mot.configuration import CLRuntimeInfo
devices = smart_device_selection()
minimize(..., cl_runtime_info=CLRuntimeInfo([devices[0]]))
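To see which index corresponds to which device, you can simply print the returned list; the string representation of each entry should identify the device:

from mot import smart_device_selection

# Print every device MOT found, together with its index in the list.
for ind, device in enumerate(smart_device_selection()):
    print(ind, device)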
Best wishes,
Robbert
That worked perfectly!
Thank you, Mike
Closing this topic, as the original question is over a year old. Please open a new topic or question if more information or help is needed.