Call PyCorrFit from other python programs

tritemio commented 8 years ago

PyCorrFit has a nice GUI to perform fitting of autocorrelation curves that is user-friendly and powerful, great job!

In my case, I would like to call PyCorrFit from an existing Python session. In particular, I work within Jupyter Notebook and use FRETBursts to analyze smFRET data. There are several variants of smFRET measurements: with and without laser alternation, single- and multi-spot, etc. Therefore, it would be convenient to compute and fit the ACF on a precomputed timestamps array. For example, I may want to use all photons or only photons from one alternation period. Or, in a multi-spot measurement, photons from all channels or only a sub-set of channels.
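A minimal sketch of this kind of selection, assuming the timestamps and the detector channel of each photon are already in memory as numpy arrays. The array names, the values, and the 20-tick alternation period are hypothetical and chosen only for illustration:

```python
import numpy as np

# Hypothetical data: photon arrival times in clock ticks and the
# detector channel of each photon (illustrative values only).
timestamps = np.array([3, 12, 27, 41, 58, 66, 79, 95], dtype=np.int64)
detectors = np.array([0, 1, 0, 2, 1, 0, 2, 1], dtype=np.uint8)

# Keep only photons from a sub-set of channels.
channel_mask = np.isin(detectors, [0, 1])
selected_by_channel = timestamps[channel_mask]

# Keep only photons from the first half of each alternation period
# (assumed period: 20 ticks, first 10 ticks = one excitation window).
alternation_period = 20
in_first_window = (timestamps % alternation_period) < alternation_period // 2
selected_by_period = timestamps[in_first_window]
```

Either selected array could then be passed on for correlation and fitting.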

All this would be possible if PyCorrFit could accept a numpy array of timestamps as input. Then in a notebook we could run something like:

%gui qt
pcf.PyCorrFit(timestamps)

and have the PyCorrFit GUI launched in a new window with the passed timestamps.

Launching Qt applications from the notebook is not a problem (you can only launch one application at a time, but this is OK). For example, FRETBursts has a Qt-based timetrace-explorer GUI that allows scrolling through long binned timetraces.

I think adding this ability would greatly expand the usefulness of PyCorrFit, and it should be quite a simple task.

What do you think?

tritemio commented 8 years ago

Oops, I realize PyCorrFit uses wxPython, not Qt. However, the above comments are still valid.

tritemio commented 8 years ago

I see that PyCorrFit requires Python 2.7. In a new conda environment with Python 2.7, I had to pin numpy to version 1.8 to make wxpython happy (see https://github.com/ContinuumIO/anaconda-issues/issues/565).

From within a notebook I can launch the PyCorrFit GUI by running:

%gui wx
import pycorrfit
pycorrfit.Main()

Now it's only a matter of modifying Main() or writing a wrapper function to allow passing a numpy array that is already in memory.

paulmueller commented 8 years ago

Yes, this should not be difficult to implement. Currently, the Main method

https://github.com/paulmueller/PyCorrFit/blob/master/pycorrfit/main.py#L80

only searches for command line parameters (e.g. pycorrfit /path/to/session.pcfs works). It is straightforward to add keyword arguments and then do the following:

newpage = frame.add_fitting_tab(modelid=6000)
# Please do not use keyword arguments here, but rely on the order of parameters.
# The keywords might be subject to change in future versions.
frame.ImportData(newpage, correlation_data, trace_data)

If you would like to work on that, please create your own branch from the develop branch and then create a pull request back to this branch.

One could also think about importing the Correlation, Fit, and Trace classes directly into the pycorrfit module. Then a scenario could look like this:

import pycorrfit
correlation_data, intensity_data = my_data_acquisition()

tr1 = pycorrfit.Trace(intensity_data[0])
tr2 = pycorrfit.Trace(intensity_data[1])
corr = pycorrfit.Correlation(traces=[tr1, tr2],
                             correlation=correlation_data,
                             corr_type="CC",
                             fit_model=600)
# perform fitting without GUI
pycorrfit.Fit(corr)
# or start the GUI with
pycorrfit.Main(corr)

tritemio commented 8 years ago

@paulmueller, thanks, let's see where this leads. Before starting, a question. My input data for auto- or cross-correlation is an array of timestamps (photon arrival times, resolution 10-20 ns). Your Trace data structure contains binned data instead. Do you have a function to compute the correlation directly on timestamps, without binning? In my case measurements are long, and binning would be unnecessarily expensive.
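One way to correlate timestamps without binning is to count the photon pairs whose delay falls into each lag bin. The sketch below is a minimal illustration under stated assumptions, not PyCorrFit code: the function name `pair_count_correlation`, its parameters, and the simple normalization (which ignores edge effects at lags comparable to the measurement duration) are all hypothetical.

```python
import numpy as np


def pair_count_correlation(t, u, lag_edges, duration):
    """Correlate two sorted timestamp arrays without binning them.

    t, u      : sorted arrival times (same units); pass u=t for an ACF
    lag_edges : increasing lag-bin edges (same units as t)
    duration  : total measurement duration (same units as t)

    Returns the raw pair counts per lag bin and a normalized correlation
    that should be close to 1 for uncorrelated photons.
    """
    t = np.asarray(t)
    u = np.asarray(u)
    lag_edges = np.asarray(lag_edges)

    # For each lag edge, count the pairs (i, j) with u[j] - t[i] < edge.
    cumulative = np.array(
        [np.searchsorted(u, t + edge, side="left").sum() for edge in lag_edges]
    )
    # Pairs with lag_edges[k] <= u[j] - t[i] < lag_edges[k + 1].
    counts = np.diff(cumulative)

    # Simple normalization by the pair count expected for uncorrelated
    # photons (assumption: edge effects are negligible).
    widths = np.diff(lag_edges)
    g = counts * duration / (len(t) * len(u) * widths)
    return counts, g


# Usage: autocorrelation of four evenly spaced photons.
ts = np.array([0, 10, 20, 30], dtype=np.int64)
counts, g = pair_count_correlation(ts, ts, [1, 11, 21, 31], duration=30)
```

Memory use stays proportional to the number of photons, because only one lag edge is processed at a time. Starting the first lag edge above zero keeps each photon from being paired with itself in an autocorrelation.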

Also, can you upload a typical dataset you use with PyCorrFit so I can better explore all the GUI features? (Hint: you can use figshare or Zenodo.)

paulmueller commented 8 years ago

Sorry, I only have a multiple-tau algorithm, which works with binned data. You might get lucky here - but I have absolutely no experience with PicoQuant data file formats.
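For orientation, here is a simplified sketch of how timestamps could be binned into an intensity trace and then correlated with a multiple-tau scheme. It is not the PyCorrFit implementation: the function names, the parameter `m` (assumed even), and the plain normalization by the squared mean are assumptions made for illustration.

```python
import numpy as np


def bin_timestamps(timestamps, bin_width):
    """Bin sorted integer timestamps into photon counts per time bin."""
    timestamps = np.asarray(timestamps)
    return np.bincount((timestamps - timestamps[0]) // bin_width)


def multiple_tau_autocorrelation(trace, bin_time, m=8):
    """Simplified multiple-tau autocorrelation of a binned trace.

    The first level evaluates lags 1..m at full resolution. Each further
    level averages adjacent bins (halving the resolution) and evaluates
    lags m/2+1..m in units of the coarser bin.
    """
    trace = np.asarray(trace, dtype=float)
    lags, corr = [], []
    step = 1        # number of original bins per current bin
    first_lag = 1
    while len(trace) > m:
        mean = trace.mean()
        for k in range(first_lag, m + 1):
            product = np.mean(trace[:-k] * trace[k:])
            lags.append(k * step * bin_time)
            corr.append(product / mean**2 - 1.0)
        # Coarsen the trace by averaging adjacent pairs of bins.
        n = len(trace) // 2 * 2
        trace = 0.5 * (trace[0:n:2] + trace[1:n:2])
        step *= 2
        first_lag = m // 2 + 1
    return np.array(lags), np.array(corr)


# Usage: a constant trace has no fluctuations, so G(tau) is zero.
lags, corr = multiple_tau_autocorrelation(np.ones(64), bin_time=1.0, m=4)
```

The lag times come out quasi-logarithmically spaced, which is why the scheme covers many decades of lag time with few points, at the price of first binning the data.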

There are several example data sets in the FCSData repository. You should be able to open them with pycorrfit.readfiles.openAny. Unfortunately the code is not yet streamlined for scripting, so things might not work smoothly. Let me know when you run into trouble.

tritemio commented 8 years ago

Ah ok, I'll look into implementations of the multi-tau algorithm based on timestamps.

Just for the record, timestamped data is not specific to PicoQuant hardware. Virtually all single-molecule measurements based on confocal (or confined) excitation with single-photon detectors record photon timestamps. There are numerous on-disk file formats (from PicoQuant, Becker & Hickl, and several custom acquisition boards used for smFRET measurements), but all of them contain a "record" for each photon. This record contains the coarse photon timestamp (~10 ns resolution), the detector number, and (for TCSPC measurements) the "nanotime". Because of this fragmentation, it is very hard to share data between different programs. That's why we are promoting Photon-HDF5 as an open, common file format to make sharing data easier. Sorry for the digression, but if PyCorrFit gains the ability to process timestamped data, it would be trivial to add support for Photon-HDF5 and therefore to analyze an entirely new class of experiments.
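To make the per-photon record concrete, here is a sketch of how such records might be held in memory as a numpy structured array. The field names, the integer widths, and the 10 ns clock period are hypothetical; they do not reproduce any vendor format or the Photon-HDF5 layout.

```python
import numpy as np

# Hypothetical per-photon record layout (illustrative only).
photon_record = np.dtype([
    ("timestamp", np.int64),   # coarse arrival time in clock ticks
    ("detector", np.uint8),    # detector (channel) number
    ("nanotime", np.uint16),   # TCSPC nanotime bin
])

photons = np.array(
    [(100, 0, 512), (250, 1, 87), (260, 0, 1033), (900, 1, 640)],
    dtype=photon_record,
)

# Convert coarse timestamps to seconds with an assumed 10 ns clock period.
clock_period_s = 10e-9
arrival_times_s = photons["timestamp"] * clock_period_s

# Number of photons recorded by each detector.
photons_per_detector = np.bincount(photons["detector"])
```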

paulmueller commented 8 years ago

I agree, supporting photon-hdf5 and directly computing the correlation in pycorrfit would be a great feature. I have been thinking about implementing a software-correlator plugin that accepts binned data (#20), but I simply did not have the time for that.
