tritemio / PyBroMo

A simulator for single molecule FRET experiments of freely diffusing particles. Moved to:
https://github.com/OpenSMFS/PyBroMo
GNU General Public License v2.0
17 stars 10 forks source link

Improvement: simulation generates photon timestamps directly #4

Open lampo808 opened 8 years ago

lampo808 commented 8 years ago

I want to use PyBroMo to simulate timestamps generated from a single-molecule experiment.

tl;dr: it would be great to add an option to PyBroMo.simulate_diffusion to save only the timestamps.

My plan is to run a series of simulations to understand which experimental parameters I need to tweak to be able to achieve a good SNR. To save disk space, instead of following the example of PyBroMo - B.1 Disk-single-core - Generate photon timestamps.ipynb I'm generating intensity traces and then calculating timestamps from sratch using the following code:

S = pbm.ParticlesSimulation(t_step=t_step, t_max=t_max,
                            particles=P, box=box, psf=psf)

S.simulate_diffusion(total_emission=True, save_pos=False, verbose=True,
                        rs=rs, chunksize=2**19, chunkslice='times')

trace = S.emission_tot[:]  # Intensity trace
S.store.close()

photons = np.random.poisson(trace)
photons += np.random.poisson(np.ones_like(photons)*bkg*t_step)  # Add uncorrelated background

timestamps = []
for (n, t) in zip(photons, np.cumsum(np.ones_like(photons)*t_step) - t_step):
    if n > 0:
        for i in range(n):
            # If more than one photon fall into the same simulation step,
            #  distribute them evenly
            timestamps.append(t + t_step/n * i)

ts = np.array(timestamps)

Working with the intensity trace is rather memory-consuming, so it would be great if the timestamps could be generated directly by the simulation function.

tritemio commented 8 years ago

@lampo808, if you look at the method ParticlesSimulation.simulate_timestamps_mix_da_online it does what you need for a mixture of multiple FRET populations. If you want to do only a single-color version you could simplify that method removing the duplication that is there to simulate both donor and acceptor channel, maybe call it ParticlesSimulation.simulate_timestamps_mix_online. If you send a PR to add this method I would be happy to merge it.

Let me know if you need further help.

lampo808 commented 8 years ago

Great, didn't notice that function. I will do like you suggest, thanks.

lampo808 commented 8 years ago

@tritemio, while testing the modified method ParticlesSimulation.simulate_timestamps_mix_online I found the following error, that is related to the original method:

In[8]: S.simulate_timestamps_mix_da_online(b, b, (slice(0, S.num_particles),), bkg, bkg, pbm.hash_(rs.get_state()), timeslice=t_max)

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-8-bc8096f32957> in <module>()
----> 1 S.simulate_timestamps_mix_da_online(b, b, (slice(0, S.num_particles),), bkg, bkg, pbm.hash_(rs.get_state()), timeslice=t_max)

C:/Users/Marco/Work/Work In Progress/PyBroMo\pybromo\diffusion.py in simulate_timestamps_mix_da_online(self, max_rates_d, max_rates_a, populations, bg_rate_d, bg_rate_a, rs, seed, chunksize, comp_filter, overwrite, skip_existing, scale, path, t_chunksize, timeslice)
   1023                 `timeslice` seconds. If None, simulate until `self.t_max`.
   1024         """
-> 1025         self.open_store_timestamp(chunksize=chunksize, path=path)
   1026         rs = self._get_group_randomstate(rs, seed, self.ts_group)
   1027         if t_chunksize is None:

C:/Users/Marco/Work/Work In Progress/PyBroMo\pybromo\diffusion.py in open_store_timestamp(self, path, chunksize, chunkslice, mode)
    490             return
    491         if path is None:
--> 492             path = self.store.filepath.parent
    493         self.ts_store = self._open_store(TimestampStore,
    494                                          prefix=ParticlesSimulation._PREFIX_TS,

AttributeError: 'ParticlesSimulation' object has no attribute 'store'

Any idea of what's going wrong? The zipped notebook to reproduce the issue is attached.

Thanks

photon timestamps from intensity trace.zip

tritemio commented 8 years ago

@lampo808, the notebook you attached does not have any error, maybe you sent the wrong one. Also please use the latest version from master of pybromo (in the notebook you sent a very old version is used).

tritemio commented 8 years ago

@lampo808, I was able to reproduce your error trying to reverse engineer what you are doing. I took your notebook and inserted the command you posted above but it did work because the syntax is completely wrong (you need to pass lists of brightnesses and the argument rs takes a RandomState object not a string). The "right" command is:

S.simulate_timestamps_mix_da_online([b,], [b,], 
                                    [slice(0, S.num_particles),], bkg, bkg)

Where b is the detected brightness and bkg is the background rate. This gives the error that you report. Now I can start debugging, but please in the future report a reproducible example.

tritemio commented 8 years ago

@lampo808, the proper way to call that method is:

S.simulate_timestamps_mix_da_online([b,], [b,], [slice(0, S.num_particles),], bkg, bkg, 
                                    path='./', skip_existing=True)

Explanation: when there is no trajectory file open, the method simulate_timestamps_mix_da_online requires a path for the output data. Also, since you are simulating two identical timestamp arrays (for donor and acceptor) you need to pass skip_existing=True. This flag is a safeguard against using the same identical photon timestamps for two photon streams. If you set it to True means that you are aware of it and you know what you are doing (it's fine in this case).

Yes, the error message does not help in this case and the method should also work without a path (just using the current folder is not specified). I'll fix this.

tritemio commented 8 years ago

I thought that with all these arguments is better to use keyword-argument to help when reading and avoiding errors. For the reference, please use:

S.simulate_timestamps_mix_da_online(max_rates_d=[b,], max_rates_a=[b,], 
                                    populations=[slice(0, S.num_particles),], 
                                    bg_rate_d=bkg, bg_rate_a=bkg, skip_existing=True)

Another possible improvement can be that when populations is not specified there is only one population for all the particles (avoiding the ugly [slice(0, S.num_particles),] definition above).

lampo808 commented 8 years ago

@tritemio sorry for the notebook, the modifications I did were not saved (including using the last version of PyBroMo). I will double check next time.

Thanks for the solution and explanation, I can take care of the automatic path selection were none is passed and the improvement you proposed for the populations, if that's fine for you.

tritemio commented 8 years ago

@lampo808 sounds good! A few tips:

For the path you can modify open_store_timestamps. Right now if path is None the code tries to use the same folder as the trajectory file (and fails if there is no trajectory file). A better behaviour would be that if path is None it first tries to use the path of the trajectory file and, if there is no trajectory file, then use '.'.

For the populations argument, you can modify the low level methods _get_ts_name_mix_core and _sim_timestamps_populations so that all the other high-level methods will support populations=None (including your new method for single-color timestamps simulation).