RNA-FRETools / MASH-FRET

MATLAB package to analyze single-molecule FRET data
https://rna-fretools.github.io/MASH-FRET/
GNU General Public License v3.0

Simulation kinetic model not matching video project kinetic model; video project traces finding more states than was intended with not enough data to generate ml-dph #123

Closed snguyen49 closed 8 months ago

snguyen49 commented 9 months ago

After applying the background corrections, the video project generated a kinetic model containing more states than was expected as per the original simulation kinetic model. I simulated a 2-state system originally but the video project will sometimes indicate more than two states. I am guessing that these states have lower probabilities of occurring, but because of this the ml-dph is sometimes not able to be generated as there are not enough data points to fit the dwell time histograms to be used for the ml-dph and monte carlo simulation. When it is able to generate a kinetic model, the kinetic model is not the same as the original kinetic model from the simulation. Here are my images for the 2-state simulation I ran: image image

Here are the images from the video project using the .sira video file from the above simulation: image image

I am wondering if I am doing anything wrong or is the program supposed to be this way?

Many Thanks, Sydney Nguyen

mca-sh commented 8 months ago

Hi Sydney and thanks again for your report.

I think the problem is not coming from the video processing but from the trace processing. I have two ideas about how this came about:

time-averaging of states

Looking at your simulated model, I can see that the states 0 and 1FRET are very short-lived (lifetime of 11 data points). For exponentially distributed dwell times, this means that about 63% of the dwell times are shorter than 11 data points and that a non-negligible fraction is even shorter than 1 data point. In this case, some transitions between 0 and 1FRET happen faster than the sampling time of the video and are averaged into an "intermediate" FRET state that has a value of approximately 0.5FRET. The simulated FRET state trajectories do not show this state as they give you the simulation ground truth, but the "experimental" state trajectories will display this intermediate state since the find-state algorithm is not aware that this is not a real state. This is a limitation of the data and can be solved by increasing the time resolution in your simulation (higher frame rate, i.e. shorter frame duration).

blurr state

The state-finding algorithm vbFRET is known to detect what we call a "blurr state" when a transition occurs between two FRET states. This is also due to time-averaging of these states, but only over one data point. It yields an artefactual FRET state with a lifetime close to 1 data point. This is a limitation of the state-finding algorithm and can be corrected by activating the post-processing method "deblurr" in panel "Find states" (https://rna-fretools.github.io/MASH-FRET/trace-processing/components/panel-find-states.html#remove-blurr-states).
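To make the idea of a one-data-point "blurr state" concrete, here is a minimal Python sketch that removes isolated single-frame states sitting in the middle of a transition. It is only an illustration under stated assumptions: the function name `remove_single_frame_blur` and the rule "assign the blurred frame to the following state" are hypothetical and are not taken from MASH-FRET's "deblurr" implementation.

```python
def remove_single_frame_blur(states):
    """Reassign isolated one-frame states that sit between two different states.

    `states` is a list of discretized FRET values, one per frame.
    A frame counts as blurred when its value differs from both neighbours
    and the neighbours differ from each other (it lies inside a transition).
    Hypothetical rule: the blurred frame is assigned to the following state.
    """
    cleaned = list(states)
    for i in range(1, len(states) - 1):
        prev_s, cur, next_s = states[i - 1], states[i], states[i + 1]
        if cur != prev_s and cur != next_s and prev_s != next_s:
            cleaned[i] = next_s
    return cleaned


# A 0 -> 1 transition with one blurred frame at 0.5, then a 1 -> 0
# transition with one blurred frame at 0.4.
trace = [0.0, 0.0, 0.0, 0.5, 1.0, 1.0, 0.4, 0.0, 0.0]
print(remove_single_frame_blur(trace))
```

A one-frame excursion that returns to the same state (for example 0, 1, 0) is left untouched, because it could be a genuine short dwell rather than an averaging artefact.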

In both cases, these are limitations of the data and of the state-finding algorithm; you are not doing anything wrong. Please tell me if this solved your problem.

Best, Mélodie

snguyen49 commented 8 months ago

Thank you Melodie. We had an emergency in the lab so I was not able to get to my computer until today. I will get back to you on the progress as soon as possible.

Sydney Nguyen



snguyen49 commented 8 months ago

After following your instructions, it seems that it is still occurring. Should I send my project to you to see it?

Many Thanks, Sydney Nguyen



snguyen49 commented 8 months ago

Here is what I have inputted for the parameters. image When I cluster the TDP, it still gives me three states, but the Trace processing page tells me there were only 2 states detected. The FRET histograms also indicate only two states. The third intermediate state is also found in the TDP to have a FRET efficiency that is not like the other two states which means it is not a degenerate state. image

At first, I used Thresholding for the method of finding states but I switched to vbFRET and the trace processing found 3 states instead of 2 on this occasion. image

Something even more perplexing is that the FRET distributions do not indicate a third intermediate state, but when I run it using the vbFRET method it tells me there are three states. When applying different methods, the program would find even more states. That leaves me to wonder which method is the most efficient to use and when is a method applicable to a system.

I also switched spot-finding methods on the video processing page, but both Houghpeaks and Twotone generated an extra state in the TDP. For Twotone, when I input an intensity parameter, the program cannot find any spots at all. Is there a reason for this? I would also like to know how to use Twotone, since my project works with TIRF microscopy and your website says that this setting is specifically for this microscope set-up.

mca-sh commented 8 months ago

Hi Sydney, I hope nothing too serious happened in your lab. Sorry to hear that it did not fix the problem.

The thresholding method can be used only when FRET states are well separated and state dynamics are not too fast. It is best to use vbFRET to obtain the most reliable rate constants.

The spot-finding methods (Twotone, Houghpeaks, etc.) are actually all suitable for TIRF videos and do not influence the TDP, so you are free to choose whichever you like. You can stick with Houghpeaks if it works for you. If you want to use Twotone, you must use an intensity threshold lower than the value determined from the average image, since the threshold is applied to a Gaussian-filtered image (not available for display) in which pixel intensities are lowered. It is best to play around with different values.

For the extra-state problem I have an idea: when you are changing the parameters in "Find states" or any other panel in Trace processing, and you want these settings to be applied to all molecules in the list, you need to press the corresponding "All" button. If you did not do this, the Thresholding method will still be applied to all the molecules other than the current one. This could explain this transition density at 0.5FRET in the TDP.

If this does not solve the problem, I will need your project and video to have a look, because I am running out of ideas.

Best, Mélodie

snguyen49 commented 8 months ago

Hi Melodie,

Thank you for the explanation. Unfortunately, even after making sure the settings were applied to all of the molecules, the TDP is still finding an intermediate state. Here is the project file I currently have, which includes the simulation video used: videoprojecttest.zip

Many Thanks, Sydney Nguyen

mca-sh commented 8 months ago

Hey Sydney, thank you for the file.

So my first guess seems to be correct, the system you've simulated has state dynamics that are too fast for the time resolution. Have a look at this close up on the FRET trajectory:

image

We can see the presence of an artefactual FRET "state" that results from the time-averaging of fast transitions between 0FRET and 1FRET. The state-finding algorithm is not able to detect whether the state is artefactual or not. To obtain data that are well resolved and prevent such artefacts, you need to increase the video frame rate used in the Simulation. For more information, I will just quote myself:

time-averaging of states

Looking at your simulated model, I can see that the states 0 and 1FRET are very short-lived (lifetime of 11 data points). For exponentially distributed dwell times, this means that about 63% of the dwell times are shorter than 11 data points and that a non-negligible fraction is even shorter than 1 data point. In this case, some transitions between 0 and 1FRET happen faster than the sampling time of the video and are averaged into an "intermediate" FRET state that has a value of approximately 0.5FRET. The simulated FRET state trajectories do not show this state as they give you the simulation ground truth, but the "experimental" state trajectories will display this intermediate state since the find-state algorithm is not aware that this is not a real state. This is a limitation of the data and can be solved by increasing the time resolution in your simulation (higher frame rate, i.e. shorter frame duration).
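The time-averaging effect described in this quote can be reproduced with a small, self-contained Python sketch. It assumes a two-state system switching between 0 and 1 FRET, with exponentially distributed dwell times and the same rate (in frame⁻¹) in both directions, and no noise. The function names and the 0.1 to 0.9 window used to call a frame "intermediate" are illustrative choices, not part of MASH-FRET. For such a model, a fraction 1 − exp(−k) of dwells is shorter than a single frame for a rate k per frame.

```python
import random


def frame_averaged_fret(rate_per_frame, n_frames, seed=0):
    """Simulate a two-state (FRET 0 <-> 1) system and average it over frames.

    Dwell times are drawn from an exponential distribution with the same
    rate (in frames^-1) for both states. Returns one time-averaged FRET
    value per frame, i.e. the fraction of the frame spent in state 1.
    """
    rng = random.Random(seed)
    averaged = [0.0] * n_frames
    t, state = 0.0, 0
    while t < n_frames:
        dwell = rng.expovariate(rate_per_frame)
        end = min(t + dwell, n_frames)
        if state == 1:
            # Add the time spent in state 1 to every frame this dwell overlaps.
            frame = int(t)
            while frame < end:
                averaged[frame] += min(end, frame + 1) - max(t, frame)
                frame += 1
        t, state = end, 1 - state
    return averaged


def intermediate_fraction(averaged, low=0.1, high=0.9):
    """Fraction of frames whose averaged FRET lies strictly between low and high."""
    hits = sum(1 for value in averaged if low < value < high)
    return hits / len(averaged)


# Slower kinetics (in frames^-1) leave fewer frames with an averaged value.
for rate in (0.1, 0.05, 0.01):
    trace = frame_averaged_fret(rate, n_frames=20000)
    print(rate, round(intermediate_fraction(trace), 3))
```

Lowering the rate per frame, which is what a higher frame rate achieves for fixed rates in second⁻¹, reduces the share of frames that carry an averaged, intermediate FRET value.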

Tell me if it worked or not.

Best, Mélodie

snguyen49 commented 8 months ago

I currently have the frame rate as 10. Should I go even less than that? Also, I apologize I had mistakenly read your previous message as frame length not frame rate.

mca-sh commented 8 months ago

Aah ok I see.

It depends on your transition rates. Apparently, 10 frames/second does not give enough time resolution for these transition rate constants.

Can you please tell me which transition rate constants you used (in second⁻¹, using the menu Units>Time>in seconds)?

snguyen49 commented 8 months ago

I do not know where to find the rate constants in second⁻¹, but I do have them in frame⁻¹. I have a simple two-state system with 0.1 for all the transition rate constants.

mca-sh commented 8 months ago

You can access transition rates in second⁻¹ by changing the time units in MASH's menu bar: Units>Time>in second.

It seems that you are reaching the limit, at least for vbFRET. If you have transition rates in frame⁻¹, then you must decrease these values instead of increasing the frame rate. Try a frame rate 2 times larger (if time units are in seconds) or transition rate constants of 0.05.
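The relation between the two unit systems is simple arithmetic, sketched here in Python. The helper names are illustrative and not part of MASH-FRET; the numbers are the ones mentioned in this thread (0.1 frame⁻¹ at 10 frames/second).

```python
def per_frame_to_per_second(rate_per_frame, frame_rate_hz):
    """Convert a rate constant from frame^-1 to second^-1."""
    return rate_per_frame * frame_rate_hz


def per_second_to_per_frame(rate_per_second, frame_rate_hz):
    """Convert a rate constant from second^-1 to frame^-1."""
    return rate_per_second / frame_rate_hz


def lifetime_in_frames(rate_per_frame):
    """Mean dwell time, in frames, of a state left with the given rate."""
    return 1.0 / rate_per_frame


# 0.1 frame^-1 recorded at 10 frames/second corresponds to 1 second^-1.
k_s = per_frame_to_per_second(0.1, 10)

# Doubling the frame rate at a fixed rate in second^-1 halves the rate
# per frame, which doubles the lifetime expressed in frames.
k_frame_doubled = per_second_to_per_frame(k_s, 20)
print(k_s, k_frame_doubled, lifetime_in_frames(k_frame_doubled))
```

This is why the two suggestions are equivalent: doubling the frame rate with rates defined in second⁻¹ has the same effect on the data as entering 0.05 when rates are defined in frame⁻¹.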

Otherwise, you can use your current data and limit vbFRET to find a maximum of 2 states. Limiting vbFRET to 2 states of course requires prior knowledge about the system and should not be done with experimental data.

snguyen49 commented 8 months ago

I apologize but I cannot find the icons that you have mentioned. image This is what it looks like for me. I will try out the frame settings you have mentioned.

mca-sh commented 8 months ago

The frame settings are fine too. You can find the menu Units on the very top of MASH's window.

I tried out several things on my side, and it turned out that even with slower rate constants, vbFRET infers more than 2 states. I think this is due to the extreme FRET values 0 and 1, where the noise distribution becomes asymmetric. To prevent such a malfunction, you can use the state-finding method STaSI+vbFRET, where the number of states in the trajectories is determined by STaSI instead of vbFRET, but the dwell times are determined by vbFRET.

I did not expect such simple system to mess up the routine ^^

EDIT: indeed, after some additional tests, it seems that using 0 and 1FRET as state values induces a bias in most state-finding methods; I am not sure why. But using 0.2 and 0.7FRET solved the issue when using STaSI+vbFRET. If you still want to use 0 and 1FRET in your simulation, then I can only recommend using Thresholds as the state-finding algorithm.

I hope this helps..

EDIT2: Thresholds needs to be parameterized correctly. In your case, you need: image and image

snguyen49 commented 8 months ago

Okay let me try that out. Do you still think it is preferable to have double the original frame rate?

mca-sh commented 8 months ago

If you use Threshold as a state-finding algorithm, it won't find an additional artefactual state since it is looking for 2 states, 0 and 1FRET (regardless of the frame rate and transition rate constants). But at this stage, you can also restrict vbFRET to 2 states max.

snguyen49 commented 8 months ago

When I kept the states at 0 and 1 and changed the method to threshold, this is the TDP that I get. This is also after I had doubled the frame rate. image

After making the FRET values 0.2 and 0.7, the TDP found two states, but I was wondering whether this discrepancy would become a problem for real-world experiments.

mca-sh commented 8 months ago

Hi Sydney,

To be perfectly honest with you, no state-finding or model selection method is flawless. If you have experimental or theoretical arguments that your system is a two-state system, then it is better to restrict all these methods to 2 states.

In this case, the state-finding method Threshold is configured to generate state sequences between 0 and 1FRET states only, but the option "adjust to data" recalculates these FRET state values according to the underlying trajectory. This is why you end up with additional 0.3, 0.6 and 0.8 FRET states that appear on the TDP and that the Gaussian mixture model (GMM)-based clustering cannot ignore. You have the right to (and must!) question the viability of the optimum model found by GMM clustering. If you disagree with the model complexity, you can visualize the clustering results of another model by selecting it in the list V in sub-panel Results of Transition analysis' panel State configuration, and pressing "Use this config." to use this clustering for the rest of the analysis.
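A small Python sketch can show how a two-state thresholding step followed by a data-driven recalculation ends up with more than two state values. It assumes that the recalculation replaces each dwell's value by the mean of the data points in that dwell; the thread does not spell out how "adjust to data" works internally, so both function names and this rule are hypothetical.

```python
def threshold_states(trace, threshold=0.5, low_value=0.0, high_value=1.0):
    """Assign each data point to one of two fixed states using one threshold."""
    return [high_value if x > threshold else low_value for x in trace]


def adjust_to_data(trace, states):
    """Replace each dwell's state value by the mean of the data in that dwell.

    Assumption for illustration only: the recalculation is a per-dwell mean.
    """
    adjusted = []
    start = 0
    for i in range(1, len(states) + 1):
        # A dwell ends at the end of the trace or when the state changes.
        if i == len(states) or states[i] != states[start]:
            segment = trace[start:i]
            mean = sum(segment) / len(segment)
            adjusted.extend([mean] * len(segment))
            start = i
    return adjusted


# Time-averaged frames (0.3, 0.35, 0.6, 0.65) pull the per-dwell means
# away from the nominal 0 and 1 FRET values.
trace = [0.05, 0.0, 0.3, 0.35, 0.9, 1.0, 0.6, 0.65, 0.1]
states = threshold_states(trace)
adjusted = adjust_to_data(trace, states)
print(sorted(set(states)))
print(sorted(set(adjusted)))
```

In this toy trace the thresholded sequence contains only two values, whereas the adjusted sequence contains three, which is the kind of spread that then shows up as extra clusters in a TDP.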

I know I recommended using Thresholds, but I regret it, since it is a very heuristic method that is in general less accurate than the others. I recommended it in the first place because the top-ranked method vbFRET does not handle fast state dynamics correctly: it finds one or several artefactual blurr states, which overestimates the number of states in the trajectories. BUT, when restricted to 2 states, it should do the job much better than Thresholds, which is also restricted to 2 states anyway.

I hope all this is not too confusing. Best, Mélodie

snguyen49 commented 8 months ago

Hi Melodie,

I am good with this outcome. I just wanted to make sure this outcome was not representative of a problem in the program and that it was something to be expected. This just means I will have to establish a different approach when analyzing the data. Thank you for taking the time to help me.

Sydney

mca-sh commented 8 months ago

Ok great, don't hesitate to post a new issue if you have other concerns. I will close this one for now.

Best, Mélodie

snguyen49 commented 8 months ago

[heart] Sydney Nguyen reacted to your message:

