kleinbub / rMEA

an R package to perform synchronization analysis on motion energy time-series
GNU General Public License v3.0

Use fractional times for MEAccf input parameters #10

Closed audiophil-dev closed 11 months ago

audiophil-dev commented 2 years ago

Why are the time parameters of the MEAccf function limited to whole seconds? Is it somehow possible to use values lower than 1 second, for example:

mea_all_ccf <- MEAccf(mea_all_rescaled, lagSec = 1, winSec = 1.5, incSec = 0.5)

Would it be enough to get rid of the timeMaster calls in this file?

If not, could you give some hints on which parts of the package need to be changed to render fractional times possible?

kleinbub commented 2 years ago

The ccf code per se is robust and allows fractional times. The issue is that the rest of the package is built with seconds as the minimal unit, as I didn't think it made much sense to use fractions of a second with MEA data. So switching the whole package to milliseconds would require a thorough check of the whole codebase to find where time in seconds is explicitly or implicitly expected.
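To illustrate why fractional seconds need care, here is a minimal base-R sketch of how a duration in seconds maps to a whole number of samples. The helper `sec_to_samples` and the sampling rate of 25 are assumptions made for illustration; they are not taken from the rMEA source.

```r
# Hypothetical helper (not part of rMEA): convert a duration in seconds
# to a whole number of samples at a given sampling rate.
sec_to_samples <- function(sec, samp_rate) {
  n <- round(sec * samp_rate)
  if (n < 1) stop("duration is shorter than one sample")
  as.integer(n)
}

samp_rate <- 25  # assumed sampling rate in Hz; use the real rate of your data

lag <- sec_to_samples(1,   samp_rate)  # maximum lag in samples
win <- sec_to_samples(1.5, samp_rate)  # window length in samples
inc <- sec_to_samples(0.5, samp_rate)  # step between window starts in samples

# Start indices of the sliding windows over a 60-second series.
n_samples  <- 60 * samp_rate
win_starts <- seq(1, n_samples - win + 1, by = inc)
```

At 25 Hz, 0.5 s and 1.5 s do not correspond to a whole number of samples, so they are rounded. This is one place where a fractional duration can silently differ from what was requested.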

Nevertheless, I have created a new branch "fractional" that doesn't give errors when using fractional times. But note: the start and end values of windows in the ...$ccfRes$winTimes object are kept in a numeric format and not in mm:ss format. Also, there might be untested issues further down the pipeline when using the ccf data for other procedures (e.g. plotting or bootstrapping). I'd be glad if you were willing to test and/or contribute to the branch.
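If mm:ss labels are needed for numeric window times, they could be produced by a small formatter such as the sketch below. `sec_to_mmss` is a hypothetical helper written for this example and is not part of rMEA; it assumes the times are plain numeric seconds.

```r
# Hypothetical helper (not part of rMEA): format numeric seconds as mm:ss,
# keeping `digits` decimal places for the fractional part.
sec_to_mmss <- function(sec, digits = 1) {
  sec  <- round(sec, digits)          # round first so 59.96 becomes 01:00.0
  mins <- floor(sec / 60)
  rest <- sec - mins * 60
  width <- if (digits > 0) digits + 3 else 2
  fmt <- paste0("%02d:%0", width, ".", digits, "f")
  sprintf(fmt, as.integer(mins), rest)
}

sec_to_mmss(c(0.5, 90.5))      # "00:00.5" "01:30.5"
sec_to_mmss(75, digits = 0)    # "01:15"
```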

audiophil-dev commented 2 years ago

Thanks for your quick answer and the new branch!

I use "Quantity of Motion" data, calculated from Motion Capture data. My first calculations showed that I can achieve the highest significance (comparing the grand averages of different conditions) with a window size of 1 second. So I wanted to try smaller window sizes than 1 second.

Your new branch works for fractional times, and I have not faced any problems so far. I don't use winTimes, so it is not a problem for me that the values are numeric. I did not have time to check the whole codebase, but everything I needed for my calculations has worked so far (including plotting).

Since the MEAheatplot becomes too dense with small window sizes, I added the input parameters "from" and "to" to MEAheatplot. I made a fork of your package in case you want to include these changes.
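The idea of restricting a dense heat plot to a time range can be sketched in base R as follows. This is not the code from the fork: the helper `subset_windows`, the matrix layout (one row per window, one column per lag), and the interpretation of "from" and "to" as window start times in seconds are all assumptions, and the data are synthetic.

```r
# Hypothetical helper: keep only the windows whose start time lies in [from, to].
subset_windows <- function(ccf_mat, win_start, from = -Inf, to = Inf) {
  stopifnot(nrow(ccf_mat) == length(win_start))
  keep <- win_start >= from & win_start <= to
  ccf_mat[keep, , drop = FALSE]
}

set.seed(1)
win_start <- seq(0, by = 0.5, length.out = 20)  # window starts: 0, 0.5, ..., 9.5 s
ccf_mat <- matrix(
  runif(20 * 5), nrow = 20,
  dimnames = list(as.character(win_start), paste0("lag", -2:2))
)

# Zoom in on the windows starting between 2 and 4 seconds.
zoom <- subset_windows(ccf_mat, win_start, from = 2, to = 4)
```

The reduced matrix can then be passed to any plotting routine, which keeps the number of rows drawn manageable when the increment is small.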

Right now, I am a bit short on time for implementing fractional times across the whole codebase, but I am happy to help if there is anything to test. If I run into problems while using the branch, I will try to fix them.

kleinbub commented 2 years ago

Thank you for your testing! I would be wary of selecting the window in a data-driven fashion rather than based on the type of phenomenon you are measuring. Indeed, extremely short windows contain very scarce information about human behavior. Correlations based on these almost instantaneous assessments are most probably spurious. If you plot the data from each window you will see what I mean, probably flat lines without any feature. If you still want to go down this road, at least try some cross-validation approaches, e.g. by rerunning the analyses with a random subsample from each of your groups, and see whether the result with shorter windows is stable.
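A subsample-based stability check of this kind could look like the following base-R sketch. The per-dyad synchrony scores here are synthetic, and the helper `subsample_diff`, the group sizes, and the 70% subsample fraction are assumptions chosen for illustration; in practice the scores would come from the ccf results at a given window size.

```r
set.seed(42)

# Synthetic per-dyad synchrony scores for two conditions (illustrative only).
group_a <- rnorm(20, mean = 0.30, sd = 0.10)
group_b <- rnorm(20, mean = 0.25, sd = 0.10)

# Difference in group means on a random subsample (without replacement)
# drawn separately from each group.
subsample_diff <- function(a, b, frac = 0.7) {
  sa <- sample(a, size = ceiling(frac * length(a)))
  sb <- sample(b, size = ceiling(frac * length(b)))
  mean(sa) - mean(sb)
}

diffs <- replicate(1000, subsample_diff(group_a, group_b))

# How often does a subsample agree in sign with the full-sample difference?
full_diff      <- mean(group_a) - mean(group_b)
sign_agreement <- mean(sign(diffs) == sign(full_diff))

# Spread of the subsampled differences.
quantile(diffs, c(0.025, 0.5, 0.975))
```

Repeating this for each candidate window size and comparing the spread of `diffs` gives a rough picture of whether an effect seen at short windows survives resampling.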

In regard to the fork, when you feel it's stable you can create a pull request, and I'll integrate everything in the main version.

audiophil-dev commented 2 years ago

I see your point, thanks for the hint! It's hard for me to choose a proper window size based on theory. On the one hand, I am not a behavioural scientist; on the other hand, the data is based on unstructured interaction.

> If you plot the data from each window you will see what I mean, probably flat lines without any feature.

I don't really get what you mean. Which kind of plot are you referring to?

Here is a MEAlagplot with the three conditions and the random sample (window time = 0.5s):

[Image: Rplot]

To me, it looks like there are differences between the conditions.

> If you still want to go down this road, at least try some cross-validation approaches, e.g. by rerunning the analyses with a random subsample from each of your groups, and see whether the result with shorter windows is stable.

Indeed, the results do not seem to be stable. But neither are they for larger window sizes. Probably the sample size is too small...

> In regard to the fork, when you feel it's stable you can create a pull request, and I'll integrate everything in the main version.

Perfect!