huppertt / nirs-toolbox

Toolbox for fNIRS analysis

Short-separation regression handling NaNs. #20

Open AlinaSchulte opened 1 year ago

AlinaSchulte commented 1 year ago

Is your feature request related to a problem? Please describe. I would like to run a GLM including short-channel regression. I run into problems when I set short channels with insufficient data quality to NaN and add short-channel regressors as nuisance regressors to the model.

Describe the solution you'd like It would be helpful if the GLM could deal with NaNs in short channels, so that it calculates the PCA using only the good-quality short channels that actually capture the physiological noise.

superhenrikke commented 1 year ago

I am in a similar situation. We'd like to use the short channel distance filter, but we don't want to input bad short channels into the filter. However, I have been able to use the short distance filter, if I set the bad channels (both long and short channels) to zero. I am not sure whether this is how the short-distance filter is intended to work or if I'm circumventing the issue.

To elaborate: We have several bad short channels that we'd like to discard. For this pruning, we're currently using QT-NIRS and setting the bad channels (both long/regular and short) to 0. Afterward, I'm using the short channel filter: advanced.nirs.modules.ShortDistanceFilter(). When I tried setting the bad channels to NaN before running advanced.nirs.modules.ShortDistanceFilter(), it resulted in an error.

Hope this helps!
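A small illustration of why NaN and zero behave differently may help here. This is a generic sketch in plain Python with made-up sample values, not the toolbox's code, and whether this is what triggers the error in ShortDistanceFilter is an assumption: NaN propagates through sums and means, whereas a zeroed channel is just a constant column with zero variance.

```python
import math
import statistics

nan = float("nan")

# Hypothetical samples from one short channel (values are invented).
good_channel = [0.2, -0.1, 0.4, 0.0]
nan_channel = [0.2, nan, 0.4, 0.0]      # one bad sample marked as NaN
zeroed_channel = [0.0, 0.0, 0.0, 0.0]   # bad channel set to zero

# A single NaN makes the mean (and anything built on it) NaN.
mean_with_nan = sum(nan_channel) / len(nan_channel)

# A zeroed channel stays numerically valid but carries no variance,
# so it cannot contribute any signal to a PCA or a regression.
var_zeroed = statistics.pvariance(zeroed_channel)
var_good = statistics.pvariance(good_channel)

print(math.isnan(mean_with_nan), var_zeroed, var_good > 0)
```

So zeroing avoids numerical failures, but the zeroed channel still counts as a channel in whatever step follows.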

AlinaSchulte commented 1 year ago

Oh, it's interesting that you are using the ShortDistanceFilter. So far I have been using the option of adding a short-channel regressor to the GLM: jobs = nirs.modules.GLM(); jobs.AddShortSepRegressors = true;

I haven't gone deep enough into the functions to understand what the difference is. Happy to hear, if anyone knows that :)

superhenrikke commented 1 year ago

I am using the ShortDistanceFilter as a part of preprocessing before conducting hyperscanning analysis, i.e., we're not going to run a GLM. If I were running a GLM then I would do the same as you have written above. :)

(I am not the one to explain the difference between the two, but I believe this paper does: 10.1117/1.NPh.7.3.035009 )

Have you tried setting the bad short channels to zero before running the GLM?

Frkle commented 1 year ago

Hey Alina,

I never have problems with setting channels with poor data quality to NaN. Does it work for you if you do not set them to NaN, i.e., if you simply add all short-distance channels to the model?

Best F

AlinaSchulte commented 1 year ago

Thanks for your comments and reminding me of that paper!

Not setting them to NaN, or setting them to zero, does work, but it changed the SS PCA regressors slightly, if I remember correctly. And if I understand correctly, all principal components are used as regressors of no interest to capture noise in the model? Setting "AddShortSepRegressors" to true creates 8 SS PCA conditions in my case. Is that the same for you, and is it independent of the number of short channels used?

Frkle commented 1 year ago

Hey,

my question refers to a slightly different thing, I guess. Why run a PCA and then add the PCs, rather than simply adding all SDCs (both HbO and HbR) to the GLM as regressors, as is recommended (e.g., https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7511246/ and https://www.spiedigitallibrary.org/journals/neurophotonics/volume-10/issue-01/013503/Performance-comparison-of-systemic-activity-correction-in-functional-near-infrared/10.1117/1.NPh.10.1.013503.full?SSO=1)? If you take only the first n PCs of all available SDC information, you would actually neglect some of the information about the extracerebral systemic activity, and if you do not have HD-DOT data with hundreds of SDCs, then I do not see an advantage of the PCA approach. Moreover, the GLM can handle noisy SDC data if you add them as regressors; therefore, it is not much of a problem to leave them in the data set, that is, to not replace them with NaNs (at least if you add the SDCs directly to the model) (see https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7511246/).
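To make the "add all SDCs directly" idea concrete, here is a minimal sketch in plain Python. It uses ordinary least squares on invented data; the toolbox's GLM may use a different estimator, and the names `task`, `sdc`, and `long_channel` are purely illustrative. The design matrix simply gets one extra column per short channel next to the task regressor and a constant.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(X[0])
    XtX = [[sum(r[a] * r[b] for r in X) for b in range(p)] for a in range(p)]
    Xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(p)]
    return solve_linear(XtX, Xty)


# Invented example: boxcar task regressor and one short-channel time course.
task = [0, 0, 1, 1, 0, 0, 1, 1]
sdc = [0.1, 0.4, -0.2, 0.3, -0.5, 0.2, 0.0, -0.3]

# Simulated long channel: task effect 2.0, systemic contribution 0.5, offset 1.0.
long_channel = [2.0 * t + 0.5 * s + 1.0 for t, s in zip(task, sdc)]

# Design matrix columns: task, short channel (nuisance), constant.
X = [[float(t), s, 1.0] for t, s in zip(task, sdc)]
beta = ols(X, long_channel)
print(beta)
```

With several short channels (and both HbO and HbR), each one would become its own column in `X`; the task estimate `beta[0]` is then adjusted for all of them at once.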

However, if you want to run a PCA on the SDCs first, you should not replace the data with NaN; PCA does not work with NaN values. You could remove the whole channel instead.
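Removing the whole channel before the PCA could look roughly like the sketch below. It is plain Python on an invented time-by-channel matrix, not the toolbox's implementation, and the helper names `drop_nan_channels` and `first_principal_component` are hypothetical. Channels containing any NaN are dropped, and the leading component is then estimated by power iteration on the covariance of the remaining channels.

```python
import math


def drop_nan_channels(data):
    """Drop every column (channel) that contains at least one NaN.

    data: list of rows (time points), each a list of channel values.
    Returns the cleaned matrix and the indices of the channels kept.
    """
    n_channels = len(data[0])
    kept = [j for j in range(n_channels)
            if not any(math.isnan(row[j]) for row in data)]
    clean = [[row[j] for j in kept] for row in data]
    return clean, kept


def first_principal_component(data, n_iter=200):
    """Leading PCA weights and scores via power iteration (a simple sketch)."""
    n, p = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(p)]
    centered = [[row[j] - means[j] for j in range(p)] for row in data]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    w = [1.0] * p  # starting vector; assumed not orthogonal to the solution
    for _ in range(n_iter):
        w = [sum(cov[a][b] * w[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    scores = [sum(r[j] * w[j] for j in range(p)) for r in centered]
    return w, scores


nan = float("nan")
# Invented short-channel data: 4 time points x 3 channels, channel 1 is bad.
short_channels = [
    [1.0, nan, 2.0],
    [2.0, nan, 4.1],
    [3.0, 0.5, 5.9],
    [4.0, 0.7, 8.0],
]

clean, kept = drop_nan_channels(short_channels)
weights, scores = first_principal_component(clean)
print(kept, weights, scores)
```

The resulting `scores` time course contains no NaNs and could then serve as a nuisance regressor, built only from the channels listed in `kept`.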