rDCM: tapas_rdcm_generate.m returns 0's for endogenous DCMs

zenkavi commented 3 years ago

Hi Stefan,

I wanted to look at the predicted signal in the output of tapas_rdcm_estimate.m for resting-state data created using AR(1) processes and then inverted removing DCM.U and specifying type = 'r' as you previously suggested. I noticed output.signal.y_pred_rdcm only has zeros. I think this is because when tapas_rdcm_estimate.m specifies an empty driving input for endogenous DCM's tapas_rdcm_generate.m called by tapas_rdcm_compute_signals.m with SNR = Inf returns zeros for DCM_rDCM.Y.y(:).

Is this a bug or am I missing something? Do you have any recommendations or tricks to avoid this? I can just IFFT the output.signal.yd_pred_rdcm_fft but I wanted to see if there might be a better way.

Best,

Zeynep

StefanFraessle commented 3 years ago

Hi Zeynep,

This is correct. Since there is no driving input DCM.U in rDCM for resting-state data, there is also no way of generating a predicted signal in the time domain - which is what would be stored in output.signal.y_pred_rdcm.

The way to check whether your predicted signal resembles the true signal is by comparing output.signal.yd_pred_rdcm_fft with output.signal.yd_source_fft. These are the Fourier transforms of the temporal derivative of the data (i.e., what you are actually fitting in rDCM, see Frässle et al., 2017). Since these are complex valued, you could for instance compare the magnitudes or compute the power spectral densities. The latter is what is implemented in tapas_rdcm_visualize.m.

In general, it is important to keep in mind that rDCM is a model that operates in the frequency domain. Hence, even for task data, the correct quantities to compare are output.signal.yd_pred_rdcm_fft and output.signal.yd_source_fft - because these are the quantities that rDCM actually "uses" for model inversion.

The predicted signal in the time domain output.signal.y_pred_rdcm is generated by plugging the posterior estimates of rDCM into a classical DCM forward model. In most cases, this works fine and gives a good impression of the prediction in the time domain (for task data). But it is essentially a hack (which is necessary since the generative model in rDCM does not make any predictions in the time domain) and can therefore produce incorrect/misleading results. Hence, it's always best to look at signal and prediction in the frequency domain (see above).

All the best, Stefan

ghost commented 3 years ago

Hi Stefan,

same issue here. I will first describe what I did. I've generated simulated BOLD timeseries using a code given by Smith. Then, using tapas_model_specification I turned the timeseries into DCM structure (U.u was zeros(16*N,U) where N was number of timepoints in the timeseries and U was the number of nodes). Subsequently, I simply fed the DCM structure into tapas_rdcm_estimate with arguments 'r', empty option, and 2 for sparse rDCM. Finally, I used tapas_rdcm_visualize to see whether the predicted signal resembles the true signal. However, the predicted signal is flat zero.

I'm wondering where I made a mistake. Would be great if you can help this out.

Best regards, Younghyun

BPaton commented 3 years ago

Hi Younghyun,

As Stefan mentioned in the last email it looks like what you obtained is not a mistake on your part but is the expected behaviour. The predicted signal in the time domain is not a valid measure of how well your rDCM fit instead the FFT of it is.

Thanks.

Bryan.

On Thu, 18 Mar 2021 at 18:54, bio7420 @.***> wrote:

Hi Stefan,

same issue here. I will first describe what I did. I've generated simulated BOLD timeseries using a code given by Smith. Then, using tapas_model_specification I turned the timeseries into DCM structure (U.u was zeros(16*N,U) where N was number of timepoints in the timeseries and U was the number of nodes). Subsequently, I simply fed the DCM structure into tapas_rdcm_estimate with arguments 'r', empty option, and 2 for sparse rDCM. Finally, I used tapas_rdcm_visualize to see whether the predicted signal resembles the true signal. However, the predicted signal is flat zero.

I'm wondering where I made a mistake. Would be great if you can help this out.

Best regards, Younghyun

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/translationalneuromodeling/tapas/issues/129#issuecomment-801711312, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABYA5WZLAPNPTIFGMTKOCRLTEGWUHANCNFSM4ZJMQWNQ .

StefanFraessle commented 3 years ago

Hi Younghyun,

Thanks a lot for the question and sorry that this is so misleading. I should probably remove the predicted BOLD signal time series from the output structure in the next release of the rDCM toolbox - at least for resting-state data. This is my bad not having done it properly in the initial release.

Bryan is absolutely correct (thanks a lot, Bryan): As I have explained in my response to Zeynep, the predicted BOLD signal time series is not the right thing to look at when comparing predicted vs true signal (see above for a more detailed explanation). Instead, you should look at the power spectral densities (PSD).

You can do that by simply calling tapas_rdcm_visualize(output, DCM, options, plot_regions) (i.e., leaving the final input argument empty) as plotting the PSD is the default visualization option. Alternatively, you can explicitly specify tapas_rdcm_visualize(output, DCM, options, plot_regions, 1).

Hope this helps.

All the best, Stefan

ghost commented 3 years ago

Thanks a lot for both of you.

I knew that the fft outputs were the ones that I had to look up. But the reason that I asked if I made any mistakes was because the outputs of both fft were very weird. For output.signal.yd_source_fft, every entry is zero and for output.signal.yd_pred_rdcm_fft only the first entry has a value and the rest are zeros again. Is this normal? If the information is not sufficient to guess as to where the possible mistakes occurred, I can share my code here.

Thanks,

Best regards, Younghyun

StefanFraessle commented 3 years ago

Hi Younghyun,

Thanks a lot for the clarification and my apologies for the initial misunderstanding. What you describe does indicate a mistake - this is not correct. In particular because output.signal.yd_source_fft is simply the FFT of the (derivative of the) data and is thus not a result of model inversion. So, something might be going wrong in how you input the data to tapas_rdcm_estimate.m.

Could you kindly share your code and the DCM structure that you enter into tapas_rdcm_estimate.m with me? You could send them to me via email (you can find the address in the header of the rDCM functions).

With very best wishes, Stefan

ghost commented 3 years ago

Hi Stefan,

Thanks a lot for the help. I've sent the email.

Best regards, Younghyun

zenkavi commented 3 years ago

Hi Stefan,

Thank you for your previous clarification.

I'm still trying to invert endogenous DCMs with rDCM and my connectivity estimates for simulated data look pretty bad. I'm trying to understand what I might be doing wrong. So reading your HBM paper again I wasn't clear on a few things that I hope you might be able to help with:

what distribution do you sample connectivity and hemodynamic parameters from in your simulated datasets?
e.g. for model 4 with 25% sparsity you say the model should have 3 connectivity parameters (which the figure also shows) but my understanding from your previous responses was that inhibitory self-connections were necessary. Do you set these to 0 in your models? Or tell the model not to estimate them somehow? Or does the model actually estimate 7 parameters (4 self connections and 3 connections with other nodes)?
when evaluating model fit for simulated data which is more appropriate to use? logF or the RMSE between the estimated and true connectivity parameters? I understand you might have chosen RMSE in the HBM paper because logF's across rDCM and spectralDCM might not have been comparable but e.g. if one was to compare tapas_rdcm_sparse.m results for different p0 values should the estimate with the highest logF always be the one with the lowest RMSE? (this is not the case in my simulations and I'm not sure how to decide on the better estimate)
the magnitude of estimates differ largely between my true and estimated values. I'm wondering if this might have anything to do with the priors for the connectivity parameters. Do I understand correctly that tapas_rdcm_spm_dcm_fmri_priors.m and tapars_rdcm_get_prior_all.m together set the priors for self-connectivity to be inhibitory and narrow and all other connectivity parameters to be centered on 0 with wide distributions? Would changing these priors break anything (I'm thinking that if the shape of the distributions are the same then conjugacy shouldn't be an issue)?

Sorry to keep bombarding you with questions. Thank you so much for your patience in answering them.

Best,

Zeynep

StefanFraessle commented 3 years ago

Hi Zeynep,

Here are my thoughts on your questions:

We sampled the true parameter values from the default prior distributions on the respective parameters in SPM12. With one additional criterion: Since for some of the parameters (e.g., connectivity parameters), the priors are shrinkage priors (with a mean of or close to zero), we ensured that sampled true parameters were "big enough" by simply resampling the value whenever it was too close to zero. I think we used a threshold of > 0.1.
My apologies if this was unclear. In the paper, we had used the term "connectivity parameters" very deliberately to describe the off-diagonal elements of the A matrix. The diagonal elements would be the intrinsic parameters or inhibitory self-connections. In other words, yes, for all models, the intrinsic connections were also estimates. That means for model 4, there were 7 free parameters that had to be estimated.
I would say that these are two different things. In order to choose the best p0 value (in the context of rDCM with sparsity constraints) or the best model in general, you always use the negative free energy as this is a measure of model goodness that takes into account both accuracy and complexity of the model. Conversely, if you are interested in how well a method can recover some ground truth parameter values, then you use something like the RSME or the Pearson's correlation between true and estimated parameter values. But these are two different scenarios. And regarding your questions: No, there is no principled reason why the best p0 should be the one with the highest negative free energy. I agree that this would be desirable, but there is no guarantee for this. For instance, there may be an explanation of the data that is sparser then the true connectivity pattern, which has a low complexity but still fits the data almost as well as the true (fuller) network architecture. In this case, its very likely that the RMSE of the sparser model (p0) will be worse but the negative free energy will be higher.
In principle, tapas_rdcm_get_prior_all.m is simply a wrapper function that calls tapas_rdcm_spm_dcm_fmri_priors.m to get the neuronal priors (and that additionally sets the noise priors). Changing these priors will certainly impact on the behavior of the model - but I would be a bit cautious to change them as there is typically - at least some - motivation for the chosen values. For instance, the zero mean ensures that the priors are so-called shrinkage priors which pull the effect closer to zero. This ensures a regularizing effect on the model which is meant to prevent overfitting. In other words, you can change the priors - but I would advise only doing so if you have a good motivation how and why to do it.

One additional thing that is probably worth highlighting. In the HBM paper, we only tested rDCM with fixed network architecture, but have not tested rDCM with sparsity constraints on resting-state data yet. Hence, I cannot say with certainty how well the model should behave... and, in fact, whether sparse rDCM works at all for resting-state fMRI. This is something that we still have to test ourselves in careful simulations and empirical applications. While I do not see a principled reason why it should not work, there is always the chance that the model does not perform as expected. Hence, I would advise to be cautious when working with rDCM with sparsity constraints in combination with resting-state data :-)

p.s: just for the sake of completeness: task-based fMRI data should be fine - this is what we have tested in the 2018 paper.

Hope this helps :-)

All the best, Stefan

zenkavi commented 3 years ago

Great! This is all very helpful. Thank you again for all your help!

translationalneuromodeling / tapas

rDCM: tapas_rdcm_generate.m returns 0's for endogenous DCMs #129