pwollstadt / IDTxl

The Information Dynamics Toolkit xl (IDTxl) is a comprehensive software package for efficient inference of networks and their node dynamics from multivariate time series data using information theory.
http://pwollstadt.github.io/IDTxl/
GNU General Public License v3.0

NumbaCudaKraskovCMI not found #87

Closed Rhydderch closed 1 year ago

Rhydderch commented 1 year ago

Hi,

I installed IDTxl in a conda environment and was able to use the OpenCL estimators successfully. However, the CUDA estimators don't seem to be included in this version of the package?

[screenshot]

Does that mean support for the Numba estimator has been dropped? If so, the docs are up to date, but this page is not: https://github.com/pwollstadt/IDTxl/wiki/CMI-Estimators
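For reference, this is roughly the kind of settings dict that triggers the lookup (lag values illustrative; on the release branch the estimator name cannot be resolved):

```python
from idtxl.multivariate_te import MultivariateTE

# Requesting the CUDA estimator by name; on the pip/conda release this
# fails because the NumbaCudaKraskovCMI class is not part of the package.
settings = {'cmi_estimator': 'NumbaCudaKraskovCMI',
            'max_lag_sources': 3,
            'min_lag_sources': 1}
network_analysis = MultivariateTE()  # estimator name is resolved once the analysis runs
```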

mwibral commented 1 year ago

Hi Rhydderch,

thank you for pointing this out. We will fix the documentation. However, the code you were looking for may still be on the dev_numba branch?

Best Regards, Michael Wibral


Rhydderch commented 1 year ago

Hi Michael,

I did not think of checking the dev_numba branch. Thanks for letting me know. (Though, I'm already quite happy with the substantial improvement OpenCL offers)

By the way, thank you for making this package.

Best regards, Rhydderch

mwibral commented 1 year ago

Dear Rhydderch,

just wanted to let you know that OpenCL GPU acceleration is particularly good for problems with a medium amount of data (say <=150,000 samples) and a high number of surrogates (>200).

Above that, CPUs work better because their algorithm scales better (n log n instead of n²). We also have MPI-based acceleration for parallelisation across CPU cores (possibly also still just in a dev branch, but very mature). It works well on a single computer with many cores (we have some dual-socket machines with 128 cores here, and it scales very well). Using multiple machines is also technically possible, but at present does not scale well. We are working on this.
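For anyone reading along, a minimal sketch of switching between the estimator back-ends via the settings dict (toy random data; only the estimator name changes between the CPU and GPU paths):

```python
import numpy as np
from idtxl.data import Data
from idtxl.multivariate_te import MultivariateTE

# Toy data: 5 processes, 10,000 samples, 1 replication ('psr' order).
data = Data(np.random.randn(5, 10000, 1), dim_order='psr')

settings = {'cmi_estimator': 'JidtKraskovCMI',  # CPU; scales ~ n log n
            # 'cmi_estimator': 'OpenCLKraskovCMI',  # GPU; good for medium n
            'max_lag_sources': 3,
            'min_lag_sources': 1}
results = MultivariateTE().analyse_network(settings=settings, data=data)
```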

Michael


Rhydderch commented 1 year ago

Thanks for letting me know! Luckily, I'm only working with an array of 6 processes, 56 samples, 40 replications (participants, in my case).

May I email you with some questions regarding the use of transfer entropy with psychological data? I'm new to information theory and am apparently the first person to apply transfer entropy to psychological time series, so I am a bit afraid of making mistakes (e.g., drawing wrong inferences from the results). I did get it to run from within R on a Windows machine, so I'm quite happy about that.
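In case it helps to be concrete, loading data of that shape into IDTxl looks roughly like this (random placeholder values stand in for the real scores; one replication per participant):

```python
import numpy as np
from idtxl.data import Data

# Shape (processes, samples, replications) = (6, 56, 40).
scores = np.random.randn(6, 56, 40)
data = Data(scores, dim_order='psr')
```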

mwibral commented 1 year ago

Hmm, 56 samples x 40 replications = 2,240 samples sounds like very little data. I would assume that you will not get statistically significant results from our network analysis.

I typically do not start with less than 10,000 data points, but let us know how it works for you.

Best, Michael


Rhydderch commented 1 year ago

I tried several lags (max lag 5, min lag of either 1 or 0) and, to my surprise, found several highly significant results at p < .002 (using the Kraskov estimator).

Result for a single target (fdr=True) after a network analysis with {'cmi_estimator': 'OpenCLKraskovCMI', 'max_lag_sources': 3, 'min_lag_sources': 1}:

{'sources_tested': [0, 1, 2, 3, 4], 
'current_value': (5, 4), 
'selected_vars_target': [(5, 1), (5, 4)], 
'selected_vars_sources': [(1, 0), (0, 0), (1, 3), (1, 1)], 
'selected_sources_pval': array([0.002, 0.002, 0.002, 0.002]), 
'selected_sources_te': array([0.10063591, 0.10319266, 0.08619502, 0.08437258]), 
'omnibus_te': array([0.17961366]), 
'omnibus_pval': 0.002, 
'omnibus_sign': True, 
'te': array([0.10093138, 0.12595904])}
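(For completeness: a dict like the one above is pulled from the results object roughly as below; `results` is assumed to be the return value of a prior analyse_network() call.)

```python
# fdr=True returns the FDR-corrected results, as above.
single_target = results.get_single_target(target=5, fdr=True)
print(single_target['selected_sources_te'])
print(single_target['omnibus_pval'])
```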

I am still pondering two main questions: (1) Does it make sense to include a minimum lag of zero? Theoretically it does (the samples come from 4 observations per day over 14 days), but I don't know whether it does computationally speaking. Some values of omnibus transfer entropy reach >0.4. (2) Which max lag value is best? I am thinking of plotting mutual information as a function of lag (as suggested by Patricia here) to determine the best maximum lag, though I suspect this could lead to a max lag of 56, since in psychology any self-reported response carries some stable information.
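(To make the lag-scan idea in (2) concrete, here is a rough sketch using plain autocorrelation; an auto-mutual-information scan via JIDT would follow the same pattern. The series below is a random placeholder.)

```python
import numpy as np

def lagged_autocorr(x, max_lag):
    """Autocorrelation of a 1D series at lags 1..max_lag."""
    x = (x - x.mean()) / x.std()
    n = len(x)
    return [float(np.dot(x[:-lag], x[lag:]) / n) for lag in range(1, max_lag + 1)]

# One participant's series of 56 samples (placeholder data).
x = np.random.randn(56)
for lag, r in enumerate(lagged_autocorr(x, 10), start=1):
    print(f"lag {lag:2d}: r = {r:+.3f}")
```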

Additionally, (3) Do you have any recommendations for visualizing the transfer entropy results beyond the built-in plotting functions? Ideally, I would like to plot a network of weighted, directed edges. I have read your documentation, but I am unfamiliar with Python, so I might have missed an option you provide. For instance, the plot_network function plots networks, but offers no option to use TE values as edge weights. My best option is probably to extract the TE values into R and plot a weighted network there (I am more comfortable in R than Python), but I haven't yet worked out how best to reflect the important nuance highlighted in this passage: "Key Idea 39: Iterative or greedy approaches with conditional transfer entropy infer an effective network in which a directed link indicates that the source is a parent of the target, in conjunction with the other parent nodes. It does not necessarily imply that a parent source provides any unique directed pairwise information to the target." (Bossomaier et al., 2016, p. 148)
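(Sketching the weighted-network idea in (3) in Python with networkx, in case it's useful; the edge list here is hypothetical, assembled by hand from selected_vars_sources / selected_sources_te and aggregated over lags per source-target pair.)

```python
import matplotlib.pyplot as plt
import networkx as nx

# Hypothetical (source, target, TE) triples, e.g. process 1 -> target 5.
edges = [(1, 5, 0.10), (0, 5, 0.10), (2, 4, 0.08)]

G = nx.DiGraph()
for src, tgt, te in edges:
    G.add_edge(src, tgt, te=te)

# Scale edge widths by TE so stronger links are drawn thicker.
pos = nx.spring_layout(G, seed=1)
nx.draw_networkx(G, pos, arrows=True,
                 width=[10 * d['te'] for _, _, d in G.edges(data=True)])
plt.show()
```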

(4) I also have some uncertainty about interpreting the transfer entropy values and about normalizing them to quantify how much information the included sources provide about the target beyond the target's own past. I've read chapters 4.5.2 and 7.2 of Bossomaier et al. (2016), but I am not sure I understand: since I used the Kraskov estimator, are the TE values already normalized? If so, are they isomorphic to explained variance (R²)? If not, how can I best compute normalized values?

Please excuse the rather long message; I really appreciate your answers so far, and this exchange lets me get answers to questions that might otherwise go unanswered (none of my colleagues use information theory, so I'm learning by myself with resources such as Prof. Lizier's YouTube playlist and further reading).

Best, Yorgo Hoebeke

Rhydderch commented 1 year ago

An update regarding the above questions: (1) I included the lag of zero, as it makes sense theoretically and given the spacing between measurements. (2) I computed autocorrelation and auto-mutual information for each variable (easy to do with JIDT thanks to its lag parameter) and settled on a max lag of 19 (i.e., about 5 days' worth of measurements), as the AMI plateaued around there. (3) In addition to the plots provided by IDTxl, I created a network visualization showing the transfer entropy from each source to each target, the omnibus transfer entropy for each target, and the active information storage (computed separately). (4) I don't normalize the values, as I understand there is currently no way to do this when using differential entropy.

mwibral commented 1 year ago

Hi Yorgo,

thanks for the update. Let us know whenever you need more assistance.

Best,

Michael

