pwollstadt / IDTxl

The Information Dynamics Toolkit xl (IDTxl) is a comprehensive software package for efficient inference of networks and their node dynamics from multivariate time series data using information theory.
http://pwollstadt.github.io/IDTxl/
GNU General Public License v3.0

Multi-core? #17

Closed EtienneCmb closed 6 years ago

EtienneCmb commented 6 years ago

Hello !

Yesterday I installed IDTxl and it looks awesome. The documentation and wiki are really nice, especially the description of the mTE algorithm. I was playing with the mTE example (by the way, sphinx-gallery could be a nice option for your documentation). The example runs smoothly; however, on a subset of real data (e.g. (n_processes, n_samples, n_replications) = (6, 1000, 10)) it started to be really slow, and it's almost impossible to test it on my full dataset ((104, 1000, 825) for 17 subjects).

I want to run it on a supercomputer, but the multi-core doesn't seem to work. In addition, are there any settings I can tweak to speed up computations?

Thanx !

jlizier commented 6 years ago

Hi Etienne,

Thanks for the compliment.

If you're going to open this as an issue rather than an email request for help, then you'll need to provide some more details on what you mean by "the multi-core doesn't seem to work" (this is the most concrete statement of an "issue" I can find here). Running more slowly than the example doesn't mean it's not working; this may simply be how the algorithms scale. E.g., (single-threaded) runtime scales ~O(n log n) with the number of samples (1000 x 10 here) and ~O(n^2) with the number of processes. (Even for data of that size, we'd suggest putting it on a cluster and running each target node in parallel. Your real data set is very, very large, larger than the largest full test we've run, so you might want to try sub-sampling or using an order of magnitude fewer repeats, etc.)
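To get a rough feel for what that scaling implies, here is a back-of-envelope sketch (my own illustration, not an IDTxl measurement; it takes the asymptotic complexities quoted above at face value and ignores constants) comparing the "slow" subset to the full dataset:

```python
import math

def relative_cost(n_samples, n_procs, ref_samples, ref_procs):
    """Rough runtime ratio assuming O(n log n) in samples and O(p^2) in processes."""
    sample_factor = (n_samples * math.log(n_samples)) / (ref_samples * math.log(ref_samples))
    process_factor = (n_procs / ref_procs) ** 2
    return sample_factor * process_factor

# Subset that was already "really slow": 6 processes, 1000 samples x 10 replications.
# Full dataset: 104 processes, 1000 samples x 825 replications.
factor = relative_cost(1000 * 825, 104, 1000 * 10, 6)
print(f"full run ~{factor:.0f}x the subset's runtime")  # roughly 3.7e4 under these assumptions
```

So even if the subset took only minutes, the full dataset would be tens of thousands of times more work on a single core, which is why per-target parallelisation on a cluster is the suggested route.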

In any case, my gut feeling is that it's your expectations that aren't being met rather than the multi-core not working, unless you can provide some specific info about that. You could try changing the number of cores you're giving it access to and seeing what changes; but do bear in mind that there are significant overheads for multi-core on smaller data sets. So, please send more specific information or else close off the issue.

--joe

mwibral commented 6 years ago

Hi Etienne,

multi-core parallelism should work out of the box when using the multithreaded Java implementations of the estimators (jidt-...). Check the processor load with top or a similar tool. Multi-node parallelism (e.g. on a cluster) has to be implemented by hand, i.e. by running estimates for single targets distributed via the cluster queue (e.g. SGE, SLURM, Torque) and then piecing things together by collecting the results. This is deliberate: breaking the compute task up into smaller pieces lowers wait times on most supercomputer queues. Parallelisation across subjects "works" the same way.
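As an illustration of that by-hand split, here is a minimal sketch that generates one queue job per target. Everything in it is hypothetical scaffolding, not part of IDTxl: the SLURM-style directives would need adapting to your scheduler (SGE/Torque use different syntax), and `analyse_target.py` stands in for a small worker script you would write around the single-target analysis.

```python
from pathlib import Path

# Hypothetical job template: one queued job per target node. The worker script
# "analyse_target.py" (your own, not part of IDTxl) would run the single-target
# analysis and pickle its result for later reassembly.
JOB_TEMPLATE = """#!/bin/bash
#SBATCH --job-name=mte_target_{target}
#SBATCH --time=24:00:00
python analyse_target.py --target {target} --out result_target_{target}.pkl
"""

def write_job_scripts(n_targets, out_dir="jobs"):
    """Write one submission script per target and return their paths."""
    Path(out_dir).mkdir(exist_ok=True)
    paths = []
    for target in range(n_targets):
        path = Path(out_dir) / f"job_{target}.sh"
        path.write_text(JOB_TEMPLATE.format(target=target))
        paths.append(path)
    return paths  # submit each with e.g. `sbatch jobs/job_0.sh`

scripts = write_job_scripts(3)
print([p.name for p in scripts])
```

Collecting the per-target pickles afterwards and merging them into one network result is then a short post-processing step on the login node.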

This said, these computations simply do take a lot of time. I work mostly with continuous data (MEG, EEG source data; 5-7 nodes; 5-10 delays into the past, spaced by some 3-5 ms to roughly cover an alpha cycle), and computing the relevant TE sources into a single target can take up to a few days on a multi-core CPU. For your data size, especially in terms of nodes, I think splitting the computation into single-target computations is already a must.

GPUs may be an alternative: I have no really up-to-date hardware comparison data, but the speedup from an NVIDIA Titan (1st gen.) over a single Xeon core (Sandy Bridge generation) is roughly 10-fold. Performance per dollar is slightly in favour of the GPU, and clusters often simply have far more GPU than CPU compute these days.

Best,

Michael

mwibral commented 6 years ago

Hi Etienne,

after seeing pwollstadt's reply, I had another look at your dataset size. 104 nodes in the network may be too large to analyse. Any chance you could reduce that number in a meaningful way? I am asking not only because of the time it will take to compute this, but also because of the ensuing "problems" with corrections for multiple comparisons.

Best, Michael

jlizier commented 6 years ago

Hi all,

A couple of extra comments from me.

First, although we didn't say this explicitly, pretty much all of @mwibral's and my comments re automatic multi-threading (i.e. per estimator, not the per-node parallelisation), computation time, and complexity refer to the Kraskov (KSG) estimators. That's really because it's the default estimator in our heads. If you bin/discretise the data, or use the Gaussian estimator, those estimators do not provide multi-threading, but on the other hand they scale as O(n) in the number of samples, so they are much faster anyway.

So on that note, you may wish to consider these alternative estimators if you want to generate some fast results for your data sets. They will be much, much faster than KSG, though this comes at the expense of accuracy in the information-theoretic estimates. With that said, they will give a useful first-order analysis. If you try this, remember that each estimator runs single-threaded, so you can run hand-crafted multi-node parallelisation with as many processes as you have cores available.
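Concretely, switching estimators is a matter of the settings dict passed to the analysis. The sketch below assumes IDTxl's estimator names (`'JidtGaussianCMI'` in place of the default `'JidtKraskovCMI'`) and uses purely illustrative lag values; check the current documentation before relying on either:

```python
# Hedged sketch: selecting the fast Gaussian estimator in the settings dict.
# 'JidtGaussianCMI' assumes a linear-Gaussian model and scales as O(n),
# whereas the default KSG estimator ('JidtKraskovCMI') is ~O(n log n) per estimate.
settings = {
    'cmi_estimator': 'JidtGaussianCMI',  # fast, linear approximation
    'max_lag_sources': 5,                # illustrative values, not a recommendation
    'min_lag_sources': 1,
}
# Usage would then be something like:
#   network_analysis = MultivariateTE()
#   results = network_analysis.analyse_single_target(settings, data, target=0)
print(settings['cmi_estimator'])
```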

Also, those faster estimators permit an analytic calculation of the null/surrogate distributions used for computing p-values, which makes them much faster again. (For internal discussion: @pwollstadt / Leo -- this was implemented at one stage, but I can't see in the documentation or code how to switch it on anymore -- has this been removed?)

--joe

mwibral commented 6 years ago

Hi all,

Maybe Leo could say something about the feasibility of estimating mTE in a network with 104 nodes using the linear (Gaussian) approximation and the analytic null.

Michael

EtienneCmb commented 6 years ago

Hi all !

@jlizier Indeed, this might not be an issue, but I thought a public discussion might be interesting for others. Feel free to close the issue.

@mwibral I tried the Gaussian CMI estimator and indeed, computations are much faster. And yes, I'm probably going to decimate my data by downsampling and compute on a reduced number of sources.

I took a look at the MultivariateTE.analyse_network method, which basically loops over targets and runs the analyse_single_target method. Might it be interesting to use something like joblib to parallelise this loop?
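A minimal sketch of that idea, using only the standard library (`concurrent.futures` in place of joblib, to avoid any new dependency); `analyse_one_target` is a hypothetical stand-in for the real per-target call, which would be `MultivariateTE.analyse_single_target(settings, data, target)`:

```python
from concurrent.futures import ProcessPoolExecutor

def analyse_one_target(target):
    """Hypothetical stand-in for the real single-target analysis."""
    return (target, target ** 2)  # dummy "result" for illustration

def analyse_network_parallel(targets, max_workers=None):
    """Run the per-target analyses in a process pool and collect results by target."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(analyse_one_target, targets))

if __name__ == "__main__":
    results = analyse_network_parallel(range(4))
    print(results)  # {0: 0, 1: 1, 2: 4, 3: 9}
```

A process pool (rather than threads) sidesteps the GIL for CPU-bound per-target work; the trade-off is that `data` and `settings` would need to be picklable or re-loaded inside each worker.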

Thanks,

EtienneCmb commented 6 years ago

I made a quick test using joblib on the MultivariateTE.analyse_network method. On my machine, the mTE example is computed in ~30 seconds. Using joblib on my 8-core machine it takes ~15 seconds, so I think that for larger datasets with a larger number of sources it might be interesting.

Also, if you don't want another dependency, it's possible to have something like this:

    try:
        from joblib import Parallel, delayed
        # Compute the targets in parallel
    except ImportError:
        # Fall back to the single-threaded loop
        pass
LNov commented 6 years ago

Hi Etienne,

to give you an idea, analysing a single target with the Gaussian CMI estimator (on a sizeable machine with 96 cores) takes about 5 minutes for a sparse network of 100 nodes, 10k samples, and 1 replication (which is one order of magnitude fewer samples*replications than in your dataset).

Thank you for the suggestion about joblib. We'll include an example in the wiki on how to parallelise over targets, but leave it to users to pick their favourite libraries (I used SCOOP, for example).

Best, Leo

EtienneCmb commented 6 years ago

Thanks @LNov for the estimate. Can you please let me know once the example is added to the wiki?

pwollstadt commented 6 years ago

Hi all, thanks for the discussion. I will close this issue. In case you want to discuss this topic further, I have opened a Google group for discussions on usage and other topics related to IDTxl.

LNov commented 5 years ago

A tutorial on how to parallelise the analysis over targets using a computing cluster is now available in the Wiki. The example is specific to the PBS Job Scheduler but can be used as a template and adapted to work with other job scheduling systems:

https://github.com/pwollstadt/IDTxl/wiki/Parallel-Analysis-Using-PBS-Job-Scheduler