dorianps / LESYMAP

Lesion to Symptom Mapping in R
https://dorianps.github.io/LESYMAP/
Apache License 2.0

LESYMAP results stable when run locally, vary on compute cluster #23

Open · Aaron-Boes opened this issue 4 years ago

Aaron-Boes commented 4 years ago

We are noticing that repeated runs of LESYMAP with the same input (lesion masks and behavioral data) produce slightly different results each time on a high-performance computing cluster, but produce identical, stable results each time when run locally. The packages are identical in both environments, copied below.

ANTsR 0.5.6.0.0 (2019-11-06)
ANTsRCore 0.7.4.6 (2020-04-12)
ITKR 0.5.3.2.0 (2020-03-04)
LESYMAP 0.0.0.9221 (2019-07-30)

When the results differ on the HPC cluster, the divergence appears as early as the first sparseness check, with differing CV correlation and cost values, and differing results from that point on. The results are similar but can vary between yielding a significant map and a non-significant one.

Is there a reason why the results would differ between these two ways of running LESYMAP? Is there a way we can verify which is the intended, valid result?

Thanks!

dorianps commented 4 years ago

Hmmm, I am trying to think what might cause this. If the CV cost values are changing, maybe it is the way the sample is split for cross-validation. Lesymap normally repeats the cross-validations 3-6 times to avoid this problem (depending on the sample size). You can try manually setting cvRepetitions=10, which will be slower but should hopefully mitigate unusually bad splits. From what you say, it also looks like the results could be borderline significant, which makes me think the 4-fold validation is eating up useful signal, because the search is always performed on only 75% of the subjects. Maybe you can use more folds, something like nFolds = 6, etc.
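For reference, the call would look something like this (a minimal sketch; I am assuming your lesion images and behavior scores are already loaded under the placeholder names lesions and behavior):

```r
library(LESYMAP)

# Placeholder inputs: 'lesions' is the list of lesion images (or a lesion
# matrix) and 'behavior' is the numeric vector of scores.
fit <- lesymap(lesions, behavior, method = 'sccan',
               cvRepetitions = 10,  # more CV repeats to average out bad splits
               nFolds = 6)          # more folds, so training uses ~83% of subjects
```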

If you have a chance, please paste the search output from the local and HPC runs, to see how bad the deviation is. I am curious whether the resulting sparseness is very different between the two.

brussj commented 4 years ago

Dorian-

Someone else will most likely add more data, but I have these results (same dataset) from duplicate runs on a cluster. Both of my runs came back null, but this same dataset sometimes produces a significant result. Here is the output as it ran:

first run
10:50:47 Running analysis: sccan ...
Searching for optimal sparseness: lower/upper bound: -0.9 / 0.9
cvRepetitions: 3
nFolds: 4
sparsenessPenalty: 0.03
optim tolerance: 0.03
10:50:54 Checking sparseness -0.212 . . . CV correlation 0.147 (0.538) (cost=0.859)
11:50:19 Checking sparseness 0.212 . . . CV correlation 0.0548 (0.381) (cost=0.952)
12:14:16 Checking sparseness -0.475 . . . CV correlation 0.158 (0.549) (cost=0.857)
13:18:43 Checking sparseness -0.361 . . . CV correlation 0.152 (0.545) (cost=0.859)
14:24:13 Checking sparseness -0.637 . . . CV correlation 0.150 (0.565) (cost=0.869)
15:33:24 Checking sparseness -0.537 . . . CV correlation 0.153 (0.561) (cost=0.863)
16:40:28 Checking sparseness -0.431 . . . CV correlation 0.159 (0.546) (cost=0.854)
17:51:34 Checking sparseness -0.421 . . . CV correlation 0.159 (0.546) (cost=0.853)
18:58:14 Checking sparseness -0.398 . . . CV correlation 0.157 (0.547) (cost=0.855)
20:00:35 Checking sparseness -0.411 . . . CV correlation 0.159 (0.546) (cost=0.854)
21:02:47 Checking sparseness -0.421 . . . CV correlation 0.159 (0.546) (cost=0.853)
Found optimal sparsenes -0.421 (CV corr=0.159 p=0.0558)
WARNING: Poor cross-validated accuracy, returning NULL result.

second run
10:50:47 Running analysis: sccan ...
Searching for optimal sparseness: lower/upper bound: -0.9 / 0.9
cvRepetitions: 3
nFolds: 4
sparsenessPenalty: 0.03
optim tolerance: 0.03
10:50:54 Checking sparseness -0.212 . . . CV correlation 0.139 (0.535) (cost=0.867)
11:43:09 Checking sparseness 0.212 . . . CV correlation 0.0416 (0.433) (cost=0.965)
12:07:08 Checking sparseness -0.475 . . . CV correlation 0.145 (0.558) (cost=0.869)
13:01:14 Checking sparseness -0.335 . . . CV correlation 0.139 (0.547) (cost=0.871)
13:55:56 Checking sparseness -0.05 . . . CV correlation 0.151 (0.511) (cost=0.851)
14:47:11 Checking sparseness 0.05 . . . CV correlation 0.0461 (0.415) (cost=0.955)
15:18:53 Checking sparseness -0.112 . . . CV correlation 0.127 (0.532) (cost=0.876)
16:11:30 Checking sparseness -0.012 . . . CV correlation 0.0914 (0.426) (cost=0.909)
17:03:17 Checking sparseness -0.074 . . . CV correlation 0.121 (0.516) (cost=0.881)
17:55:54 Checking sparseness -0.036 . . . CV correlation 0.138 (0.521) (cost=0.863)
18:47:30 Checking sparseness -0.06 . . . CV correlation 0.140 (0.519) (cost=0.862)
19:39:50 Checking sparseness -0.05 . . . CV correlation 0.151 (0.511) (cost=0.851)
Found optimal sparsenes -0.05 (CV corr=0.151 p=0.0705)
WARNING: Poor cross-validated accuracy, returning NULL result.

dorianps commented 4 years ago

Yes, it looks like the first run ended up with better CV splits, even at identical sparseness values. So I think it's a good idea to increase cvRepetitions, which will increase the processing time linearly. The CV instability you show above seems to indicate an issue with the models fitting some subjects, so I am not sure increasing nFolds will help, but you can try that, too.

This behavior you are mapping seems tough to pinpoint despite the number of subjects. But this depends on the nature and extent of the lesions, too. Out of curiosity, can you share what behavioral score this is? Also, does a univariate analysis give you any reasonable outcome?
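(For a univariate check, something along these lines should work; a sketch with the same placeholder inputs as above, where 'BMfast' is one of the univariate methods listed in the LESYMAP documentation:)

```r
library(LESYMAP)

# Univariate Brunner-Munzel map on the same inputs, for comparison
# with the multivariate sccan result.
fit.uni <- lesymap(lesions, behavior, method = 'BMfast')
```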

Fatimah-A commented 4 years ago

Hi Dorian,

Here are my runs, locally and on the HPC. This is the same test that Brussj posted; it's WRAT-4 Reading Comprehension (N = 145). The univariate tests came back with null findings both locally and on the HPC. This is just one example: I've tried 3-4 other tests that behaved the same way when run locally vs. HPC, giving identical results across multiple runs on local machines and varied results across multiple runs on the HPC, with none of the HPC results identical to the findings on the local machines.

Local – Trial 1
13:07:16 Checking sparseness -0.212 . . . CV correlation 0.181 (0.469) (cost=0.825)
13:49:59 Checking sparseness 0.212 . . . CV correlation 0.0662 (0.312) (cost=0.940)
14:06:21 Checking sparseness -0.475 . . . CV correlation 0.169 (0.492) (cost=0.845)
14:49:18 Checking sparseness -0.269 . . . CV correlation 0.176 (0.478) (cost=0.832)
15:33:44 Checking sparseness -0.05 . . . CV correlation 0.161 (0.440) (cost=0.840)
16:16:12 Checking sparseness -0.179 . . . CV correlation 0.178 (0.465) (cost=0.828)
17:00:32 Checking sparseness -0.202 . . . CV correlation 0.175 (0.467) (cost=0.831)
17:44:43 Checking sparseness -0.234 . . . CV correlation 0.179 (0.476) (cost=0.828)
18:29:12 Checking sparseness -0.222 . . . CV correlation 0.186 (0.475) (cost=0.820)
19:13:12 Checking sparseness -0.222 . . . CV correlation 0.186 (0.475) (cost=0.820)
Found optimal sparsenes -0.222 (CV corr=0.186 p=0.0247)

Local – Trial 2
13:08:19 Checking sparseness -0.212 . . . CV correlation 0.181 (0.469) (cost=0.825)
13:51:11 Checking sparseness 0.212 . . . CV correlation 0.0662 (0.312) (cost=0.940)
14:07:37 Checking sparseness -0.475 . . . CV correlation 0.169 (0.492) (cost=0.845)
14:50:48 Checking sparseness -0.269 . . . CV correlation 0.176 (0.478) (cost=0.832)
15:35:04 Checking sparseness -0.05 . . . CV correlation 0.161 (0.440) (cost=0.840)
16:17:11 Checking sparseness -0.179 . . . CV correlation 0.178 (0.465) (cost=0.828)
17:01:17 Checking sparseness -0.202 . . . CV correlation 0.175 (0.467) (cost=0.831)
17:45:28 Checking sparseness -0.234 . . . CV correlation 0.179 (0.476) (cost=0.828)
18:29:49 Checking sparseness -0.222 . . . CV correlation 0.186 (0.475) (cost=0.820)
19:13:34 Checking sparseness -0.222 . . . CV correlation 0.186 (0.475) (cost=0.820)
Found optimal sparsenes -0.222 (CV corr=0.186 p=0.0247)

HPC – Trial 1
10:00:11 Checking sparseness -0.212 . . . CV correlation 0.200 (0.488) (cost=0.807)
10:48:00 Checking sparseness 0.212 . . . CV correlation 0.0975 (0.326) (cost=0.909)
11:04:40 Checking sparseness -0.475 . . . CV correlation 0.173 (0.486) (cost=0.841)
11:54:06 Checking sparseness -0.222 . . . CV correlation 0.210 (0.485) (cost=0.797)
12:45:24 Checking sparseness -0.329 . . . CV correlation 0.197 (0.489) (cost=0.813)
13:33:25 Checking sparseness -0.263 . . . CV correlation 0.199 (0.489) (cost=0.809)
14:21:27 Checking sparseness -0.237 . . . CV correlation 0.206 (0.487) (cost=0.801)
15:09:39 Checking sparseness -0.222 . . . CV correlation 0.210 (0.485) (cost=0.797)
Found optimal sparsenes -0.222 (CV corr=0.21 p=0.0114)

HPC – Trial 2
13:46:11 Checking sparseness -0.212 . . . CV correlation 0.0939 (0.471) (cost=0.913)
14:30:20 Checking sparseness 0.212 . . . CV correlation 0.0536 (0.329) (cost=0.953)
14:46:35 Checking sparseness -0.475 . . . CV correlation 0.0693 (0.454) (cost=0.945)
15:31:27 Checking sparseness -0.149 . . . CV correlation 0.0963 (0.471) (cost=0.908)
16:15:19 Checking sparseness -0.011 . . . CV correlation 0.0806 (0.407) (cost=0.920)
16:58:12 Checking sparseness -0.136 . . . CV correlation 0.0834 (0.460) (cost=0.921)
17:42:09 Checking sparseness -0.159 . . . CV correlation 0.0939 (0.473) (cost=0.911)
18:26:08 Checking sparseness -0.149 . . . CV correlation 0.0963 (0.471) (cost=0.908)
Found optimal sparsenes -0.149 (CV corr=0.096 p=0.249)

Thanks!

dorianps commented 4 years ago

Hi @Fatimah-A. This again looks like an issue with the k-fold splitting. I had a quick look around and may need to dig in further to make sure the splits are different and random each time CV happens. This will take me some time, but meanwhile I wanted to ask a few questions that might shed more light on the issue.

  1. We use a function borrowed from caret to split the folds. That function relies on stratification of the folds based on the behavioral variable; that is, we want to make sure each fold is not fully random but covers as much as possible of the signal range available in the full dataset. If fold selection were fully random, it could happen that all subjects with "bad" behavior end up in a fold together, and the training set would not be representative of the cases expected at test time. A problem may arise when the behavioral score is not heterogeneous enough, i.e., it has many identical scores. A somewhat similar issue is reported here. So the question becomes: what is the nature of the behavioral variable, is it heterogeneous enough, does it have many repeated values, does it have a good histogram distribution? (A small sketch of the fold splitting is below, after these questions.)

  2. Do you have the same outcome locally when you restart R (or the computer), or do you see it only when you repeat the analysis in the same R session? I am asking because I suspect there might be something wrong with the random seed used to split the folds. Maybe it is stuck during the same session, which would explain why the local run is always identical. I will need to check this myself, too, but for now your experience may give some hints.

  3. One thing to try is what I suggested above, i.e., increase cvRepetitions to 10. This will increase the processing time a lot (which is already long in your case), but we can see if that brings more alignment between local and HPC runs.
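To illustrate points 1 and 2, here is a minimal sketch of stratified fold splitting with caret (the exact call inside LESYMAP may differ, and the behavior vector here is made up to mimic many tied scores):

```r
library(caret)

# Hypothetical score vector with many identical values, as suspected for WRAT.
behavior <- c(rep(95, 40), rep(100, 50), rep(105, 30), round(rnorm(25, 100, 15)))

# createFolds() stratifies a numeric outcome by cutting it into quantile
# groups; many ties can make those groups degenerate.
set.seed(123)
foldsA <- createFolds(behavior, k = 4)
set.seed(123)
foldsB <- createFolds(behavior, k = 4)
identical(foldsA, foldsB)   # TRUE: same seed, same split

foldsC <- createFolds(behavior, k = 4)  # no seed reset this time
identical(foldsA, foldsC)   # almost surely FALSE: a fresh random split
```

If the RNG state were somehow fixed (or reused) in one environment but not the other, you would get exactly the pattern reported here: identical local runs and varying HPC runs.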

brussj commented 4 years ago

Dorian-

Attached is a really bad pic of the distribution of the scores, but it gives you an idea of the range and distribution. I've not run these locally myself; I always just batch-submit them to our high-performance cluster with a simple bash wrapper that loads the R module and then runs the Rscript containing the lesymap setup and execution. I may be hazy on how the cluster works, but I believe each submission gets farmed out to its own node(s) and is agnostic to other submissions on other nodes.

Fatimah has done many more of these analyses and will be better at telling you all the different ways she ran this and the output she received. If you need it, I could always post the lesion matrix and behavior scores, just let me know. If you do need them, just contact me privately with an email and I can send you a link to a OneDrive folder.

-Joel

[Attached image: WRAT_scores (distribution of the WRAT scores)]

dorianps commented 4 years ago

Looks like there are many identical scores; I am not sure how createFolds deals with that. More than the lesion matrix, it would be useful for me to have this vector of behavior scores. Let's see if increasing cvRepetitions helps in some way; I have limited time to troubleshoot this for another couple of weeks.
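(If you want to quantify the ties yourself, a quick check like this would do; I am assuming a simple one-score-per-line text file, so adjust the read call to the real layout:)

```r
# Hypothetical file layout: one score per line.
scores <- scan("WRAT_scores.txt")

length(unique(scores))                        # distinct values among the subjects
head(sort(table(scores), decreasing = TRUE))  # the most repeated scores
hist(scores, breaks = 20)                     # shape of the distribution
```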

brussj commented 4 years ago

Here are the scores. IDs changed to 1-145, but that shouldn't matter.

WRAT_scores.txt

Fatimah-A commented 4 years ago

Hi Dorian,

1) I guess Brussj may have taken care of this, but this is not the only test that behaved this way. Please let me know if you want other datasets. This happened to be the test used to try the different methods (in answer #3), but not the only one tested.

2) I ran my tests in RStudio in different sessions and also using R scripts, and the results are always identical. By "locally" we mean our department's computers (not my own desktop/laptop and not the HPC, but with much higher RAM than my local machines). I ran the tests not only in different RStudio sessions but also on different machines, and always got identical results. There is only one time that I ran a test in the same RStudio session and got different findings than my first run (both were null findings). A third run of the same test, in a new session on a different machine, was identical to the first run. I don't have the sparseness check values, but this is from the info.txt file:

Run 1 and 3:
optimalSparseness: -0.00774664161363611
CVcorrelation.stat: 0.0430064757281513
CVcorrelation.pval: 0.256498392466321

Run 2:
optimalSparseness: -0.0665048631138476
CVcorrelation.stat: 0.0556642694649579
CVcorrelation.pval: 0.141795879149721

This is the other very minor variation I noticed (which may not be an issue) when running the Boston Naming Test multiple times locally on different machines (the resulting maps are 100% identical, and all other variables are identical as well):

Boston Naming Run 1 and 2:
CVcorrelation.stat: 0.523175241299837
CVcorrelation.pval: 7.38520466215419e-45

Boston Naming Run 3:
CVcorrelation.stat: 0.523175241299838
CVcorrelation.pval: 7.38520466215397e-45

Boston Naming HPC run (this has 4 clusters while locally it has 3; the statistical range and everything else also differ from the local run):
CVcorrelation.stat: 0.543431266392423
CVcorrelation.pval: 6.30157657335495e-49

3) This has been running in the background since yesterday and is still not done... I did 2 runs on HPC but haven't run it locally. Do you want me to run it locally too? This is what we have so far for the 2 identical tests running now (the same test reported in my previous post):

Setting cvRepetitions = 10, Trial 1:
Checking sparseness -0.212 . . . . . . . . . . CV correlation 0.154 (0.470) (cost=0.852)
Checking sparseness 0.212 . . . . . . . . . . CV correlation 0.0481 (0.305) (cost=0.958)
Checking sparseness -0.475 . . . . . . . . . . CV correlation 0.135 (0.465) (cost=0.879)
Checking sparseness -0.244 . . . . . . . . . . CV correlation 0.148 (0.469) (cost=0.859)
Checking sparseness -0.05 . . . . . . . . . . CV correlation 0.135 (0.445) (cost=0.866)
Checking sparseness -0.159 . . . . . . . . . . CV correlation 0.151 (0.472) (cost=0.854)

Setting cvRepetitions = 10, Trial 2:
Checking sparseness -0.212 . . . . . . . . . . CV correlation 0.107 (0.490) (cost=0.899)
Checking sparseness 0.212 . . . . . . . . . . CV correlation 0.061 (0.330) (cost=0.945)
Checking sparseness -0.475 . . . . . . . . . . CV correlation 0.0855 (0.486) (cost=0.929)
Checking sparseness -0.169 . . . . . . . . . . CV correlation 0.114 (0.492) (cost=0.892)
Checking sparseness -0.023 . . . . . . . . . . CV correlation 0.123 (0.420) (cost=0.878)
Checking sparseness 0.022 . . . . . . . . . . CV correlation 0.0799 (0.334) (cost=0.921)
Checking sparseness -0.079 . . . . . . . . . . CV correlation 0.115 (0.481) (cost=0.887)

Setting cvRepetitions = 10 and nFolds = 6, Trial 1:
Checking sparseness -0.212 . . . . . . . . . . CV correlation 0.156 (0.464) (cost=0.851)
Checking sparseness 0.212 . . . . . . . . . . CV correlation 0.0375 (0.296) (cost=0.969)
Checking sparseness -0.475 . . . . . . . . . . CV correlation 0.133 (0.465) (cost=0.882)
Checking sparseness -0.241 . . . . . . . . . . CV correlation 0.152 (0.459) (cost=0.855)
Checking sparseness -0.05 . . . . . . . . . . CV correlation 0.164 (0.466) (cost=0.838)

Setting cvRepetitions = 10 and nFolds = 6, Trial 2:
Checking sparseness -0.212 . . . . . . . . . . CV correlation 0.158 (0.468) (cost=0.848)
Checking sparseness 0.212 . . . . . . . . . . CV correlation 0.0517 (0.329) (cost=0.955)
Checking sparseness -0.475 . . . . . . . . . . CV correlation 0.145 (0.470) (cost=0.869)
Checking sparseness -0.261 . . . . . . . . . . CV correlation 0.154 (0.475) (cost=0.854)
Checking sparseness -0.05 . . . . . . . . . . CV correlation 0.146 (0.446) (cost=0.856)

Each one of these tests is in a different R session.

Thanks & I hope this helps!

dorianps commented 4 years ago

Can you also paste the output of library(LESYMAP) ; sessionInfo() so I can see which R version and packages are installed?

Fatimah-A commented 4 years ago

Local:
R version 3.6.0 (2019-04-26)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so

locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] LESYMAP_0.0.0.9221 ANTsR_0.5.6.0.0 ANTsRCore_0.7.4.6

loaded via a namespace (and not attached): [1] compiler_3.6.0 RcppEigen_0.3.3.7.0 magrittr_1.5
[4] Matrix_1.2-17 lmPerm_2.1.0 tools_3.6.0
[7] ITKR_0.5.3.2.0 Rcpp_1.0.4.6 grid_3.6.0
[10] lattice_0.20-38

HPC:
R version 3.5.1 (2018-07-02)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /opt/apps/parallel_studio/2017.4/compilers_and_libraries_2017.4.196/linux/mkl/lib/intel64_lin/libmkl_gf_lp64.so

locale: [1] C

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] LESYMAP_0.0.0.9221 ANTsR_0.5.6.0.0 ANTsRCore_0.7.4.6

loaded via a namespace (and not attached): [1] compiler_3.5.1 RcppEigen_0.3.3.7.0 magrittr_1.5
[4] Matrix_1.2-14 lmPerm_2.1.0 tools_3.5.1
[7] ITKR_0.5.3.2.0 Rcpp_1.0.4.6 grid_3.5.1
[10] packrat_0.4.9-3 lattice_0.20-35

dorianps commented 4 years ago

@stnava Looks like there are some issues with the stability of my sparseness optimization routine. I think this may be related to the cross-validation splits, which are stratified using a caret function as we used to do together some years ago. The behavior score is also unusual here, with many repeated values, and this may combine with a random seed that is not random enough in local runs. I will need to dig in to check whether that is the case, but before doing that I wanted to ask: does SCCAN itself have any random component in its internal iterations which might explain different convergence on different computers? Just to exclude the possibility that SCCAN results can vary with the same input, so I can focus on the CV issue.

brussj commented 4 years ago

Dorian-

Perhaps not needed anymore, but here are the settings for the build of LESYMAP that I'm using on the compute cluster. Everything is the same as Fatimah's except:

locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages: [1] stats graphics grDevices utils datasets methods base

other attached packages: [1] LESYMAP_0.0.0.9221 ANTsR_0.5.1 ANTsRCore_0.7.0

loaded via a namespace (and not attached): [1] compiler_3.5.1 RcppEigen_0.3.3.5.0 magrittr_1.5
[4] Matrix_1.2-14 lmPerm_2.1.0 tools_3.5.1
[7] ITKR_0.5.0 Rcpp_1.0.2 grid_3.5.1
[10] lattice_0.20-35

dorianps commented 4 years ago

@Fatimah-A how are those nFolds=6 runs going? It looked like the results were more similar with 6 folds. Do you want to give it a try with nFolds=10, cvRepetitions=3? In theory you can go even higher in nFolds, if you see things improving (up to your subject count, which amounts to leave-one-out CV).

Since Boston Naming did not show a dramatic difference in CVcorrelation, there seems to be something going on with the WRAT scores, or with their weak relationship with the lesions, which makes the optimization wander around more in its search for the right sparseness.

Fatimah-A commented 4 years ago

Both of my tests crashed today! They had been running for ~2 days but didn't finish. I'll have to re-run them. I'll try nFolds=10, cvRepetitions=3 with this run and report back to you once it's done.

Should I use Boston Naming for this run instead of WRAT?

Fatimah-A commented 4 years ago

Just to clarify: so LESYMAP results should vary, and the consistent results we are getting locally aren't reliable? Then what should we use for our analyses? Do we just run it multiple times and take an average map? The maps produced aren't 100% identical; you may get additional clusters in some runs and null findings in others.

Some of my tests (COWA & Complex Figure Test) produced strong results locally but a map with zero voxels when run on HPC, despite the small p-values.

COWA HPC:
Subjects: 729
patchinfo: not computed
Range statistic: 0 0
Suprathreshold voxels: 0
optimalSparseness: 0.236141291243375
CVcorrelation.stat: 0.491140073368936
CVcorrelation.pval: 1.58628677797537e-45

COWA Local:
Subjects: 729
patchinfo: not computed
Range statistic: -1 0
Suprathreshold voxels: 9508
Range rawWeights: -6.25421598670073e-05 0
Number of Clusters: 2
Cluster sizes (voxels): 6569 2939
Cluster sizes (mm3): 6569 2939
sccan.eig2: 0.0641700833118756
sccan.ccasummary: 0.46799367843235
sccan.behavior.scaleval: 13.3999941841002
sccan.behavior.centerval: 33.8285322359396
optimalSparseness: 0.131308230375284
CVcorrelation.stat: 0.4895113769876
CVcorrelation.pval: 3.41787556776849e-45

Complex Figure HPC:
Subjects: 727
patchinfo: not computed
Range statistic: 0 0
Suprathreshold voxels: 0
optimalSparseness: 0.0174127794547691
CVcorrelation.stat: 0.201468083141322
CVcorrelation.pval: 4.27490822519255e-08

Complex Figure local:
Subjects: 727
patchinfo: not computed
Range statistic: -1 0
Suprathreshold voxels: 440
Range rawWeights: -0.000841519562527537 2.2849098968436e-05
Number of Clusters: 2
Cluster sizes (voxels): 278 162
Cluster sizes (mm3): 278 162
sccan.eig2: 0.0644180813570621
sccan.ccasummary: 0.260212866534197
sccan.behavior.scaleval: 5.59237921525445
sccan.behavior.centerval: 29.6554332874828
optimalSparseness: -0.0194606878316287
CVcorrelation.stat: 0.230670327775946
CVcorrelation.pval: 3.09551957379597e-10

dorianps commented 4 years ago

The results may have tiny variability, but they should not be grossly unstable. When they change this much, it makes me think that one of the two sides (local or HPC) is not doing something right; either too much stability or too much variation is indicative of a problem.

The optimization routine I created was based on simulated (or real) data with good, strong signal. It uses a penalty function to keep the search away from high sparseness values, because with cross-validation there is a tendency for the system to use as many voxels as possible in an attempt to predict the unseen data well. The parameters I chose seemed optimal for the data I tested on. But for data that do not have good signal, the search for optimal sparseness may be looking for a maximum on a "flat" (or undulating) curve. That curve is not empirically derived or well assessed across various scenarios. The best approach would be a full empirical search of 50 sparseness values on each side (positive and negative), plotting what that search looks like for various real behavioral scores; it would also need a search over different nFolds and cvRepetitions values to investigate the stability of the findings. None of this empirical evaluation has been done yet.

This is why I suspect, from your reports here, that behavior with good signal (i.e., Boston Naming) is more stable, while behavior with bad signal may be more unstable. Usually this is not a problem, because bad signal results in a null map, and in that case you don't care whether the "best" sparseness was 0.05 or 0.9. But as the signal gets worse and the number of subjects increases, the same CVcorrelation is more likely to get closer to significance, and it may matter where the final sparseness ends up.

For now, I think the best path for you is to explore ways to get more stable results between HPC runs. Averaging SCCAN runs is not a proper solution because it is not based on a solid statistical rationale. These issues need more empirical research, which has been a need for other multivariate methods, too: it took years for the community to pinpoint the need for multiple comparison correction in SVR-LSM, and people still use it with a radial basis non-linear kernel despite some experts saying you cannot backproject those SVR weights to map voxels in the brain.
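As an aside, the cost printed in the search logs is consistent with a simple penalized criterion, which shows how the penalty pulls the optimum toward low absolute sparseness. This is inferred from the numbers in this thread, not copied from the LESYMAP source, so treat it as a sketch:

```r
# Inferred from the logs above: e.g., the first check of the first run,
# sparseness -0.212 with CV correlation 0.147 and sparsenessPenalty 0.03,
# gives 1 - 0.147 + 0.03 * 0.212 = 0.859, matching the printed cost.
cost <- function(cv.corr, sparseness, penalty = 0.03) {
  1 - cv.corr + penalty * abs(sparseness)
}
cost(0.147, -0.212)  # 0.85936
```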

Fatimah-A commented 4 years ago

Hi Dorian,

Here is what I got when I ran my test (WRAT) with nFolds=10, cvRepetitions=3 on HPC:

Trial 1:
Searching for optimal sparseness: lower/upper bound: -0.9 / 0.9
cvRepetitions: 3
nFolds: 10
sparsenessPenalty: 0.03
optim tolerance: 0.03
Checking sparseness -0.212 . . . CV correlation 0.153 (0.460) (cost=0.854)
Checking sparseness 0.212 . . . CV correlation 0.0881 (0.320) (cost=0.918)
Checking sparseness -0.475 . . . CV correlation 0.131 (0.467) (cost=0.883)
Checking sparseness -0.197 . . . CV correlation 0.154 (0.463) (cost=0.852)
Checking sparseness -0.041 . . . CV correlation 0.134 (0.428) (cost=0.867)
Checking sparseness -0.162 . . . CV correlation 0.162 (0.466) (cost=0.843)
Checking sparseness -0.116 . . . CV correlation 0.162 (0.453) (cost=0.841)
Checking sparseness -0.131 . . . CV correlation 0.161 (0.462) (cost=0.843)
Checking sparseness -0.087 . . . CV correlation 0.157 (0.449) (cost=0.845)
Checking sparseness -0.105 . . . CV correlation 0.157 (0.451) (cost=0.846)
Checking sparseness -0.116 . . . CV correlation 0.162 (0.453) (cost=0.841)
Found optimal sparsenes -0.116 (CV corr=0.162 p=0.0511)
WARNING: Poor cross-validated accuracy, returning NULL result.

Trial 2:
Searching for optimal sparseness: lower/upper bound: -0.9 / 0.9
cvRepetitions: 3
nFolds: 10
sparsenessPenalty: 0.03
optim tolerance: 0.03
Checking sparseness -0.212 . . . CV correlation 0.152 (0.457) (cost=0.854)
Checking sparseness 0.212 . . . CV correlation 0.0511 (0.307) (cost=0.955)
Checking sparseness -0.475 . . . CV correlation 0.126 (0.465) (cost=0.889)
Checking sparseness -0.222 . . . CV correlation 0.148 (0.456) (cost=0.859)
Checking sparseness -0.05 . . . CV correlation 0.150 (0.441) (cost=0.851)
Checking sparseness -0.127 . . . CV correlation 0.168 (0.449) (cost=0.836)
Checking sparseness -0.137 . . . CV correlation 0.158 (0.452) (cost=0.846)
Checking sparseness -0.096 . . . CV correlation 0.166 (0.445) (cost=0.837)
Checking sparseness -0.115 . . . CV correlation 0.159 (0.448) (cost=0.844)
Checking sparseness -0.127 . . . CV correlation 0.168 (0.449) (cost=0.836)
Found optimal sparsenes -0.127 (CV corr=0.168 p=0.044)
Calling SCCAN with:
Components: 1
Use ranks: 1
Sparseness: -0.127
Cluster threshold: 150
Smooth sigma: 0.4
Iterations: 20
maxBased: FALSE
directionalSCCAN: TRUE
optimizeSparseness: TRUE
validateSparseness: FALSE

Thanks!

dorianps commented 4 years ago

Very good, it looks like the outcome is more stable with more nFolds. Now, of course, you have different outcomes, one null and the other producing a map, but that is another story. If you want to get even more stable results, and to decide with more certainty which one to believe, increase cvRepetitions to 10 and wait a few days to see 2+ HPC results. I imagine that will produce even more stability.

The alpha = 0.05 threshold is widely accepted, but it is not a magic value. If you suspect there is something in the data at a more lenient threshold which you think is worth telling the community, you can still run sccan at p=0.1. The only thing that matters is to be honest in the publication and explain how you discovered the borderline signal and why you want to report it. If you decide to report exploratory p=0.1 results, it might be a good idea to check univariate at that threshold, too. Maybe this is unsolicited advice; I guess the message is simply that the statistical choices are yours to make. This is clearly an analysis with weak signal. Once you achieve stability with sccan, our discussion thread is done here. If you find something useful for other researchers to know (i.e., whether it is worth increasing nFolds or cvRepetitions), please note it in the publication or here, so others can learn from your experience/conclusions.

Fatimah-A commented 4 years ago

Hi Dorian,

I tried increasing cvRepetitions to 10 and ran it 6 different times, but it kept crashing before producing any results. I then ran it with cvRepetitions of 6 and nFolds of 10. It took about a day; one test completed and the other crashed when it was close to being done. Here is what I got:

WRAT-Trial 1 (HPC):
Searching for optimal sparseness: lower/upper bound: -0.9 / 0.9
cvRepetitions: 6
nFolds: 10
sparsenessPenalty: 0.03
optim tolerance: 0.03
11:42:58 Checking sparseness -0.212 . . . . . . CV correlation 0.182 (0.455) (cost=0.824)
14:15:38 Checking sparseness 0.212 . . . . . . CV correlation 0.0452 (0.306) (cost=0.961)
15:17:29 Checking sparseness -0.475 . . . . . . CV correlation 0.162 (0.464) (cost=0.852)
17:48:17 Checking sparseness -0.258 . . . . . . CV correlation 0.179 (0.459) (cost=0.828)
20:07:11 Checking sparseness -0.05 . . . . . . CV correlation 0.163 (0.447) (cost=0.838)
22:26:11 Checking sparseness -0.181 . . . . . . CV correlation 0.183 (0.459) (cost=0.823)
00:44:08 Checking sparseness -0.171 . . . . . . CV correlation 0.180 (0.458) (cost=0.825)
03:05:23 Checking sparseness -0.191 . . . . . . CV correlation 0.183 (0.455) (cost=0.823)
05:26:06 Checking sparseness -0.201 . . . . . . CV correlation 0.186 (0.456) (cost=0.820)
07:46:16 Checking sparseness -0.201 . . . . . . CV correlation 0.186 (0.456) (cost=0.820)
Found optimal sparsenes -0.201 (CV corr=0.186 p=0.0252)
Calling SCCAN with:
Components: 1
Use ranks: 1
Sparseness: -0.201
Cluster threshold: 150
Smooth sigma: 0.4
Iterations: 20
maxBased: FALSE
directionalSCCAN: TRUE
optimizeSparseness: TRUE
validateSparseness: FALSE

WRAT-Trial 2 (HPC):
Searching for optimal sparseness: lower/upper bound: -0.9 / 0.9
cvRepetitions: 6
nFolds: 10
sparsenessPenalty: 0.03
optim tolerance: 0.03
11:43:18 Checking sparseness -0.212 . . . . . . CV correlation 0.135 (0.451) (cost=0.872)
13:57:14 Checking sparseness 0.212 . . . . . . CV correlation 0.0496 (0.313) (cost=0.957)
14:52:50 Checking sparseness -0.475 . . . . . . CV correlation 0.117 (0.463) (cost=0.897)
17:10:39 Checking sparseness -0.232 . . . . . . CV correlation 0.133 (0.450) (cost=0.874)
19:23:48 Checking sparseness -0.05 . . . . . . CV correlation 0.124 (0.443) (cost=0.877)
21:32:57 Checking sparseness -0.152 . . . . . . CV correlation 0.142 (0.450) (cost=0.862)
23:44:37 Checking sparseness -0.113 . . . . . . CV correlation 0.142 (0.447) (cost=0.862)
01:55:27 Checking sparseness -0.13 . . . . . . CV correlation 0.139 (0.451) (cost=0.865)
04:06:37 Checking sparseness -0.089 . . . . . . CV correlation 0.154 (0.445) (cost=0.848)
06:16:46 Checking sparseness -0.074 . . . . . . CV correlation 0.145 (0.439) (cost=0.857)
08:26:30 Checking sparseness -0.099 . . . . . . CV correlation 0.145 (0.446) (cost=0.858)
10:37:00 Checking sparseness -0.089 . . . .Connection to argon-itf-bx47-25.hpc closed by remote host.
Connection to argon-itf-bx47-25.hpc closed.

It doesn't seem ideal to increase the number of repetitions, as it takes forever to finish or crashes halfway through. I might be able to set the sparseness to -0.089 manually for the run that crashed, but given that our HPC gives different results each time, that won't be very useful.

I tried the Boston Naming Test, which has a very robust finding, as perhaps a better probe of the variability we are seeing between HPC and local runs. I ran into another problem: the findings are null despite the small p-value when cvRepetitions is set to 3 (3 different runs gave null results). Do you have an idea why this is happening? Increasing cvRepetitions to 6 gave pretty much identical results for 2 runs (I would say the maps are ~98-99% identical), which, from my understanding, is how it should be? However, these 2 maps are only ~93-95% similar to the map produced locally. If I were to choose based on preference, I would go with the local map, as it's cleaner and more focal, with 3 clusters as opposed to 5 on HPC. How do I determine which one is the valid map to use?

Here are my Boston Naming results for your reference (N=620):

Boston Naming, cvRepetitions =3, Trial 1 (HPC):
Searching for optimal sparseness: lower/upper bound: -0.9 / 0.9
cvRepetitions: 3
nFolds: 4
sparsenessPenalty: 0.03
optim tolerance: 0.03
08:29:13 Checking sparseness -0.212 . . . CV correlation 0.502 (0.383) (cost=0.504)
09:46:10 Checking sparseness 0.212 . . . CV correlation 0.502 (0.379) (cost=0.505)
10:53:29 Checking sparseness -0.475 . . . CV correlation 0.515 (0.378) (cost=0.499)
11:48:22 Checking sparseness -0.637 . . . CV correlation 0.523 (0.371) (cost=0.497)
12:45:01 Checking sparseness -0.738 . . . CV correlation 0.524 (0.368) (cost=0.498)
13:42:03 Checking sparseness -0.622 . . . CV correlation 0.522 (0.371) (cost=0.496)
14:38:00 Checking sparseness -0.566 . . . CV correlation 0.522 (0.373) (cost=0.495)
15:34:40 Checking sparseness -0.531 . . . CV correlation 0.519 (0.374) (cost=0.497)
16:32:00 Checking sparseness -0.587 . . . CV correlation 0.522 (0.372) (cost=0.495)
17:26:45 Checking sparseness -0.556 . . . CV correlation 0.520 (0.373) (cost=0.496)
18:23:43 Checking sparseness -0.576 . . . CV correlation 0.522 (0.373) (cost=0.495)
19:19:55 Checking sparseness -0.566 . . . CV correlation 0.522 (0.373) (cost=0.495)
Found optimal sparsenes -0.566 (CV corr=0.522 p=1.3e-44)
WARNING: Poor cross-validated accuracy, returning NULL result.

Boston Naming, cvRepetitions =3, Trial 2 (HPC):
Searching for optimal sparseness: lower/upper bound: -0.9 / 0.9
cvRepetitions: 3
nFolds: 4
sparsenessPenalty: 0.03
optim tolerance: 0.03
08:29:16 Checking sparseness -0.212 . . . CV correlation 0.506 (0.387) (cost=0.501)
09:43:39 Checking sparseness 0.212 . . . CV correlation 0.509 (0.379) (cost=0.497)
10:50:14 Checking sparseness 0.475 . . . CV correlation 0.538 (0.347) (cost=0.476)
11:40:54 Checking sparseness 0.637 . . . CV correlation 0.538 (0.342) (cost=0.481)
12:40:06 Checking sparseness 0.502 . . . CV correlation 0.539 (0.347) (cost=0.476)
13:30:52 Checking sparseness 0.457 . . . CV correlation 0.538 (0.347) (cost=0.475)
14:21:11 Checking sparseness 0.364 . . . CV correlation 0.523 (0.372) (cost=0.487)
15:11:13 Checking sparseness 0.422 . . . CV correlation 0.537 (0.361) (cost=0.476)
16:01:12 Checking sparseness 0.446 . . . CV correlation 0.539 (0.349) (cost=0.475)
16:51:58 Checking sparseness 0.436 . . . CV correlation 0.537 (0.354) (cost=0.476)
17:41:27 Checking sparseness 0.446 . . . CV correlation 0.539 (0.349) (cost=0.475)
Found optimal sparsenes 0.446 (CV corr=0.539 p=6.14e-48)
WARNING: Poor cross-validated accuracy, returning NULL result.

Boston Naming, cvRepetitions =6, Trial 1 (HPC):
Searching for optimal sparseness: lower/upper bound: -0.9 / 0.9
cvRepetitions: 6
nFolds: 4
sparsenessPenalty: 0.03
optim tolerance: 0.03
09:34:52 Checking sparseness -0.212 . . . . . . CV correlation 0.515 (0.382) (cost=0.491)
11:31:45 Checking sparseness 0.212 . . . . . . CV correlation 0.515 (0.376) (cost=0.492)
13:22:06 Checking sparseness -0.475 . . . . . . CV correlation 0.526 (0.376) (cost=0.488)
14:54:06 Checking sparseness -0.637 . . . . . . CV correlation 0.531 (0.370) (cost=0.488)
16:24:09 Checking sparseness -0.587 . . . . . . CV correlation 0.530 (0.372) (cost=0.487)
17:55:19 Checking sparseness -0.563 . . . . . . CV correlation 0.530 (0.373) (cost=0.487)
19:26:58 Checking sparseness -0.553 . . . . . . CV correlation 0.529 (0.373) (cost=0.487)
20:58:42 Checking sparseness -0.573 . . . . . . CV correlation 0.530 (0.373) (cost=0.487)
22:30:31 Checking sparseness -0.573 . . . . . . CV correlation 0.530 (0.373) (cost=0.487)
Found optimal sparsenes -0.573 (CV corr=0.53 p=3.37e-46)
Calling SCCAN with:
Components: 1
Use ranks: 1
Sparseness: -0.573
Cluster threshold: 150
Smooth sigma: 0.4
Iterations: 20
maxBased: FALSE
directionalSCCAN: TRUE
optimizeSparseness: TRUE
validateSparseness: FALSE

Boston Naming, cvRepetitions =6, Trial 2 (HPC):
Searching for optimal sparseness: lower/upper bound: -0.9 / 0.9
cvRepetitions: 6
nFolds: 4
sparsenessPenalty: 0.03
optim tolerance: 0.03
09:35:39 Checking sparseness -0.212 . . . . . . CV correlation 0.505 (0.379) (cost=0.501)
11:17:00 Checking sparseness 0.212 . . . . . . CV correlation 0.505 (0.378) (cost=0.501)
12:53:28 Checking sparseness -0.475 . . . . . . CV correlation 0.523 (0.375) (cost=0.492)
14:22:00 Checking sparseness -0.637 . . . . . . CV correlation 0.529 (0.369) (cost=0.490)
15:44:55 Checking sparseness -0.616 . . . . . . CV correlation 0.529 (0.370) (cost=0.490)
17:07:06 Checking sparseness -0.578 . . . . . . CV correlation 0.528 (0.372) (cost=0.490)
18:30:22 Checking sparseness -0.538 . . . . . . CV correlation 0.526 (0.373) (cost=0.490)
19:54:48 Checking sparseness -0.568 . . . . . . CV correlation 0.527 (0.372) (cost=0.490)
21:17:35 Checking sparseness -0.588 . . . . . . CV correlation 0.528 (0.371) (cost=0.490)
22:39:38 Checking sparseness -0.578 . . . . . . CV correlation 0.528 (0.372) (cost=0.490)
Found optimal sparsenes -0.578 (CV corr=0.528 p=9.27e-46)
Calling SCCAN with:
Components: 1
Use ranks: 1
Sparseness: -0.578
Cluster threshold: 150
Smooth sigma: 0.4
Iterations: 20
maxBased: FALSE
directionalSCCAN: TRUE
optimizeSparseness: TRUE
validateSparseness: FALSE

Boston Naming, cvRepetitions =3, Trials 1, 2 and 3 (Local):
Searching for optimal sparseness: lower/upper bound: -0.9 / 0.9
cvRepetitions: 3
nFolds: 4
sparsenessPenalty: 0.03
optim tolerance: 0.03
15:29:08 Checking sparseness -0.212 . . . CV correlation 0.512 (0.386) (cost=0.494)
17:22:26 Checking sparseness 0.212 . . . CV correlation 0.510 (0.377) (cost=0.496)
19:10:50 Checking sparseness -0.475 . . . CV correlation 0.525 (0.375) (cost=0.490)
20:37:31 Checking sparseness -0.637 . . . CV correlation 0.528 (0.370) (cost=0.491)
22:01:46 Checking sparseness -0.488 . . . CV correlation 0.524 (0.375) (cost=0.491)
23:27:20 Checking sparseness -0.375 . . . CV correlation 0.523 (0.381) (cost=0.488)
01:03:25 Checking sparseness -0.411 . . . CV correlation 0.523 (0.378) (cost=0.490)
02:36:02 Checking sparseness -0.313 . . . CV correlation 0.520 (0.383) (cost=0.489)
04:15:20 Checking sparseness -0.385 . . . CV correlation 0.523 (0.379) (cost=0.489)
05:50:52 Checking sparseness -0.35 . . . CV correlation 0.522 (0.381) (cost=0.488)
07:28:28 Checking sparseness -0.365 . . . CV correlation 0.523 (0.381) (cost=0.488)
09:05:48 Checking sparseness -0.365 . . . CV correlation 0.523 (0.381) (cost=0.488)
Found optimal sparsenes -0.365 (CV corr=0.523 p=7.39e-45)
Calling SCCAN with:
Components: 1
Use ranks: 1
Sparseness: -0.365
Cluster threshold: 150
Smooth sigma: 0.4
Iterations: 20
maxBased: FALSE
directionalSCCAN: TRUE
optimizeSparseness: TRUE
validateSparseness: FALSE

dorianps commented 4 years ago

Looks like WRAT is still not doing ok, with substantial instability. BNT also seems to have flipped to positive sparseness once. The null result you got, I have no idea why that happened. Make sure you are using the latest lesymap, just in case. Then set the sparseness manually to 0.446 to see if you can reproduce the error. If you can reproduce it, that is probably a bug.

About the instability, the BNT results tell me that the strategy of searching positive and negative sparseness at once may be an issue. I have explained that potential problem in the Wiki. Can you please run BNT four times, two identical trials for each of these two sparseness searches (a sketch of the calls is below the list):

  1. Search positive sparseness 0.005 to 0.9
  2. Search negative sparseness -0.9 to -0.005

The aim is to see whether searching separately for positive or negative sparseness mitigates an early suboptimal choice. If you look above, the optimization routine tries the positive side only once, and if it finds a lower CVcorrelation there than on the corresponding negative side, it switches back to negative and keeps looking there.
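In lesymap() terms, the four runs (plus the fixed-sparseness check from the previous paragraph) would look roughly like this; a sketch, with parameter names as they are printed in the logs above and placeholder inputs:

```r
library(LESYMAP)

# Reproduce the suspected bug with the sparseness fixed at 0.446:
fit.fixed <- lesymap(lesions, behavior, method = 'sccan',
                     sparseness = 0.446,
                     optimizeSparseness = FALSE, validateSparseness = TRUE)

# 1. Positive-side search (run twice):
fit.pos <- lesymap(lesions, behavior, method = 'sccan',
                   lowerSparseness = 0.005, upperSparseness = 0.9)

# 2. Negative-side search (run twice):
fit.neg <- lesymap(lesions, behavior, method = 'sccan',
                   lowerSparseness = -0.9, upperSparseness = -0.005)
```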

I should have more time next week to look deeper into these issues.

P.s. The crash you see may be coming from reaching the runtime/memory limit in the HPC.

Fatimah-A commented 4 years ago

Thank you so much Dorian! I do have the latest version (here is my version copied from the info.txt file): LesymapVersion: 0.0.0.9221

I have the tests running now, I'll report back when they are done.

Yes, it looks like I was reaching my runtime on HPC, will work on getting that fixed, thanks!

Fatimah-A commented 4 years ago

Hi Dorian,

Here is what I got for my runs:

BNT - Sparseness set to 0.446:
Validating sparseness:
cvRepetitions: 3
nFolds: 4
12:01:40 Checking sparseness 0.446 . . . CV correlation 0.528 (0.352) (cost=0.485)
Validated sparseness 0.446 (CV corr=0.528 p=6.89e-46)
Calling SCCAN with:
Components: 1
Use ranks: 1
Sparseness: 0.446
Cluster threshold: 150
Smooth sigma: 0.4
Iterations: 20
maxBased: FALSE
directionalSCCAN: TRUE
optimizeSparseness: FALSE
validateSparseness: TRUE

Info.txt (for some reason, when the sparseness is set to a fixed number, it doesn't save the r- and p-values in the text file):
sparseness: 0.446
Subjects: 620
patchinfo: not computed
Range statistic: -1 0
Suprathreshold voxels: 77598
Range rawWeights: -7.46522391636972e-06 0
Number of Clusters: 2
Cluster sizes (voxels): 68274 9324
Cluster sizes (mm3): 68274 9324
sccan.eig2: 0.0697336813405535
sccan.ccasummary: 0.341933004007707
sccan.behavior.scaleval: 11.8075417249454
sccan.behavior.centerval: 51.0193548387097

BNT - positive sparseness 0.005 to 0.9/Run 1:
Searching for optimal sparseness: lower/upper bound: 0.005 / 0.9
cvRepetitions: 3
nFolds: 4
sparsenessPenalty: 0.03
optim tolerance: 0.03
12:06:34 Checking sparseness 0.347 . . . CV correlation 0.517 (0.371) (cost=0.494)
12:47:09 Checking sparseness 0.558 . . . CV correlation 0.535 (0.343) (cost=0.482)
13:28:52 Checking sparseness 0.689 . . . CV correlation 0.534 (0.342) (cost=0.487)
14:19:54 Checking sparseness 0.548 . . . CV correlation 0.535 (0.343) (cost=0.481)
15:01:16 Checking sparseness 0.471 . . . CV correlation 0.534 (0.346) (cost=0.480)
15:41:44 Checking sparseness 0.497 . . . CV correlation 0.534 (0.344) (cost=0.481)
16:23:28 Checking sparseness 0.424 . . . CV correlation 0.526 (0.360) (cost=0.487)
17:04:08 Checking sparseness 0.453 . . . CV correlation 0.534 (0.348) (cost=0.480)
17:44:49 Checking sparseness 0.442 . . . CV correlation 0.533 (0.350) (cost=0.480)
18:25:11 Checking sparseness 0.453 . . . CV correlation 0.534 (0.348) (cost=0.480)
Found optimal sparsenes 0.453 (CV corr=0.534 p=5.98e-47)
Calling SCCAN with:
Components: 1
Use ranks: 1
Sparseness: 0.453
Cluster threshold: 150
Smooth sigma: 0.4
Iterations: 20
maxBased: FALSE
directionalSCCAN: TRUE
optimizeSparseness: TRUE
validateSparseness: FALSE

Info.txt:
lowerSparseness: 0.005
upperSparseness: 0.9
Subjects: 620
patchinfo: not computed
Range statistic: -1 0
Suprathreshold voxels: 77632
Range rawWeights: -7.45273837310378e-06 0
Number of Clusters: 2
Cluster sizes (voxels): 68308 9324
Cluster sizes (mm3): 68308 9324
sccan.eig2: 0.0697336813405536
sccan.ccasummary: 0.341872382827324
sccan.behavior.scaleval: 11.8075417249454
sccan.behavior.centerval: 51.0193548387097
optimalSparseness: 0.453108493989432
CVcorrelation.stat: 0.533759473118074
CVcorrelation.pval: 5.97961102407295e-47

BNT - positive sparseness 0.005 to 0.9/Run 2:
lower/upper bound: 0.005 / 0.9
cvRepetitions: 3
nFolds: 4
sparsenessPenalty: 0.03
optim tolerance: 0.03
12:07:36 Checking sparseness 0.347 . . . CV correlation 0.517 (0.373) (cost=0.493)
12:55:26 Checking sparseness 0.558 . . . CV correlation 0.536 (0.345) (cost=0.480)
13:44:36 Checking sparseness 0.689 . . . CV correlation 0.535 (0.343) (cost=0.485)
14:46:37 Checking sparseness 0.548 . . . CV correlation 0.536 (0.345) (cost=0.480)
15:35:44 Checking sparseness 0.471 . . . CV correlation 0.536 (0.350) (cost=0.479)
16:23:24 Checking sparseness 0.424 . . . CV correlation 0.529 (0.365) (cost=0.484)
17:10:55 Checking sparseness 0.501 . . . CV correlation 0.537 (0.347) (cost=0.478)
17:58:38 Checking sparseness 0.491 . . . CV correlation 0.537 (0.348) (cost=0.477)
18:46:49 Checking sparseness 0.491 . . . CV correlation 0.537 (0.348) (cost=0.477)
Found optimal sparsenes 0.491 (CV corr=0.537 p=1.1e-47)
Calling SCCAN with:
Components: 1
Use ranks: 1
Sparseness: 0.491
Cluster threshold: 150
Smooth sigma: 0.4
Iterations: 20
maxBased: FALSE
directionalSCCAN: TRUE
optimizeSparseness: TRUE
validateSparseness: FALSE

Info.txt:
lowerSparseness: 0.005
upperSparseness: 0.9
Subjects: 620
patchinfo: not computed
Range statistic: -1 0
Suprathreshold voxels: 80936
Range rawWeights: -6.33512445347151e-06 0
Number of Clusters: 3
Cluster sizes (voxels): 71232 9325 379
Cluster sizes (mm3): 71232 9325 379
sccan.eig2: 0.0697336813405537
sccan.ccasummary: 0.337781064429359
sccan.behavior.scaleval: 11.8075417249454
sccan.behavior.centerval: 51.0193548387097
optimalSparseness: 0.490624445341938
CVcorrelation.stat: 0.537384427980795
CVcorrelation.pval: 1.10438055798893e-47

BNT - negative sparseness -0.9 to -0.005/Run 1:
Searching for optimal sparseness: lower/upper bound: -0.9 / -0.005
cvRepetitions: 3
nFolds: 4
sparsenessPenalty: 0.03
optim tolerance: 0.03
12:10:17 Checking sparseness -0.558 . . . CV correlation 0.515 (0.371) (cost=0.501)
13:05:05 Checking sparseness -0.347 . . . CV correlation 0.512 (0.375) (cost=0.499)
14:09:59 Checking sparseness -0.216 . . . CV correlation 0.498 (0.376) (cost=0.509)
15:21:24 Checking sparseness -0.428 . . . CV correlation 0.516 (0.374) (cost=0.497)
16:20:58 Checking sparseness -0.418 . . . CV correlation 0.515 (0.373) (cost=0.498)
17:20:30 Checking sparseness -0.478 . . . CV correlation 0.514 (0.373) (cost=0.501)
18:18:03 Checking sparseness -0.447 . . . CV correlation 0.516 (0.374) (cost=0.497)
19:16:46 Checking sparseness -0.428 . . . CV correlation 0.516 (0.374) (cost=0.497)
Found optimal sparsenes -0.428 (CV corr=0.516 p=1.9e-43)
Calling SCCAN with:
Components: 1
Use ranks: 1
Sparseness: -0.428
Cluster threshold: 150
Smooth sigma: 0.4
Iterations: 20
maxBased: FALSE
directionalSCCAN: TRUE
optimizeSparseness: TRUE
validateSparseness: FALSE

Info.txt:
lowerSparseness: -0.9
upperSparseness: -0.005
Subjects: 620
patchinfo: not computed
Range statistic: -1 0
Suprathreshold voxels: 34095
Range rawWeights: -2.33296559599694e-05 4.05595301344874e-06
Number of Clusters: 2
Cluster sizes (voxels): 27452 6643
Cluster sizes (mm3): 27452 6643
sccan.eig2: 0.0697336813405536
sccan.ccasummary: 0.363411816702033
sccan.behavior.scaleval: 11.8075417249454
sccan.behavior.centerval: 51.0193548387097
optimalSparseness: -0.428456165429979
CVcorrelation.stat: 0.515817206752373
CVcorrelation.pval: 1.9038115591633e-43

BNT - negative sparseness -0.9 to -0.005/Run 2:
lower/upper bound: -0.9 / -0.005
cvRepetitions: 3
nFolds: 4
sparsenessPenalty: 0.03
optim tolerance: 0.03
12:11:03 Checking sparseness -0.558 . . . CV correlation 0.522 (0.372) (cost=0.495)
13:09:50 Checking sparseness -0.347 . . . CV correlation 0.511 (0.379) (cost=0.500)
14:18:01 Checking sparseness -0.689 . . . CV correlation 0.525 (0.367) (cost=0.496)
15:13:09 Checking sparseness -0.577 . . . CV correlation 0.523 (0.371) (cost=0.494)
16:11:02 Checking sparseness -0.61 . . . CV correlation 0.524 (0.370) (cost=0.494)
17:07:09 Checking sparseness -0.59 . . . CV correlation 0.523 (0.371) (cost=0.494)
18:04:47 Checking sparseness -0.6 . . . CV correlation 0.524 (0.370) (cost=0.494)
19:01:58 Checking sparseness -0.6 . . . CV correlation 0.524 (0.370) (cost=0.494)
Found optimal sparsenes -0.6 (CV corr=0.524 p=5.58e-45)
Calling SCCAN with:
Components: 1
Use ranks: 1
Sparseness: -0.6
Cluster threshold: 150
Smooth sigma: 0.4
Iterations: 20
maxBased: FALSE
directionalSCCAN: TRUE
optimizeSparseness: TRUE
validateSparseness: FALSE

Info.txt:
lowerSparseness: -0.9
upperSparseness: -0.005
Subjects: 620
patchinfo: not computed
Range statistic: -1 0.38
Suprathreshold voxels: 61514
Range rawWeights: -1.22264154924778e-05 4.60865703644231e-06
Number of Clusters: 5
Cluster sizes (voxels): 48974 9037 1644 1424 435
Cluster sizes (mm3): 48974 9037 1644 1424 435
sccan.eig2: 0.0697336813405537
sccan.ccasummary: 0.360029964646957
sccan.behavior.scaleval: 11.8075417249454
sccan.behavior.centerval: 51.0193548387097
optimalSparseness: -0.599815574318216
CVcorrelation.stat: 0.523802735585345
CVcorrelation.pval: 5.57703580587344e-45

All runs were significant and produced some results. The maps aren't dramatically different, but some are cleaner than others, with the number of clusters ranging from 2 to 5. The negative-sparseness search in the first run produced a noticeably smaller (cleaner and more focal) map than the second run with the same settings.

dorianps commented 4 years ago

Thank you for trying this @Fatimah-A. The results show that BNT is more stable and has a higher CVcorrelation with the positive sparseness search. This means that the previous BNT searches you showed me, which frequently converged on the negative side, are not the optimal ones; you should use the positive results.

May I suggest you now try the same for WRAT, maybe with nFolds=6, cvRepetitions=4. In principle, the sparseness that produces the best CVcorrelation is the one to use. The optimization I created seemed to work well when searching negative and positive sparseness together on synthetic data, but reality seems more complex with real scores, so it's better to search the two sides separately for now.

Fatimah-A commented 4 years ago

Sure, I'll run that and report back to you once it's done!

Thanks!

Fatimah-A commented 4 years ago

Hi Dorian,

Here are my WRAT results (3 null and one significant):

WRAT - positive sparseness 0.005 to 0.9/Run 1:
Searching for optimal sparseness: lower/upper bound: 0.005 / 0.9
cvRepetitions: 4
nFolds: 6
sparsenessPenalty: 0.03
optim tolerance: 0.03
12:02:09 Checking sparseness 0.347 . . . . CV correlation 0.0703 (0.298) (cost=0.940)
12:20:57 Checking sparseness 0.558 . . . . CV correlation 0.0756 (0.269) (cost=0.941)
12:51:23 Checking sparseness 0.216 . . . . CV correlation 0.0616 (0.320) (cost=0.945)
13:13:08 Checking sparseness 0.433 . . . . CV correlation 0.0743 (0.282) (cost=0.939)
13:37:38 Checking sparseness 0.443 . . . . CV correlation 0.0741 (0.282) (cost=0.939)
14:02:29 Checking sparseness 0.402 . . . . CV correlation 0.0735 (0.289) (cost=0.939)
14:23:33 Checking sparseness 0.381 . . . . CV correlation 0.0732 (0.292) (cost=0.938)
14:42:32 Checking sparseness 0.368 . . . . CV correlation 0.0716 (0.294) (cost=0.939)
15:01:20 Checking sparseness 0.391 . . . . CV correlation 0.0729 (0.291) (cost=0.939)
15:20:47 Checking sparseness 0.381 . . . . CV correlation 0.0732 (0.292) (cost=0.938)
Found optimal sparsenes 0.381 (CV corr=0.073 p=0.382)
WARNING: Poor cross-validated accuracy, returning NULL result.

Info.txt:
nFolds: 6
cvRepetitions: 4
lowerSparseness: 0.005
upperSparseness: 0.9
Subjects: 145
patchinfo: not computed
Range statistic: 0 0
Suprathreshold voxels: 0
optimalSparseness: 0.380950504694875
CVcorrelation.stat: 0.0732027600636482
CVcorrelation.pval: 0.381561163114657

WRAT - positive sparseness 0.005 to 0.9/Run 2:
Searching for optimal sparseness: lower/upper bound: 0.005 / 0.9
cvRepetitions: 4
nFolds: 6
sparsenessPenalty: 0.03
optim tolerance: 0.03
12:03:01 Checking sparseness 0.347 . . . . CV correlation 0.0381 (0.298) (cost=0.972)
12:27:01 Checking sparseness 0.558 . . . . CV correlation 0.0355 (0.270) (cost=0.981)
13:05:07 Checking sparseness 0.216 . . . . CV correlation 0.049 (0.315) (cost=0.958)
13:30:24 Checking sparseness 0.136 . . . . CV correlation 0.0551 (0.322) (cost=0.949)
14:00:13 Checking sparseness 0.086 . . . . CV correlation 0.0624 (0.325) (cost=0.940)
14:29:22 Checking sparseness 0.055 . . . . CV correlation 0.0563 (0.322) (cost=0.945)
15:05:45 Checking sparseness 0.096 . . . . CV correlation 0.0639 (0.324) (cost=0.939)
15:36:39 Checking sparseness 0.111 . . . . CV correlation 0.0584 (0.324) (cost=0.945)
16:06:18 Checking sparseness 0.096 . . . . CV correlation 0.0639 (0.324) (cost=0.939)
Found optimal sparsenes 0.096 (CV corr=0.064 p=0.445)
WARNING: Poor cross-validated accuracy, returning NULL result.

Info.txt:
nFolds: 6
cvRepetitions: 4
lowerSparseness: 0.005
upperSparseness: 0.9
Subjects: 145
patchinfo: not computed
Range statistic: 0 0
Suprathreshold voxels: 0
optimalSparseness: 0.0957021009328403
CVcorrelation.stat: 0.0638832539116553
CVcorrelation.pval: 0.445237713938646

WRAT - negative sparseness -0.9 to -0.005/Run 1:
Searching for optimal sparseness: lower/upper bound: -0.9 / -0.005
cvRepetitions: 4
nFolds: 6
sparsenessPenalty: 0.03
optim tolerance: 0.03
12:03:36 Checking sparseness -0.558 . . . . CV correlation 0.0877 (0.461) (cost=0.929)
13:05:12 Checking sparseness -0.347 . . . . CV correlation 0.0836 (0.460) (cost=0.927)
14:05:16 Checking sparseness -0.216 . . . . CV correlation 0.108 (0.468) (cost=0.899)
15:04:52 Checking sparseness -0.136 . . . . CV correlation 0.118 (0.460) (cost=0.887)
16:04:07 Checking sparseness -0.086 . . . . CV correlation 0.114 (0.454) (cost=0.889)
17:03:15 Checking sparseness -0.125 . . . . CV correlation 0.117 (0.458) (cost=0.887)
18:02:43 Checking sparseness -0.166 . . . . CV correlation 0.117 (0.461) (cost=0.888)
19:02:10 Checking sparseness -0.146 . . . . CV correlation 0.117 (0.460) (cost=0.887)
20:01:40 Checking sparseness -0.136 . . . . CV correlation 0.118 (0.460) (cost=0.887)
Found optimal sparsenes -0.136 (CV corr=0.118 p=0.159)
WARNING: Poor cross-validated accuracy, returning NULL result.

Info.txt:
nFolds: 6
cvRepetitions: 4
lowerSparseness: -0.9
upperSparseness: -0.005
Subjects: 145
patchinfo: not computed
Range statistic: 0 0
Suprathreshold voxels: 0
optimalSparseness: -0.135578740206532
CVcorrelation.stat: 0.117546958951183
CVcorrelation.pval: 0.159104865095695

WRAT - negative sparseness -0.9 to -0.005/Run 2:
Searching for optimal sparseness: lower/upper bound: -0.9 / -0.005
cvRepetitions: 4
nFolds: 6
sparsenessPenalty: 0.03
optim tolerance: 0.03
12:04:29 Checking sparseness -0.558 . . . . CV correlation 0.168 (0.479) (cost=0.849)
13:11:27 Checking sparseness -0.347 . . . . CV correlation 0.177 (0.470) (cost=0.833)
14:17:10 Checking sparseness -0.216 . . . . CV correlation 0.184 (0.470) (cost=0.823)
15:22:23 Checking sparseness -0.136 . . . . CV correlation 0.191 (0.455) (cost=0.813)
16:27:56 Checking sparseness -0.086 . . . . CV correlation 0.193 (0.451) (cost=0.810)
17:32:13 Checking sparseness -0.055 . . . . CV correlation 0.185 (0.445) (cost=0.817)
18:37:04 Checking sparseness -0.101 . . . . CV correlation 0.196 (0.450) (cost=0.807)
19:41:18 Checking sparseness -0.111 . . . . CV correlation 0.196 (0.452) (cost=0.807)
20:40:23 Checking sparseness -0.121 . . . . CV correlation 0.195 (0.454) (cost=0.808)
21:31:43 Checking sparseness -0.111 . . . . CV correlation 0.196 (0.452) (cost=0.807)
Found optimal sparsenes -0.111 (CV corr=0.196 p=0.0181)
Calling SCCAN with:
Components: 1
Use ranks: 1
Sparseness: -0.111
Cluster threshold: 150
Smooth sigma: 0.4
Iterations: 20
maxBased: FALSE
directionalSCCAN: TRUE
optimizeSparseness: TRUE
validateSparseness: FALSE

Info.txt:
nFolds: 6
cvRepetitions: 4
lowerSparseness: -0.9
upperSparseness: -0.005
Subjects: 145
patchinfo: not computed
Range statistic: -1 0.71
Suprathreshold voxels: 3092
Range rawWeights: -0.000614426215179265 0.000433238747064024
Number of Clusters: 5
Cluster sizes (voxels): 1346 648 568 314 216
Cluster sizes (mm3): 1346 648 568 314 216
sccan.eig2: 0.143938623444244
sccan.ccasummary: 0.415332928581459
sccan.behavior.scaleval: 12.6517153138672
sccan.behavior.centerval: 99.2275862068965
optimalSparseness: -0.111142890761625
CVcorrelation.stat: 0.196113837140819
CVcorrelation.pval: 0.0180744726310242

dorianps commented 4 years ago

Thanks @Fatimah-A

Still doesn't look good. I have the WRAT scores from a post above. Can you send me the lesion maps and the BNT scores, too, so I can run a couple of tests?

I don't have time for extensive tests, but will need to check the stability of a couple of processing steps.

Fatimah-A commented 4 years ago

Sure, I sent you an email with a download link.

Please let me know if you haven't received it or if there is anything else you want me to run.

Thanks a lot!

Aaron-Boes commented 3 years ago

Hi Dorian,

I hope this email finds you well. I wanted to follow up on this open GitHub question that Fatimah posted, which we corresponded about back in June.

Did you ever get a chance to evaluate the stability of lesymap with WRAT or Boston Naming Task performance? Any additional feedback would be much appreciated.

Thanks, Aaron


dorianps commented 3 years ago

Aaron, apologies, but I have not worked on it yet. Between regular work duties and the limited working-from-home time, this has fallen lower on the priority list. It definitely needs to be checked more thoroughly, and it has been in the back of my mind for quite some time, but it also needs a few dedicated days to find out what is going on.

There are two possibilities I foresee might be the reason for instabilities:

  1. The SCCAN algorithm is not 100% reproducible.
  2. The (semi) random allocation of subjects in cross-validation folds is less random than it should be.

I was hoping @stnava could help with ruling out the first option; pinging him again here.

stnava commented 3 years ago

I don't recall any random number generators that don't have a seed.

dorianps commented 3 years ago

@stnava I don't see any place to set the seed in sparseDecom2.R or sccan.cxx. I did a quick grep on sccan.cxx, and it looks like there is a class imported for random generation here, with uses of randoms here and here. Is there a way to set the random seed that I am missing?

stnava commented 3 years ago

Permutation is supposed to be random. If you want to add debugging of the permutation, you'd have to do something about it in code. One way is to just implement your own permutation wrapper if you don't like the randomization used here.

stnava commented 3 years ago

Anyway, seeds are set internally to zero.

dorianps commented 3 years ago

Well, I am not using permutations in lesymap, so if random generation is used only for that, it should not be the issue. Plus with a fixed seed of 0, all random generation should be identical.

Thanks @stnava .

dorianps commented 3 years ago

@Aaron-Boes just an FYI that this issue has not been out of my mind; I still don't have the time to dedicate to a thorough investigation. I hope this has not caused too much uncertainty in your research conclusions. The differences in p values were not huge, if I remember correctly.

I will keep it in mind and get to it as soon as I can.