fahrenfort / ADAM

the Amsterdam Decoding And Modeling (ADAM) toolbox

What is the difference between results with different numbers of cfg.filenames? #138

Closed amazinger13 closed 3 years ago

amazinger13 commented 3 years ago

Hi Johannes, I ran my data through the ADAM toolbox twice. The first time I listed all subjects' filenames in cfg.filenames; the second time I listed only one subject. The results for that same subject turned out to be quite different. Why would this happen? Thanks.

fahrenfort commented 3 years ago

Dear amazinger13, sorry but you have to describe more clearly what your problem is. Any reanalysis will yield a slightly different result at the level of single subjects due to a different randomization of folds (if you do a k-fold analysis). This is normal, although at a group level these differences should be minimal. Naturally there will also be differences between a single subject analysis and a group analysis, but I assume that's not what you mean. Without a much clearer description of your problem I can't help you. Cheers, Johannes
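The point about fold randomization can be illustrated with a minimal sketch. It does not use ADAM or its API: the data are synthetic, the classifier is a toy nearest-mean rule, and the names `make_data` and `kfold_accuracy` are made up for this illustration. It only shows that re-running the same k-fold analysis on the same data with a different random fold assignment can change the single-subject accuracy slightly.

```python
import random
import statistics

def make_data(n_per_class=30, seed=0):
    """Synthetic 1-D 'trials' for one hypothetical subject: (feature, label) pairs."""
    rng = random.Random(seed)
    data = [(rng.gauss(0.0, 1.0), 0) for _ in range(n_per_class)]
    data += [(rng.gauss(1.0, 1.0), 1) for _ in range(n_per_class)]
    return data

def kfold_accuracy(data, nfold, fold_seed):
    """k-fold cross-validated accuracy of a nearest-mean classifier.

    fold_seed controls only how trials are shuffled into folds;
    the data themselves are identical on every call.
    """
    rng = random.Random(fold_seed)
    idx = list(range(len(data)))
    rng.shuffle(idx)
    folds = [idx[i::nfold] for i in range(nfold)]  # nfold roughly equal parts
    correct = 0
    for test_idx in folds:
        test_set = set(test_idx)
        train = [data[i] for i in idx if i not in test_set]
        # "Train": estimate the mean feature value of each class
        m0 = statistics.mean(x for x, y in train if y == 0)
        m1 = statistics.mean(x for x, y in train if y == 1)
        # "Test": assign each held-out trial to the nearest class mean
        for i in test_idx:
            x, y = data[i]
            pred = 0 if abs(x - m0) <= abs(x - m1) else 1
            correct += (pred == y)
    return correct / len(data)

data = make_data()
# Same data, same number of folds, five different fold randomizations
accs = [kfold_accuracy(data, nfold=5, fold_seed=s) for s in range(5)]
print(accs)
```

Fixing `fold_seed` makes a run reproducible; whether and how a seed can be fixed in a given toolbox depends on that toolbox and is not shown here.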

amazinger13 commented 3 years ago

Thanks Johannes. The randomization might be the reason for the difference. I raised this question because in your paper (Fahrenfort et al., 2018, Frontiers in Neuroscience) I found this description: "Another way to reduce computation time is by lowering the number of subjects in cfg.filenames, e.g., from 19 to 10 (another 50% reduction). Both these changes (another one is lowering cfg.nfold) can be made in the first-level script in section 2.9, and will have little effect on the qualitative patterns of single-subject and group-level results, although some effects may not reach significance." I understand that lowering cfg.nfold decreases the number of trials in the training set, and thus the power of the classifier, but I wonder why decreasing the number of subjects (in cfg.filenames) would also mean that "some effects may not reach significance."

fahrenfort commented 3 years ago

Hi, that comment was not meant as general advice for analyzing your data more quickly. The article you mention is a tutorial. Students working through it may want to run the analyses faster to save time, because the goal is to get acquainted with the toolbox, not to properly analyze the data. This is why I point out that you can shorten analysis time by lowering the number of folds or the number of subjects in the analysis. It is not something you should do when analyzing your own data. I also point out that some effects may not reach significance when you do that, so the results will not look the same as in the tutorial. The reason is simple: the statistics are group statistics, so if you compute them on fewer subjects, you are less likely to obtain significant p-values. I’m closing this issue. These are not things to discuss on GitHub, which should be limited to discussion of whether the software itself contains bugs or issues.
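The group-statistics argument can be made concrete with a small simulation. This is a sketch under stated assumptions, not ADAM's statistics pipeline: it uses a plain one-sample t-test of subject accuracies against chance (0.5), the true mean accuracy (0.53) and between-subject standard deviation (0.05) are hypothetical, and the critical values are the standard one-tailed t-table entries for alpha = 0.05 with 18 and 9 degrees of freedom. It estimates how often the same underlying effect reaches significance with 19 versus 10 subjects.

```python
import math
import random
import statistics

# One-tailed critical t values at alpha = 0.05 from a standard t-table,
# keyed by number of subjects (df = n - 1): df 18 -> 1.734, df 9 -> 1.833
T_CRIT = {19: 1.734, 10: 1.833}

def one_sample_t(values, chance=0.5):
    """t statistic for the group mean accuracy against chance level."""
    n = len(values)
    return (statistics.mean(values) - chance) / (statistics.stdev(values) / math.sqrt(n))

def estimated_power(n_subjects, true_mean=0.53, sd=0.05, n_sim=2000, seed=0):
    """Fraction of simulated groups in which the effect is significant.

    true_mean and sd are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        accs = [rng.gauss(true_mean, sd) for _ in range(n_subjects)]
        if one_sample_t(accs) > T_CRIT[n_subjects]:
            hits += 1
    return hits / n_sim

power_19 = estimated_power(19)
power_10 = estimated_power(10)
print(power_19, power_10)
```

The underlying effect is identical in both cases; only the number of subjects differs. With fewer subjects the standard error of the group mean is larger and the critical value is higher, so the same effect is detected less often.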

amazinger13 commented 3 years ago

It's very clear now. Thanks a lot for your time.