MICA-MNI / BrainStat

A statistics and context decoding toolbox for neuroimaging.
https://brainstat.readthedocs.io
Other
95 stars 22 forks source link

Wrapping surfstat or built in functions in matlab #5

Closed MICA-MNI closed 4 years ago

MICA-MNI commented 4 years ago

have surfstat work with the main object

ReinderVosDeWael commented 4 years ago

I've been going through some of the SurfStat code. If we actually want to expand on its capabilities then I suspect we'll have to do a major rewrite more akin to what the Python folks are doing, rather than wrapping the existing code. The existing code base is quite badly commented; portions of it were completely unintelligible to me without external manuals on the statistical methods SurfStat was performing. Adding new functionality to this seems like asking for hard-to-find bugs to me.

sofievalk commented 4 years ago

Is it not possible to keep it a bit modular?

ReinderVosDeWael commented 4 years ago

Depends, if all you want is surfstat in a new wrapper then yes. If you want new functionality then we need to be able to build on a foundation where we can understand the implementation details and can debug if something doesn't work as anticipated. This looks to be quite impossible with the way surfstat is written.

sofievalk commented 4 years ago

mm, what most 'economic' time-wise and knowledge transfer-wise

Op wo 15 apr. 2020 om 14:37 schreef Reinder Vos de Wael < notifications@github.com>:

Depends, if all you want is surfstat in a new wrapper then yes. If you want new functionality then we need to be able to build on a foundation where we can understand the implementation details and can debug if something doesn't work as anticipated. This looks to be quite impossible with the way surfstat is written.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/MICA-MNI/BrainStat/issues/5#issuecomment-614012142, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADHRFGNG3L4R76ETAK7KTJDRMWS73ANCNFSM4MH265NQ .

ReinderVosDeWael commented 4 years ago

Programming time wise I'm guessing a rewrite is more efficient. We'll be bug hunting till kingdom come when a weird result pops up based on code that relies heavily on the SurfStat code-base.

How do we feel about using MATLAB toolboxes? They may limit our userbase to those with access to the toolboxes, but on the other hand those looking for an open-source solution can also just go to Python.

MICA-MNI commented 4 years ago

Hi both & many thanks!

This is obviously one important design considerations so its good to reflect on this carefully. While I agree that we can improve on the commenting, many of the SurfStatLinMod, SurfStatT etc steps are equivalent to an introductory text book to generalized linear models so nothing completely out of the box there. The SurfStatP calls stats_threshold, which is indeed heavy but also embedded in gaussian random field theory and Keiths prior work. Based on my own experience with SurfStat, the current vertex wise linear modelling functionality would suffice for 95% of analyses planned so I am not 100% sure this needs to changed/expanded in the first step (and potential users may have experience with SurfStat), so I agree with Sofie that a wrapper may be practical and efficient initially perhaps together with increasing the commenting of SurfStatLinMod etc.

That being said, if another implementation based on built in code can be quickly and accurately done and if this is also more convenient to you Reinder then I won’t stop you. Additional functions can become useful and SurfStat has some limitations at some instances.

Irrespective of this decision, where we have work to do and have space to add original functionality are the context, the PLS/CCA analyses, the object oriented programming, the conversion to python, and putting these all into a unified ‘framework’. A sensible compromise to me could thus be to first set up the overall schema/workflow based on wrapping SurfStat core functions and to keep the potential replacement of this module using a potentially more flexible version based on in house tools on the agenda.

Boris

On Apr 15, 2020, at 8:37 AM, Reinder Vos de Wael notifications@github.com wrote:

Depends, if all you want is surfstat in a new wrapper then yes. If you want new functionality then we need to be able to build on a foundation where we can understand the implementation details and can debug if something doesn't work as anticipated. This looks to be quite impossible with the way surfstat is written.

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/MICA-MNI/BrainStat/issues/5#issuecomment-614012142, or unsubscribe https://github.com/notifications/unsubscribe-auth/AC2WMEALN53JLOIZJG7UPS3RMWS73ANCNFSM4MH265NQ.

sofievalk commented 4 years ago

Hi,

I agree with Boris, would it be helpful, as a compromise, to red-flag any code/script that is undercommented / in the 'danger zone' (or those that are easy/difficult to replace)

sofie

Op wo 15 apr. 2020 om 16:18 schreef Boris Bernhardt < notifications@github.com>:

Hi both & many thanks!

This is obviously one important design considerations so its good to reflect on this carefully. While I agree that we can improve on the commenting, many of the SurfStatLinMod, SurfStatT etc steps are equivalent to an introductory text book to generalized linear models so nothing completely out of the box there. The SurfStatP calls stats_threshold, which is indeed heavy but also embedded in gaussian random field theory and Keiths prior work. Based on my own experience with SurfStat, the current vertex wise linear modelling functionality would suffice for 95% of analyses planned so I am not 100% sure this needs to changed/expanded in the first step (and potential users may have experience with SurfStat), so I agree with Sofie that a wrapper may be practical and efficient initially perhaps together with increasing the commenting of SurfStatLinMod etc.

That being said, if another implementation based on built in code can be quickly and accurately done and if this is also more convenient to you Reinder then I won’t stop you. Additional functions can become useful and SurfStat has some limitations at some instances.

  • For example, in R we can also specify logit/probit functions etc to easily turn generate logitistic regressions etc - does the in-house code allow for such things?
  • are model comparisons done easily (AIC, BIC based?). In SurfStat, we currently can do SurfStatF to F statistics but I am not aware that its possible to get AIC, BIC etc.
  • Are multivariate hoteling’s t tests with more than 3 variables possible and is the l

Irrespective of this decision, where we have work to do and have space to add original functionality are the context, the PLS/CCA analyses, the object oriented programming, the conversion to python, and putting these all into a unified ‘framework’. A sensible compromise to me could thus be to first set up the overall schema/workflow based on wrapping SurfStat core functions and to keep the potential replacement of this module using a potentially more flexible version based on in house tools on the agenda.

Boris

On Apr 15, 2020, at 8:37 AM, Reinder Vos de Wael < notifications@github.com> wrote:

Depends, if all you want is surfstat in a new wrapper then yes. If you want new functionality then we need to be able to build on a foundation where we can understand the implementation details and can debug if something doesn't work as anticipated. This looks to be quite impossible with the way surfstat is written.

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub < https://github.com/MICA-MNI/BrainStat/issues/5#issuecomment-614012142>, or unsubscribe < https://github.com/notifications/unsubscribe-auth/AC2WMEALN53JLOIZJG7UPS3RMWS73ANCNFSM4MH265NQ .

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/MICA-MNI/BrainStat/issues/5#issuecomment-614067721, or unsubscribe https://github.com/notifications/unsubscribe-auth/ADHRFGMJNQYKZUHRJZB2R73RMW63BANCNFSM4MH265NQ .

ReinderVosDeWael commented 4 years ago

It seems like none of the standard linear model matlab packages even have the possibility of including a contrast so we may just be stuck with the SurfStat code. This may limit our options in terms of including things like AIC/BIC/multivariate stats unless someone here has extensive expertise in these and could program them, because that seems well beyond my skillset.

MICA-MNI commented 4 years ago

Great, so we have a winner :)

For potential expansions, lets put this on the list of things we’d like to add to the stats inference module at one point. Extensive expertise is a big word but I think more expanded model comparison things (LRT, AIC, BIC ETC) is something that we can potentially add to the toolkit at one point relatively easily given that we can interrogate the model fit from the slm structure (eg via SSE etc)

happy to help

On Apr 15, 2020, at 12:51 PM, Reinder Vos de Wael notifications@github.com wrote:

It seems like none of the standard linear model matlab packages even have the possibility of including a contrast so we may just be stuck with the SurfStat code. This may limit our options in terms of including things like AIC/BIC/multivariate stats unless someone here has extensive expertise in these and could program them, because that seems well beyond my skillset.

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/MICA-MNI/BrainStat/issues/5#issuecomment-614154316, or unsubscribe https://github.com/notifications/unsubscribe-auth/AC2WMEEU5FB334IKLLJZCZTRMXQXJANCNFSM4MH265NQ.