[ENH] Implement pure-Python ICA with mixture modeling

tsalo commented 4 years ago

Closes #1.

TODO:

[x] Incorporate license and attribution info for nipy's GGM.
[x] Convert pseudocode to real code (this mostly means supporting images in addition to arrays and getting computefeats2 working).
[ ] Compare mixture modeling results to MELODIC.
[ ] Add tests.
[x] Integrate changes into workflow.

Changes proposed in this pull request:

Add GGM class from nipy for mixture modeling.
Use tedana for PCA + ICA.

tsalo commented 4 years ago

The results are very different. For one thing, the MELODIC outputs include both positive and negative values, while the GGM outputs only include positive values. The GGM from nipy cannot place negative values in the Gamma distribution, which is understandable because the Gamma is a positive distribution.

Here is an example using the data in our current (at least in this branch) tests folder:

MELODIC

Python

tsalo commented 4 years ago

It's probable that MELODIC does additional processing on the maps before thresholding them, but it's also possible that the core mixture model procedure is different. I honestly don't know how to figure out either problem, but having a pure Python ICA approach that satisfies the AROMA classifier is crucial for non-fMRIPrep users (e.g., folks who want to supplement their tedana classification with AROMA).

TL;DR: I'm stumped.

eurunuela commented 3 years ago

It's probable that MELODIC does additional processing on the maps before thresholding them, but it's also possible that the core mixture model procedure is different. I honestly don't know how to figure out either problem, but having a pure Python ICA approach that satisfies the AROMA classifier is crucial for non-fMRIPrep users (e.g., folks who want to supplement their tedana classification with AROMA).

I think we should have a hackathon and look at the small details in MELODIC! I'm currently busy but I might find some time for this in the near future.

TL;DR: I'm stumped.

We'll figure it out.

eurunuela commented 3 years ago

Hi @tsalo . I've just been looking through the MELODIC code. Here's a summary of what MELODIC is doing:

So, for each of the IC maps, it's doing:

for(int ctr=1; ctr <= melodat.get_IC().Nrows(); ctr++){
    MelGMix mixmod(opts, logger);

    message("  IC map " << ctr << " ... "<< endl;);

    Matrix ICmap;
    if(melodat.get_stdNoisei().Storage()>0)
        dbgmsg(" stdNoisei max : "<< melodat.get_stdNoisei().Maximum() <<" "<< melodat.get_stdNoisei().Minimum() << endl);

If the data is normalized to std of 1, it adds the standard deviation back.

    if(opts.varnorm.value()&&melodat.get_stdNoisei().Storage()>0){
      ICmap = SP(melodat.get_IC().Row(ctr),diagvals(ctr)*melodat.get_stdNoisei());
    }else{
      ICmap = melodat.get_IC().Row(ctr);
    }
    string wherelog;
    if(opts.genreport.value())
      wherelog = report.getDir();
    else
      wherelog = logger.getDir();

It calculates the mixture-model fit with a GGM and it saved the probability map and the model fit.


        dbgmsg(" ICmap max : "<< mean(ICmap,2).AsScalar() << endl);
    mixmod.setup( ICmap,
          wherelog,ctr,
          melodat.get_mask(), 
          melodat.get_mean(),3);
    message("   calculating mixture-model fit "<<endl);
    mixmod.fit("GGM");

    if(opts.output_MMstats.value()){
      message("   saving probability map:  ");
      melodat.save4D(mixmod.get_probmap(),
             string("stats/probmap_")+num2str(ctr));
      message("   saving mixture model fit:");
      melodat.saveascii(mixmod.get_params(),
             string("stats/MMstats_")+num2str(ctr));  
    }

It z-scores the spatial maps.

    //re-scale spatial maps to mean 0 for nht
    if(opts.rescale_nht.value()){
      message("   re-scaling spatial maps ... "<< endl); 
      RowVector tmp;
      tmp = mixmod.get_means();
      float dmean  = tmp(1);
      tmp = mixmod.get_vars();
      float dstdev = sqrt(tmp(1));

      tmp = (mixmod.get_means() - dmean)/dstdev;
      mixmod.set_means(tmp);
      tmp = (mixmod.get_vars() / (dstdev*dstdev));
      mixmod.set_vars(tmp);

      //tmp = (mixmod.get_data() - dmean)/dstdev;
      tmp = (ICmap - dmean)/dstdev;
      mixmod.set_data(tmp);
      //if(opts.varnorm.value()&&melodat.get_stdNoisei().Storage()>0)
      //    tmp = SP(tmp,pow(diagvals(ctr)*melodat.get_stdNoisei(),-1));

      melodat.set_IC(ctr,tmp);
    }

It perfmors smoothing on the probability maps.

    if(opts.smooth_probmap.value()<0.0){
      message("   smoothing probability map ... "<< endl);
      mixmod.smooth_probs(0.5*(std::min(std::min(std::abs(melodat.get_mean().xdim()),std::abs(melodat.get_mean().ydim())),std::abs(melodat.get_mean().zdim()))));  
    }

    if(opts.smooth_probmap.value()>0.0){
      message("   smoothing probability map ... "<< endl);
      mixmod.smooth_probs(opts.smooth_probmap.value());
    }

It thresholds and flips the map if there are more negative values than positive ones.

    message("   thresholding ... "<< endl);
    mixmod.threshold(opts.mmthresh.value());  

    Matrix tmp;
    tmp=(mixmod.get_threshmaps().Row(1));
    float posint = SP(tmp,gt(tmp,zeros(1,tmp.Ncols()))).Sum();
    float negint = -SP(tmp,lt(tmp,zeros(1,tmp.Ncols()))).Sum();

    if((posint<0.01)&&(negint<0.01)){
      mixmod.clear_infstr();
      mixmod.threshold("0.05n "+opts.mmthresh.value());
      posint = SP(tmp,gt(tmp,zeros(1,tmp.Ncols()))).Sum();
      negint = -SP(tmp,lt(tmp,zeros(1,tmp.Ncols()))).Sum();
    }
    if(negint>posint){//flip map
      //  melodat.flipres(ctr);
      //  mixmod.flipres(ctr);
    }

It saved the results.

    //save mixture model stats 
    if(opts.output_MMstats.value()){
      stats << " IC " << num2str(ctr) << " " << mixmod.get_type() << endl
        << " Means :  " << mixmod.get_means() << endl
        << " Vars. :  " << mixmod.get_vars()  << endl
        << " Prop. :  " << mixmod.get_pi()    << endl << endl;
      message("   saving thresholded Z-stats image:");
      melodat.save4D(mixmod.get_threshmaps(),
             string("stats/thresh_zstat")+num2str(ctr));
    }

    //save mmpars
    // mmpars((ctr-1)*5+1,1) = ctr;
        //     if(mixmod.get_type()=="GGM")
        //       mmpars((ctr-1)*5+1,2) = 1.0;
        //     else
        //       mmpars((ctr-1)*5+1,2) = 0.0;
        //     mmpars((ctr-1)*5+1,2) = mixmod.get_means().Ncols();
        //     tmp =  mixmod.get_means();
        //     for(int ctr2=1;ctr2<=mixmod.get_means().Ncols();ctr2++)
        //       mmpars((ctr-1)*5+2,ctr2) = tmp(1,ctr2);
        //     tmp =  mixmod.get_vars();
        //     for(int ctr2=1;ctr2<=mixmod.get_vars().Ncols();ctr2++)
        //       mmpars((ctr-1)*5+3,ctr2) = tmp(1,ctr2);
        //     tmp =  mixmod.get_pi(); 
        //     for(int ctr2=1;ctr2<=mixmod.get_pi().Ncols();ctr2++)
        //       mmpars((ctr-1)*5+4,ctr2) = tmp(1,ctr2);
        //     mmpars((ctr-1)*5+5,1) = mixmod.get_offset();

It creates the report.

    if( bool(opts.genreport.value()) ){
      message("   creating report page ... ");
      report.IC_rep(mixmod,ctr,melodat.get_IC().Nrows(),melodat.get_ICstats());     
      message("   done" << endl);
    }
  }

eurunuela commented 3 years ago

It's probable that MELODIC does additional processing on the maps before thresholding them, but it's also possible that the core mixture model procedure is different.

So, to answer to this: yes, MELODIC is smoothing before thresholding. Here's the actual smoothing that it performs:

  template <class T>
  volume<T> smooth(const volume<T>& source, float sigma_mm)
    {
      float sigmax, sigmay, sigmaz;
      sigmax = sigma_mm/source.xdim();
      sigmay = sigma_mm/source.ydim();
      sigmaz = sigma_mm/source.zdim();
      int nx=((int) (sigmax-0.001))*2 + 3;
      int ny=((int) (sigmay-0.001))*2 + 3;
      int nz=((int) (sigmaz-0.001))*2 + 3;
      ColumnVector kernelx, kernely, kernelz;
      kernelx = gaussian_kernel1D(sigmax,nx);
      kernely = gaussian_kernel1D(sigmay,ny);
      kernelz = gaussian_kernel1D(sigmaz,nz);
      return convolve_separable(source,kernelx,kernely,kernelz);
    }

As you can see, it is just a Gaussian kernel applied to each of the dimensions.

tsalo commented 2 years ago

Per #54, I'm going to resolve the conflicts in this PR and then switch the target branch to a new one, so that @eurunuela can work on it directly.

tsalo commented 2 years ago

Changes from the merge:

Remove the MNI registration function.
Include some hardcoded variables (e.g., run_ica) just to deal with problems in the merge, since we dropped the ICA step in #48.
Include mapca and tedana as dependencies. Hopefully there's a way around that.

tsalo commented 2 years ago

Merging into mixture now. @eurunuela, you can open a new PR from mixture to main when you're ready.

ME-ICA / aroma

[ENH] Implement pure-Python ICA with mixture modeling #22

MELODIC

Python