jameschapman19 / scikit-palm

Implementing permutation methods for multiview learning in python
BSD 3-Clause "New" or "Revised" License
4 stars 0 forks source link

Quickperms Index is out of bounds #2

Closed JohannesWiesner closed 2 years ago

JohannesWiesner commented 2 years ago

Hi James, as you probably know the Human Connectome Project contains a lot of related subjects. For the CCA analysis, I need to have permutation/bootstrap design that respects the grouping structure of the dataset. So thanks for implementing PALM in python! I tried to play around with quickperm in order to reproduce something like FSLs hcp2blocks.m function. I am not exactly sure how I would like to set up the permutation design but I just wanted to play around with it for the beginning. For that I first loaded the restricted .csv file as a pandas dataframe, converted the family ids to a (N,1) array and tried to use quickperm on that. However, I am getting the following error:

  File "C:\Users\Johannes.Wiesner\Documents\projects\hcp_project\code\test_permutation_with_stratification.py", line 24, in <module>
    perms = quickperms(design_matrix=None,exchangeability_blocks=family_ids,perms=10)

  File "C:\Users\Johannes.Wiesner\Documents\projects\hcp_project\code\../repos/skpalm\skpalm\permutations\quickperms.py", line 80, in quickperms
    permutation_tree = tree(exchangeability_blocks, design_matrix)

  File "C:\Users\Johannes.Wiesner\Documents\projects\hcp_project\code\../repos/skpalm\skpalm\permutations\utils\tree.py", line 14, in tree
    permutation_tree[0][0], permutation_tree[0][2] = maketree(

  File "C:\Users\Johannes.Wiesner\Documents\projects\hcp_project\code\../repos/skpalm\skpalm\permutations\utils\tree.py", line 25, in maketree
    B1 = exchangeability_blocks[:, 0]

IndexError: index 0 is out of bounds for axis 1 with size 0

Here's some code to reproduce the error (of course you need to have access to the restricted .csv file, but I guess you have?):

# 1.) load the restricted .csv file as pandas dataframe....

# 2.) convert family ids from human connectome project to integers
family_ids = hcp_df['Family_ID']
family_dict = dict([(y,x+1) for x,y in enumerate(sorted(set(family_ids)))])
family_ids = np.array([family_dict[k] for k in family_ids]).reshape(-1,1)

# 3.) Run quickperms
perms = quickperms(design_matrix=None,exchangeability_blocks=family_ids,perms=10)
jameschapman19 commented 2 years ago

The quick and simple answer is that this module is intended to operate on PALM exchangeability blocks as in:

https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/PALM/ExchangeabilityBlocks#:~:text=If%20there%20are%20restrictions%20on%20exchangeability%2C%20blocks%20of,This%20file%20is%20supplied%20with%20the%20option%20-eb.

Which I obviously need to document here!

jameschapman19 commented 2 years ago

There is a section there specifically on how to get HCP exchangeability blocks i.e. by running the hcp2blocks.m

For HCP it's probably reinventing the wheel to rewrite hcp2blocks in python which is why I didn't.

JohannesWiesner commented 2 years ago

Aha, so one would have to run hcp2blocks.m first on the restricted .csv, then load in the EB.csv as numpy array and pass that to quickperms if I get that right?

jameschapman19 commented 2 years ago

Exactly

JohannesWiesner commented 2 years ago

Alrighty, thank you! :) Feel free to close unless you want to keep it open as doc enhancement :)

JohannesWiesner commented 2 years ago

Hmm, I ran hcp2blocks.m on my specific subset of the HCP (424 subjects). I then read in EB.csv as numpy array and tried to use the following lines:

import pandas as pd
import numpy as np

import sys
sys.path.append('../repos/skpalm')

from skpalm.permutations import quickperms

eb = np.loadtxt('../data/EB.csv',delimiter=',',dtype=int) # which is now a (423,5) numpy integer array
perms = quickperms(design_matrix=None,exchangeability_blocks=eb,perms=10)

but I am getting the following error:

File "C:\Users\Johannes.Wiesner\Documents\projects\hcp_project\code\test_skpalm.py", line 22, in <module>
    perms = quickperms(design_matrix=None,exchangeability_blocks=eb,perms=10)

  File "C:\Users\Johannes.Wiesner\Documents\projects\hcp_project\code\../repos/skpalm\skpalm\permutations\quickperms.py", line 80, in quickperms
    permutation_tree = tree(exchangeability_blocks, design_matrix)

  File "C:\Users\Johannes.Wiesner\Documents\projects\hcp_project\code\../repos/skpalm\skpalm\permutations\utils\tree.py", line 14, in tree
    permutation_tree[0][0], permutation_tree[0][2] = maketree(

  File "C:\Users\Johannes.Wiesner\Documents\projects\hcp_project\code\../repos/skpalm\skpalm\permutations\utils\tree.py", line 40, in maketree
    design_matrix[idx, :],

IndexError: too many indices for array: array is 1-dimensional, but 2 were indexed

Here's how eb looks like (with the last column being the subject ids)

quickperm_bug

jameschapman19 commented 2 years ago

Thanks again for this. Looks like I didn't translate the defaulting of design matrix to 1:N if design matrix is None.

Just pushing a changed version which I think fixes this?

Unrelated: If you're interested I'm happy to make you a collaborator on this repo.

JohannesWiesner commented 2 years ago

Perfect, works now! Only got a RuntimeWarning but everything seems to work alright :)

Number of possible permutations is exp([735.22422275]).

C:\Users\Johannes.Wiesner\Documents\projects\hcp_project\code\../repos/skpalm\skpalm\permutations\utils\shuftree.py:26: RuntimeWarning: overflow encountered in exp
  maxP = np.exp(lmaxP)

You can make me a collaborator, I just cannot guarantee to be of great help (again, not the best statistician here but at least a good bug-hunter and happy help with doc stuff, etc.)

jameschapman19 commented 2 years ago

Ah great news that it works!

I've invited you but without pressure to do anything! Just in case you come across any bugs/want to implement anything else from PALM/have any miscellaneous ideas.

I'd shelved this project until your comments so you've already made a contribution :)