gregversteeg / bio_corex

A flexible version of CorEx developed for bio-data challenges that handles missing data, continuous/discrete variables, multi-CPU, overlapping structure, and includes visualizations
Apache License 2.0

How to apply CorEx on diffusion MRI data #31

Open rosella1234 opened 8 months ago

rosella1234 commented 8 months ago

Hello, I would like to better understand how to use CorEx to model my data. I have 14 measures derived from diffusion MRI data, and each measure is a 69 (number of subjects) by 2286 (number of voxels) matrix. I want to inspect correlations among these measures, which may be related to each other. I have read the CorEx papers and looked at the Python code. My specific questions are:
• How should the X matrix be built in my case?
• How should the number of hidden factors be chosen?
• How should the dimension of each hidden factor be chosen?
• marginal_description: I guess this must be 'gaussian', since my data is continuous. Is that right?
• smooth_marginals = True (turns on Bayesian smoothing): is this the right setting?

Thank you in advance, Rosella

gregversteeg commented 8 months ago
rosella1234 commented 8 months ago

Hi, thank you very much for your quick response! Everything is clear; I have just one more question. I would like to inspect correlations among all 14 measures together, not within a single measure. How should I concatenate my 14 [69 by 2286] matrices to create the X to feed into CorEx? Thanks in advance, Rosella
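One possible layout for the concatenation asked about here, offered only as a sketch: keep one row per subject and place the 14 measures side by side along the voxel axis, giving a 69 by 14*2286 matrix. The variable names (`measures`, `column_origin`) and the random placeholder data are assumptions for illustration; the thread does not state which layout the maintainer recommends.

```python
import numpy as np

# Sizes taken from the question: 14 measures, 69 subjects, 2286 voxels.
n_measures, n_subjects, n_voxels = 14, 69, 2286

# Placeholder data standing in for the real diffusion measures (hypothetical).
rng = np.random.default_rng(0)
measures = [rng.normal(size=(n_subjects, n_voxels)) for _ in range(n_measures)]

# Concatenate along the voxel axis: rows stay subjects (samples),
# columns become (measure, voxel) pairs (variables).
X = np.concatenate(measures, axis=1)
print(X.shape)


def column_origin(col):
    """Map a column index of X back to (measure index, voxel index)."""
    return divmod(col, n_voxels)
```

With this layout, `column_origin` helps trace any group of columns back to the measure and voxel it came from.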

rosella1234 commented 5 months ago

Good afternoon, I have tried running CorEx on one subject at a time, using a 14*2286 matrix per subject as input, that is, all 14 diffusion measures stacked together, each flattened along its 2286 voxels. It was suggested that I use number of hidden factors = 10 and dimension of each hidden factor = 1 to get one joint representation per subject. However, when I run CorEx this way (see code below), all total correlations and all clusters come out equal to 0. Am I doing something wrong in my implementation? Thanks a lot in advance,

Rosella

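import numpy as np
import corex as ce  # corex.py from the bio_corex repository

# data_matrix is assumed to be loaded beforehand,
# with shape (num_subjects, 14, 2286)
num_subjects = data_matrix.shape[0]

# Set bioCorEx parameters
num_hidden_factors = 10
dim_hidden_factor = 1
marginal_description = 'gaussian'
smooth_marginals = True

# Initialize an empty NumPy array to store the output
output_matrix = np.empty((num_subjects, num_hidden_factors))

# Start at 0 so the first subject's row is filled too
# (range(1, num_subjects) left row 0 uninitialized)
for i in range(num_subjects):
    # Initialize bioCorEx object
    corex = ce.Corex(n_hidden=num_hidden_factors,
                     dim_hidden=dim_hidden_factor,
                     marginal_description=marginal_description,
                     smooth_marginals=smooth_marginals)
    # Fit bioCorEx model on this subject's 14 x 2286 matrix
    corex.fit(data_matrix[i, :, :])
    print(corex.tcs)
    print(corex.clusters)
    # Store the total correlation of each hidden factor
    output_matrix[i, :] = corex.tcs

np.save('output_matrix.npy', output_matrix)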
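
A possible explanation for the zeros, stated as an assumption rather than a confirmed diagnosis: if dim_hidden is the number of discrete values each hidden unit can take, then dim_hidden = 1 gives every hidden factor a single state. A one-state variable has zero entropy, so it cannot share any information with the data, and every total correlation would be 0. The sketch below only illustrates that entropy bound in plain Python; it does not call bio_corex, and the helper names are hypothetical.

```python
import math


def entropy_nats(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return sum(-p * math.log(p) for p in probs if p > 0)


def max_entropy_nats(n_states):
    """Largest entropy a variable with n_states outcomes can have."""
    return math.log(n_states)


# The information a hidden factor shares with the data is bounded by its entropy.
for n_states in (1, 2, 3):
    uniform = [1.0 / n_states] * n_states
    print(n_states, max_entropy_nats(n_states), entropy_nats(uniform))
```

Under that assumption, a value of at least 2 for dim_hidden would be needed before any nonzero total correlation is possible.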