GidLev / cepy

Implementation of the connectome embedding framework
MIT License

sklearn scale was not recognized within the bracket #3

Open pinghongyeh opened 3 months ago

pinghongyeh commented 3 months ago

Hi Gidon,

I am testing the ce_prediction script, but it fails in the following loop:

for node_i in tqdm(np.arange(num_nodes)): # loop over all nodes
    # a boolean vector to remove the diagonal as a feature
    exclude_diag = np.ones(num_nodes, dtype=bool)
    exclude_diag[node_i] = False

    inputs = [x_ce[:,node_i,:], x_cm[:,node_i,exclude_diag], x_sc[:,node_i,exclude_diag]] # select a node
    inputs = [scale(x) for x in inputs] # standardize the inputs (mean = 0, std = 1)

    for input_i, (input_name, input_x, input_df) in enumerate(zip(input_names, inputs, dfs)): # loop over the three input types
        for ind_kf, (train_index, test_index) in enumerate(kf.split(input_x)): # the 5-fold iterator
            curr_lm = clone(lm)
            curr_lm.fit(input_x[train_index, :], y[train_index])
            y_pred_cv[input_i, test_index, node_i] = curr_lm.predict(input_x[test_index, :])

        # keep the pearson correlation coefficient between the real and predicted age
        input_df.loc[node_i, 'correlation_coef'], _ = stats.pearsonr(y, y_pred_cv[input_i, :, node_i])

NameError: name 'scale' is not defined

I have run the following imports:

from matplotlib import pyplot as plt
from matplotlib.colors import rgb2hex
import numpy as np
import seaborn as sns
import pandas as pd
from scipy import stats
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.preprocessing import scale
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression
from sklearn.linear_model import SGDRegressor
from sklearn.base import clone
from sklearn.metrics import explained_variance_score
from tqdm import tqdm

It seems that scale(x) is not recognized inside the brackets [].

Do you have any suggestions? Thank you. Ping

DsgGidL commented 3 months ago

Hi @pinghongyeh thanks for spotting the issue, here is a quick fix:

from sklearn.preprocessing import scale

inputs = [scale(x) for x in inputs] # standardize the inputs (mean = 0, std = 1)
GidLev commented 3 months ago

https://github.com/GidLev/cepy/commit/b97053c89a6f3017ea79b16f74667d747c55db0e fixed in ce_prediction.ipynb

pinghongyeh commented 3 months ago

Sorry, I am a novice. I do not see any change; the error still exists.

Input In [94], in <listcomp>(.0)
     13     exclude_diag[node_i] = False
     15     inputs = [x_ce[:,node_i,:], x_cm[:,node_i,exclude_diag], x_sc[:,node_i,exclude_diag]] # select a node
---> 16     inputs = [scale(x) for x in inputs] # standardize the inputs (mean = 0, std = 1)
     18     for input_i, (input_name, input_x, input_df) in enumerate(zip(input_names, inputs, dfs)): # loop over the three input types
     19         for ind_kf, (train_index, test_index) in enumerate(kf.split(input_x)): # the 5-fold iterator

NameError: name 'scale' is not defined

Thank you.

DsgGidL commented 3 months ago

The fix was in the imports section. Please run this import line before the scaling operation: from sklearn.preprocessing import scale

Let me know if it solved it.
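
To check the fix in isolation, a minimal sketch along the following lines may help. It uses small random arrays as stand-ins for x_ce, x_cm and x_sc (the sizes and the random data are assumptions, not the notebook's real inputs) and only verifies that, once the import has run, the list comprehension standardizes each input column to mean 0 and std 1.

```python
import numpy as np
from sklearn.preprocessing import scale  # must be executed before scale() is used

rng = np.random.default_rng(0)
num_subjects, num_nodes, ce_dim = 10, 4, 3  # hypothetical sizes, for illustration only

# Random stand-ins for the notebook's arrays (not the real data)
x_ce = rng.normal(size=(num_subjects, num_nodes, ce_dim))
x_cm = rng.normal(size=(num_subjects, num_nodes, num_nodes))
x_sc = rng.normal(size=(num_subjects, num_nodes, num_nodes))

node_i = 0
# boolean vector that removes the diagonal entry as a feature
exclude_diag = np.ones(num_nodes, dtype=bool)
exclude_diag[node_i] = False

inputs = [x_ce[:, node_i, :], x_cm[:, node_i, exclude_diag], x_sc[:, node_i, exclude_diag]]
inputs = [scale(x) for x in inputs]  # standardize each column (mean = 0, std = 1)

for x in inputs:
    # each column should now have mean ~0 and std ~1
    print(x.shape, np.allclose(x.mean(axis=0), 0.0), np.allclose(x.std(axis=0), 1.0))
```

If this runs but the notebook still raises the NameError, the import cell was most likely not executed in the current kernel session, for example after a kernel restart.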

pinghongyeh commented 3 months ago

The "scale" problem is solved. I have other questions, if you don't mind. In the "ce_subjects_pipeline" script, I was not able to find "NKI_200_schaefer_sc_train_consensus_mat.npy". How was it created? In addition, are the scripts for the deep learning algorithms described in section "2.12. Predicting group-level functional from structural connectivity with deep learning" of the paper available to the public?

Thank you.

DsgGidL commented 3 months ago

Hi,

The consensus matrix was generated using Betzel et al. (2018) distance-dependent consensus thresholds. I don't think I'll manage to find the file you are referring to but you can generate it using this code: https://github.com/GidLev/consensus-thresholding

As for "Predicting group-level functional from structural connectivity with deep learning", here is the relevant code: https://gist.github.com/GidLev/0a5bb6ddef7df1a49b430ec3957cdc9e

pinghongyeh commented 3 months ago

Hi Gidon, thank you for providing the code. The distance matrix is a required input for fcn_group_bins. How can it be created from the example data, e.g. NKI_200_schaefer_sc_matrices.npz?

Ping
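
One possible approach, not confirmed by the repository author, is to take the distance matrix to be the pairwise Euclidean distance between parcel centroids. The sketch below assumes that centroid coordinates for the 200 Schaefer parcels are available as a (200, 3) array from the atlas itself; whether the example .npz file contains them is not stated in this thread. The coords array here is a random placeholder, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes = 200

# Placeholder centroids in mm; replace with the real parcel centroid
# coordinates of the atlas (assumed to be available separately).
coords = rng.uniform(-70.0, 70.0, size=(num_nodes, 3))

# Pairwise Euclidean distances via broadcasting: entry (i, j) is the
# distance between the centroids of parcels i and j.
diff = coords[:, None, :] - coords[None, :, :]
dist = np.linalg.norm(diff, axis=-1)

print(dist.shape)  # (200, 200)
```

The result is a symmetric matrix with a zero diagonal. Fiber length between regions is sometimes used instead of Euclidean distance, so it is worth checking which definition the consensus-thresholding code expects.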