Closed dgai91 closed 14 hours ago
Chances are you've just converged very quickly due to the low number of KL annealing epochs. Normally, we advise n_kl_annealing_epochs = n_epochs // 2
.
Note, you have a good number of batches, so probably don't need to train for that many epochs to converge. I would try something like n_epochs=50
, n_kl_annealing_epochs=25
and check the loss curve. You should see it increase for the first half (due to the KL annealing) then start decreasing in the second. It should hopefully converge to a value before the 50th epoch.
Thank you Dr cgohil!. I have been trying the new param which you suggested.
In addition, I have some questions about the use of data, if you want to draw the FCN for each state, there is a problem here, that is, the absolute strength of each side of the FCN may be large, not the range of -1 to 1 like static analysis. For this problem, how do you think it is appropriate to deal with, so as to ensure that it is more acceptable in neuroscience.
Finally, regarding normalization of the mixing coefficients, should I understand that when statistical analysis of the mixing coefficients is performed, the normalized mixing coefficients should be used, rather than the original mixing coefficients? In this case, the following question arises: *_when calculating FC at a certain time, should the FC be normalized using the method mentioned in the previous question and then weighted using the normalized alpha, or should the original FC alpha be used directly?_**
In addition, I have some questions about the use of data, if you want to draw the FCN for each state, there is a problem here, that is, the absolute strength of each side of the FCN may be large, not the range of -1 to 1 like static analysis. For this problem, how do you think it is appropriate to deal with, so as to ensure that it is more acceptable in neuroscience.
This is likely due to the network being a covariance matrix rather than correlation, you could use the following function to normalise it into a correlation matrix:
https://github.com/OHBA-analysis/osl-dynamics/blob/main/osl_dynamics/array_ops.py#L64
Finally, regarding normalization of the mixing coefficients, should I understand that when statistical analysis of the mixing coefficients is performed, the normalized mixing coefficients should be used, rather than the original mixing coefficients? In this case, the following question arises: when calculating FC at a certain time, should the FC be normalized using the method mentioned in the previous question and then weighted using the normalized alpha, or should the original FC * alpha be used directly?
Whether to use the renormalised or raw alphas depends on what you're trying to do. If you want to reconstruct a network I think the raw alphas would be more appropriate.
Thank you Dr Cgohil ! your explaination is very clearly. Recently, I encountered another problem when using Dynemo, which is that I cannot reproduce my previous results. Is this because of my random seed setting? However, after researching related materials, I found that for TensorFlow, setting the following random seed can fix the results:
def set_seeds(seed):
tf.random.set_seed(seed)
np.random.seed(seed)
os.environ['PYTHONHASHSEED'] = str(seed)
random.seed(seed)
Although I fixed the random seed through the above functions, when I train the model using the following code, the fluctuation of max(mean(alpha)) is extremely large, with the range of the largest alpha ranging from 0.16 to 0.9. When the alpha of a certain state is 0.9, the model is equivalent to being static.
set_seeds(800)
config = Config(
...
)
print("Reading resting-fMRI data")
dataset_dir = "../data/hcp_fmri/ica30/"
training_data = data.Data(dataset_dir, load_memmaps=False)
methods = {"standardize": {}, "tde_pca": {"n_embeddings": 5, "n_pca_components": 30}}
training_data.prepare(methods)
training_dataset = training_data.dataset(
config.sequence_length,
config.batch_size,
shuffle=True,
concatenate=True,
)
prediction_files = dataset_dir
prediction_data = data.Data(
prediction_files,
load_memmaps=False
)
prediction_data.prepare(methods)
prediction_dataset = prediction_data.dataset(
config.sequence_length, config.batch_size, shuffle=False, concatenate=False
)
model = Model(config)
# init_history = model.random_subset_initialization(training_dataset, n_epochs=1, n_init=5, take=0.5)
history = model.fit(training_dataset)
model.save(f"{output_dir}/model")
model = load(f"{output_dir}/model")
a = model.get_alpha(prediction_dataset, concatenate=False, remove_edge_effects=True)
mean_a = np.mean(np.concatenate(a), axis=0)
print("mean_a:", mean_a)
I understand that your research focuses on using EEG data, but I think that under similar loss conditions, the fluctuation in the calculation of alpha should not be so large. I would appreciate it if you could give me some advice. Thank you very much, and once again, I apologize for disturbing you.I will post my complete code below.
import os
import numpy as np
from osl_dynamics import data, inference
from osl_dynamics.inference import tf_ops
from osl_dynamics.models import load
from osl_dynamics.utils import plotting
from osl_dynamics.models.dynemo import Config, Model
import pickle
from osl_dynamics.analysis import modes, power, workbench, connectivity
import tensorflow as tf
import random
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
output_dir = "../dynemo_results"
os.makedirs(output_dir, exist_ok=True)
def set_seeds(seed):
tf.random.set_seed(seed)
np.random.seed(seed)
os.environ['PYTHONHASHSEED'] = str(seed)
random.seed(seed)
set_seeds(800)
config = Config(
n_modes=12,
n_channels=30,
sequence_length=20,
inference_n_units=32,
inference_normalization='batch',
inference_n_layers=2,
model_n_units=8,
model_normalization='batch',
model_n_layers=1,
learn_alpha_temperature=True,
initial_alpha_temperature=1.0,
learn_means=False,
learn_covariances=True,
do_kl_annealing=True,
kl_annealing_curve="tanh",
kl_annealing_sharpness=5,
n_kl_annealing_epochs=25,
batch_size=64,
learning_rate=0.01,
n_epochs=50,
loss_calc='mean'
)
print("Reading resting-fMRI data")
dataset_dir = "../data/hcp_fmri/ica30/"
training_data = data.Data(dataset_dir, load_memmaps=False)
methods = {"standardize": {}, "tde_pca": {"n_embeddings": 5, "n_pca_components": 30}}
# methods = {"standardize": {}, "pca": {"n_pca_components": 20}}
training_data.prepare(methods)
training_dataset = training_data.dataset(
config.sequence_length,
config.batch_size,
shuffle=True,
concatenate=True,
)
prediction_files = dataset_dir
prediction_data = data.Data(
prediction_files,
load_memmaps=False
)
prediction_data.prepare(methods)
prediction_dataset = prediction_data.dataset(
config.sequence_length, config.batch_size, shuffle=False, concatenate=False
)
model = Model(config)
# init_history = model.random_subset_initialization(training_dataset, n_epochs=1, n_init=5, take=0.5)
history = model.fit(training_dataset)
model.save(f"{output_dir}/model")
model = load(f"{output_dir}/model")
a = model.get_alpha(prediction_dataset, concatenate=False, remove_edge_effects=True)
pickle.dump(a, open(f"{output_dir}/alp.pkl", "wb"))
means, covariances = model.get_means_covariances()
print(means.shape, covariances.shape)
np.save(f"{output_dir}/means.npy", means)
np.save(f"{output_dir}/covs.npy", covariances)
# means = np.load(f"{output_dir}/means.npy")
# covariances = np.load(f"{output_dir}/covs.npy")
# a = pickle.load(open(f"{output_dir}/alp.pkl", "rb"))
mean_a = np.mean(np.concatenate(a), axis=0)
print("mean_a:", mean_a)
plotting.plot_alpha(a[0], filename=f"{output_dir}/alpha.png")
tr_covariances = np.trace(covariances, axis1=1, axis2=2)
a_NW = [tr_covariances[np.newaxis, ...] * alp for alp in a]
a_NW = [alp_NW / np.sum(alp_NW, axis=1)[..., np.newaxis] for alp_NW in a_NW]
plotting.plot_alpha(a_NW[0], filename=f"{output_dir}/alpha_nw.png")
power_map = modes.raw_covariances(
covariances,
pca_components=training_data.pca_components,
n_embeddings=training_data.n_embeddings,
)
# Save the power maps
connectivity.save(
power_map,
parcellation_file="../../roi_labels/ica_components/canica_components_30ic.nii.gz",
plot_kwargs={
"edge_cmap": "Reds",
"display_mode": "xz",
"annotate": False
},
filename=f"{output_dir}/connectivity.png",
threshold=0.95
)
print(max(mean_a), 800)
Hi, Dr Cgohil: I'm sorry to bother you again. These days, I have been trying to train dynemo model on HCP data. I have got 2000 fmri timeseries which reconstructed by ICA(30ic, emb=15 pca=60 for tde) components like other paper provided methods. And I tried config loss_calc="mean"
and this is my loss
Finally, the value of the loss function can be reduced to 104(epochs=120), but since I have no experience training VAE, I am not sure if this loss is normal, so I would appreciate some guidance from you.