AI4HealthUOL / SSSD

Repository for the paper: 'Diffusion-based Time Series Imputation and Forecasting with Structured State Space Models'
MIT License

Problems encountered when using the HippoSSKernel function of the S4 module #25

Open Ssw2001 opened 3 weeks ago

Ssw2001 commented 3 weeks ago

Hi, sorry to bother you. While running the S4 module, I hit an error in the nplr function called from HippoSSKernel, as shown below. When I remove this module, everything runs smoothly, so I suspect the issue lies within the S4 module itself. Is this problem caused by incorrect parameter settings, or is the code incompatible with my PyTorch version 3.7.10? I am very much looking forward to your reply, as this is very important to me. Thank you!

CUDA extension for cauchy multiplication not found. Install by going to extensions/cauchy/ and running python setup.py install. This should speed up end-to-end training by 10-50%
Falling back on slow Cauchy kernel. Install at least one of pykeops or the CUDA extension for efficiency.
{'diffusion_config': {'T': 200, 'beta_0': 0.0001, 'beta_T': 0.02}, 'wavenet_config': {'in_channels': 1, 'out_channels': 1, 'num_res_layers': 36, 'res_channels': 256, 'skip_channels': 256, 'diffusion_step_embed_dim_in': 128, 'diffusion_step_embed_dim_mid': 512, 'diffusion_step_embed_dim_out': 512, 's4_lmax': 1000, 's4_d_state': 64, 's4_dropout': 0.0, 's4_bidirectional': 1, 's4_layernorm': 1, 'label_embed_dim': 128, 'label_embed_classes': 71}, 'train_config': {'output_directory': '/home/sunsw/sunshiwen/work/SSSD-ECG/src/sssd/sssd_label_cond', 'ckpt_iter': 'max', 'iters_per_ckpt': 400, 'iters_per_logging': 10, 'n_iters': 10000, 'learning_rate': 0.0002, 'batch_size': 6}, 'trainset_config': {'segment_length': 1000, 'sampling_rate': 100, 'finetune_dataset': 'ptbxl_all', 'data_path': '/home/sunsw/sunshiwen/work/SSSD-ECG/src/ptb_xl/processed-data/test_ptbxl_1000.npy'}, 'gen_config': {'output_directory': '/home/sunsw/sunshiwen/work/SSSD-ECG/src/sssd/sssd_label_cond', 'ckpt_path': '/home/sunsw/sunshiwen/work/SSSD-ECG/src/sssd/sssd_label_cond/'}}
output directory /home/sunsw/sunshiwen/work/SSSD-ECG/src/sssd/sssd_label_cond/ch256_T200_betaT0.02_train
Traceback (most recent call last):
  File "mytrain1.py", line 212, in <module>
    train(train_config)
  File "mytrain1.py", line 53, in train
    net = mySSD_ECG1(model_config).cuda()
  File "/home/sunsw/sunshiwen/work/SSSD-ECG/src/sssd/models/mySSD_ECG1.py", line 258, in __init__
    label_embed_dim=label_embed_dim if label_embed_classes > 0 else None)
  File "/home/sunsw/sunshiwen/work/SSSD-ECG/src/sssd/models/mySSD_ECG1.py", line 207, in __init__
    label_embed_dim=label_embed_dim))
  File "/home/sunsw/sunshiwen/work/SSSD-ECG/src/sssd/models/mySSD_ECG1.py", line 122, in __init__
    layer_norm=s4_layernorm)
  File "/home/sunsw/sunshiwen/work/SSSD-ECG/src/sssd/models/S4Model.py", line 1190, in __init__
    bidirectional=bidirectional)
  File "/home/sunsw/sunshiwen/work/SSSD-ECG/src/sssd/models/S4Model.py", line 1074, in __init__
    self.kernel = HippoSSKernel(self.h, N=self.n, L=l_max, channels=channels, verbose=verbose, **kernelargs)
  File "/home/sunsw/sunshiwen/work/SSSD-ECG/src/sssd/models/S4Model.py", line 975, in __init__
    w, p, B, = nplr(measure, self.N, rank, dtype=dtype)
  File "/home/sunsw/sunshiwen/work/SSSD-ECG/src/sssd/models/S4Model.py", line 441, in nplr
    V_inv = V.conj().transpose(-1, -2)
IndexError: Dimension out of range (expected to be in range of [-1, 0], but got -2)
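The version quoted in the question, 3.7.10, reads more like a Python interpreter version than a PyTorch release, so it may be worth reporting both separately. The following is a minimal sketch of one way to do that; the function name `environment_versions` is made up for illustration and is not part of SSSD or S4.

```python
import platform


def environment_versions():
    """Return the Python version and, if importable, the PyTorch version."""
    versions = {"python": platform.python_version()}
    try:
        import torch  # imported lazily so the script also runs without PyTorch
        versions["torch"] = torch.__version__
    except ImportError:
        versions["torch"] = None  # PyTorch is not installed in this environment
    return versions


if __name__ == "__main__":
    for name, version in environment_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Including both values in a bug report makes it easier to tell an interpreter mismatch from a library mismatch.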

juanlopezcode commented 2 weeks ago

Hi. First, it seems you are in the wrong repository: you are working with SSSD-ECG, not SSSD. Second, it seems you have altered the network, since you now call mySSD_ECG1; if that is the case, I'm unable to assist here. In any case, the problem appears to come from V in the S4 model; tracing what V is and where it is created might help.
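The range [-1, 0] in the error message indicates that V had only one dimension when transpose(-1, -2) was called on it. One way to follow up on that is to inspect V just before the failing line. The sketch below assumes only that the object exposes `.shape` and `.dtype`, as PyTorch tensors and NumPy arrays do; `describe` and `require_matrix` are hypothetical helper names, not functions from S4Model.py.

```python
def describe(name, value):
    """Return a one-line summary of an array-like's type, shape and dtype."""
    shape = getattr(value, "shape", None)
    dtype = getattr(value, "dtype", None)
    shape_txt = tuple(shape) if shape is not None else "n/a"
    return f"{name}: type={type(value).__name__} shape={shape_txt} dtype={dtype}"


def require_matrix(name, value):
    """Return value unchanged, or raise if it has fewer than two dimensions."""
    shape = getattr(value, "shape", None)
    if shape is None or len(shape) < 2:
        raise ValueError(f"{describe(name, value)}; expected at least 2 dimensions")
    return value
```

Calling something like `print(describe("V", V))` immediately before line 441 would show whether V already has the wrong shape when it is created, or loses a dimension later.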

Ssw2001 commented 2 weeks ago

Thank you very much for your answer; it was a great help. After switching to a newer version of PyTorch, I was able to get it running. However, I would like to confirm one thing: after I ran python setup.py install with /root/SSSD/src/extensions/cauchy/setup.py, a file named cauchy_mult.py appeared with the following content:

def __bootstrap__():
    global __bootstrap__, __loader__, __file__
    import sys, pkg_resources, importlib.util
    __file__ = pkg_resources.resource_filename(__name__, 'cauchy_mult.cpython-38-x86_64-linux-gnu.so')
    __loader__ = None; del __bootstrap__, __loader__
    spec = importlib.util.spec_from_file_location(__name__,__file__)
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
__bootstrap__()

Does this mean the installation was successful? After placing this file in /root/SSSD/src/extensions/cauchy, it worked for me. Is my approach reasonable?
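A quick way to check whether the interpreter can locate the compiled module, independent of where files were copied, is to ask the import system directly. This is only a sketch: the module name `cauchy_mult` is taken from the stub's file name above, and `is_importable` is a hypothetical helper, not something shipped with the repository.

```python
import importlib.util


def is_importable(module_name):
    """Return True if module_name can be located on the current sys.path."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # find_spec raises when a parent package of a dotted name is missing
        return False


if __name__ == "__main__":
    # "cauchy_mult" is assumed from the generated stub's file name
    print("cauchy_mult importable:", is_importable("cauchy_mult"))
```

Note that `find_spec` only locates the module and does not load it. Whether the training script actually uses the extension is better judged by whether the "CUDA extension for cauchy multiplication not found" warning still appears at startup.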