mims-harvard / TFC-pretraining

Self-supervised contrastive learning for time series via time-frequency consistency
https://zitniklab.hms.harvard.edu/projects/TF-C/
MIT License

[BUG] Error preprocessing files #27

Open otavioon opened 1 year ago

otavioon commented 1 year ago

Hello.

DESCRIBE THE BUG

I'm trying to reproduce the results of the paper. I downloaded the datasets using the download_datasets.sh script and preprocessed them using the process_datasets.sh script. However, I encountered two errors during the preprocessing phase.

  1. Below is the error output produced by both the CLOCS.py and Mixing-up.py Python scripts, which are called from process_datasets.sh:
Traceback (most recent call last):
  File "/workspaces/hiaac-m4/TFC-pretraining/data_processing/Mixing-up.py", line 8, in <module>
    train_dict = torch.load(os.path.join('datasets', dataset_name, 'train.pt'))
  File "/home/vscode/.local/lib/python3.10/site-packages/torch/serialization.py", line 988, in load
    with _open_file_like(f, 'rb') as opened_file:
  File "/home/vscode/.local/lib/python3.10/site-packages/torch/serialization.py", line 437, in _open_file_like
    return _open_file(name_or_buffer, mode)
  File "/home/vscode/.local/lib/python3.10/site-packages/torch/serialization.py", line 418, in __init__
    super().__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: 'datasets/ecg/train.pt'
  2. Below is the error output produced when executing SimCLR.py from the process_datasets.sh script:
Traceback (most recent call last):
  File "/workspaces/hiaac-m4/TFC-pretraining/data_processing/SimCLR.py", line 79, in <module>
    scatter_numpy(train_y, 1, np.expand_dims(train_dict['labels'].numpy().astype(int), axis=1), 1)
  File "/workspaces/hiaac-m4/TFC-pretraining/data_processing/SimCLR.py", line 41, in scatter_numpy
    idx = [[*np.indices(idx_xsection_shape).reshape(index.ndim - 1, -1),
  File "/workspaces/hiaac-m4/TFC-pretraining/data_processing/SimCLR.py", line 42, in <listcomp>
    index[make_slice(index, dim, i)].reshape(1, -1)[0]] for i in range(index.shape[dim])]
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
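
A possible cause of the SimCLR.py error, not confirmed against the repository's code: recent NumPy versions may reject a Python list of slices used as a multidimensional index, and raise this same IndexError. The sketch below assumes that make_slice builds such a list; the function body shown here is a hypothetical stand-in, not the actual code in SimCLR.py. Returning a tuple instead of a list avoids the error. Since the failing call looks like a one-hot encoding of the labels, a plain NumPy alternative (one_hot, also a hypothetical name) is included as well.

```python
import numpy as np


def make_slice(arr, dim, i):
    """Hypothetical stand-in for the helper in SimCLR.py.

    Builds an index that selects position i along axis dim and
    keeps every other axis whole.
    """
    slc = [slice(None)] * arr.ndim
    slc[dim] = i
    # Return a tuple: a list of slices is not a valid
    # multidimensional index in recent NumPy versions.
    return tuple(slc)


def one_hot(labels, num_classes):
    """Assumed equivalent of the scatter call: one-hot encode integer labels."""
    labels = np.asarray(labels).astype(int).reshape(-1)
    out = np.zeros((labels.shape[0], num_classes))
    # Set a single 1 per row, at the column given by the label.
    out[np.arange(labels.shape[0]), labels] = 1
    return out


index = np.array([[0], [2], [1]])
column = index[make_slice(index, 1, 0)]  # same as index[:, 0]
encoded = one_hot([0, 2, 1], num_classes=3)
```

If make_slice in SimCLR.py does return a list, wrapping its result in tuple(...) at the call site should have the same effect.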

NOTE: The first error occurs on case-sensitive file systems, because the datasets are downloaded with upper-case names while the scripts refer to them with lower-case names.
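
As a possible workaround for the case mismatch described in the note, the dataset directory can be looked up without regard to case before loading. This is a minimal sketch, not code from the repository: resolve_dataset_dir is a hypothetical helper, and it assumes the datasets sit in subdirectories of a single root such as 'datasets'.

```python
import os


def resolve_dataset_dir(root, dataset_name):
    """Return the subdirectory of root whose name matches dataset_name, ignoring case.

    Hypothetical helper; raises FileNotFoundError if nothing matches.
    """
    wanted = dataset_name.lower()
    for entry in sorted(os.listdir(root)):
        candidate = os.path.join(root, entry)
        if entry.lower() == wanted and os.path.isdir(candidate):
            return candidate
    raise FileNotFoundError(
        f"no dataset directory matching {dataset_name!r} under {root!r}"
    )
```

With this helper, the load in Mixing-up.py could read the file from os.path.join(resolve_dataset_dir('datasets', dataset_name), 'train.pt'). Renaming the downloaded directories to lower case would be a simpler alternative.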

SYSTEM SPECIFICATION