DeepPSP / torch_ecg

Deep learning ECG models implemented using PyTorch
MIT License

AssertionError: db_dir must be specified #9

Open usanzhu opened 1 year ago

usanzhu commented 1 year ago

Hello, I ran into the error above when trying to run the cpsc2021 dataset. How can I solve it?

wenh06 commented 1 year ago

Was the error raised when using torch_ecg.databases.CPSC2021 or torch_ecg.databases.datasets.CPSC2021Dataset?

Currently, passing an empty db_dir should raise a warning rather than an error. For example:

from torch_ecg.databases import CPSC2021

dr = CPSC2021()

TorchECG-CPSC2021 - INFO - Please wait patiently to let the reader aggregate statistics on the whole dataset...
TorchECG-CPSC2021 - INFO - Done in 0.00651 seconds!
TorchECG-CPSC2021 - INFO - Please wait several minutes patiently to let the reader list records for each diagnosis...
TorchECG-CPSC2021 - INFO - Done in 0.00097 seconds!
/home/wenh06/Jupyter/wenhao/workspace/torch_ecg/torch_ecg/databases/base.py:161: RuntimeWarning: db_dir is not specified, using default /home/wenh06/.cache/torch_ecg/data/cpsc2021 as the storage path
  warnings.warn(
/home/wenh06/Jupyter/wenhao/workspace/torch_ecg/torch_ecg/databases/base.py:169: RuntimeWarning: /home/wenh06/.cache/torch_ecg/data/cpsc2021 does not exist. It is now created. Please check if it is set correctly. Or if you may want to download the database into this folder, please use the download() method.
  warnings.warn(

wenh06 commented 1 year ago

I might have found the problem. For torch_ecg.databases.datasets.CPSC2021Dataset, one has to set db_dir in the config instance passed to the dataset; db_dir is None by default.
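To illustrate the failure mode and the workaround, here is a minimal sketch that does not import torch_ecg. `DefaultTrainCfg` and `make_dataset` are hypothetical stand-ins for the library's default config and dataset constructor, and the path is a placeholder. The only behaviour taken from this thread is that db_dir defaults to None and that an AssertionError is raised when it is left unset.

```python
from copy import deepcopy
from pathlib import Path
from types import SimpleNamespace

# Hypothetical stand-in for a default training config whose db_dir is None.
DefaultTrainCfg = SimpleNamespace(db_dir=None)


def make_dataset(config):
    """Hypothetical stand-in for a dataset constructor that requires db_dir."""
    assert config.db_dir is not None, "db_dir must be specified"
    return {"db_dir": Path(config.db_dir)}


# Work on a copy so the shared default config is not mutated.
cfg = deepcopy(DefaultTrainCfg)
cfg.db_dir = "/path/to/cpsc2021"  # placeholder path, replace with your own
ds = make_dataset(cfg)
```

With the real library, the same idea would apply to the config object handed to CPSC2021Dataset: set its db_dir before constructing the dataset.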

usanzhu commented 1 year ago

Thank you, I have solved this problem. But two more problems have come up.

The first one is in this line:

train_config.main.loss_kw = ED(gamma_pos=0, gamma_neg=1, implementation="deep-psp")

In this line, the function 'ED' is not defined.

The other one comes up when I use the cpsc2021 sample data provided in the documentation:

ValueError: a must be a sequence or an integer, not <class 'set'>

It appears when the code reaches this line:

ds_train = CPSC2021(TrainCfg, training=True, task=task, lazy=False)

which splits the dataset into training and test sets:

DEFAULTS.RNG_sample(
    afp_subjects, round(len(afp_subjects) * _test_ratio / 100)
).tolist()

I think if I use my own dataset, this problem may occur again, so I would appreciate it if you could help me out with this.

wenh06 commented 1 year ago

ED stands for easydict.EasyDict, which was previously used as the configuration class and has since been replaced with torch_ecg.cfg.CFG. I plan to replace torch_ecg.cfg.CFG with a dataclass because it still has bugs that are hard to fix, but I do not yet have a clear idea of how to do the replacement.
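For readers unfamiliar with this style of config object, the sketch below shows the behaviour the failing line relies on: keyword arguments stored in a dict that can also be read and written as attributes. `AttrDict` is a hypothetical, simplified stand-in written for illustration; it is not the implementation of easydict.EasyDict or torch_ecg.cfg.CFG.

```python
class AttrDict(dict):
    """Hypothetical minimal dict with attribute access (illustration only)."""

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails, so dict methods still work.
        try:
            return self[name]
        except KeyError as err:
            raise AttributeError(name) from err

    def __setattr__(self, name, value):
        # Store attributes as dict items so both access styles stay in sync.
        self[name] = value


# Same keyword arguments as in the failing line of the benchmark script.
loss_kw = AttrDict(gamma_pos=0, gamma_neg=1, implementation="deep-psp")
loss_kw.gamma_pos = 2  # attribute write updates the underlying dict
```

In the benchmark script itself, the fix is presumably to build loss_kw with whichever config class the installed version provides, rather than with the undefined name ED.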

The second error occurs because the first argument of DEFAULTS.RNG_sample cannot be a set. This is fixed in de53269. The bug had already been fixed in cpsc2021_dataset.py but had been left untreated in the train_hybrid_cpsc2021 benchmark study.
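If the same error shows up with a custom dataset, the general remedy is to turn the set of subject IDs into an ordered sequence before sampling. The sketch below uses the standard library's `random.Random` as a stand-in for DEFAULTS.RNG_sample, whose implementation is not shown in this thread. `split_subjects`, the seed, and the subject IDs are hypothetical.

```python
import random


def split_subjects(subjects, test_ratio, seed=42):
    """Split subject IDs into (train, test) lists.

    `test_ratio` is a percentage, as in the traceback above.
    """
    # A set has no defined order, so convert it to a sorted list first;
    # this also makes the split reproducible for a given seed.
    pool = sorted(subjects)
    rng = random.Random(seed)
    n_test = round(len(pool) * test_ratio / 100)
    test = rng.sample(pool, n_test)
    held_out = set(test)
    train = [s for s in pool if s not in held_out]
    return train, test


# Hypothetical subject IDs collected into a set, as in the failing code path.
afp_subjects = {f"subject_{i:02d}" for i in range(10)}
train_subjects, test_subjects = split_subjects(afp_subjects, test_ratio=20)
```

Sorting rather than calling list() on the set is a deliberate choice: set iteration order for strings can vary between interpreter runs, which would otherwise make the split irreproducible even with a fixed seed.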

I should make a plan to update the benchmark studies and corresponding test files.