sooftware / kospeech

Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.
https://sooftware.github.io/kospeech/
Apache License 2.0

Runtime error occurs #171

Open yoloo795 opened 1 year ago

yoloo795 commented 1 year ago

I am getting a runtime error.

In torch's serialization module:

class _open_zipfile_writer_file(_opener):
    def __init__(self, name) -> None:
        super(_open_zipfile_writer_file, self).__init__(torch._C.PyTorchFileWriter(str(name)))

When this code runs, it raises RuntimeError: Parent directory $path does not exist. A folder does exist at $path, so I don't understand why the error occurs.

If anyone knows how to fix this, I would be very grateful.

[Original]
Error executing job with overrides: ['model=ds2', 'train=ds2_train', 'train.dataset_path=C:\Users\shwns\Desktop\Data']
Traceback (most recent call last):
  File "./bin/main.py", line 163, in main
    last_model_checkpoint = train(config)
  File "./bin/main.py", line 122, in train
    Checkpoint(model, self.optimizer, self.trainset_list, self.validset, epoch).save()
  File "C:\Users\shwns\Desktop\xxx\xxx\xxx\kospeech-latest\kospeech-latest\bin\kospeech\checkpoint\checkpoint.py", line 81, in save
    torch.save(trainer_states, os.path.join(os.getcwd(), self.TRAINER_STATE_NAME))
  File "C:\Users\shwns\anaconda3\envs\vowing_1125\lib\site-packages\torch\serialization.py", line 424, in save
    with _open_zipfile_writer(f) as opened_zipfile:
  File "C:\Users\shwns\anaconda3\envs\vowing_1125\lib\site-packages\torch\serialization.py", line 311, in _open_zipfile_writer
    return container(name_or_buffer)  # C:\Users\shwns\Desktop\xxx\xxx\xxx\kospeech-latest\kospeech-latest\outputs\2022-12-14\12-20-39\trainer_states.pt
  File "C:\Users\shwns\anaconda3\envs\vowing_1125\lib\site-packages\torch\serialization.py", line 289, in __init__
    super(_open_zipfile_writer_file, self).__init__(torch._C.PyTorchFileWriter(str(name)))
RuntimeError: Parent directory C:\Users\shwns\Desktop\xxx\xxx\xxx\kospeech-latest\kospeech-latest\outputs\2022-12-14\13-32-47 does not exist.

XEL-Maker commented 1 year ago

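Did you by any chance set the resume option to True? I got this same error (File "./bin/main.py", line 163, in main: last_model_checkpoint = train(config)) when I set resume to True to try resuming training, did not know which command-line arguments it needed, and simply ran it as it was.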
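
The traceback shows torch.save being handed a path built from os.getcwd(), and the error complains that the parent directory of that path is missing. One possible workaround, not confirmed anywhere in this thread, is to create the parent directory just before saving. The sketch below uses only the standard library; ensure_parent_dir is a hypothetical helper name, not part of kospeech or PyTorch, and the directory names in the demo are placeholders modeled on the outputs folder in the traceback.

```python
import os
import tempfile
from pathlib import Path


def ensure_parent_dir(file_path):
    """Create the parent directory of file_path if it is missing, then return the path.

    Hypothetical helper for illustration; not part of kospeech or PyTorch.
    """
    path = Path(file_path)
    # parents=True builds every missing level; exist_ok=True makes repeat calls harmless.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path


# Demo: a throwaway directory stands in for the project's outputs/<date>/<time> folder.
base = tempfile.mkdtemp()
target = ensure_parent_dir(
    os.path.join(base, "outputs", "2022-12-14", "13-32-47", "trainer_states.pt")
)
print(target.parent.is_dir())  # prints True

# With PyTorch installed, the checkpoint would then be written with:
#   torch.save(trainer_states, str(target))
```

This only guarantees that the directory exists at save time. It does not explain why the working directory changed between runs, so if the resume setting mentioned above is the real cause, that configuration still needs to be checked.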