tencent-ailab / Frequency_Aug_VAE_MoESR

Latent-based SR using MoE and frequency augmented VAE decoder
Apache License 2.0
150 stars 4 forks

Running sh inf_moe_8x.sh fails with ModuleNotFoundError: No module named 'basicsr.metrics' #4

Closed 1404971870 closed 1 year ago

1404971870 commented 1 year ago

The basicsr package is already installed, but running sh inf_moe_8x.sh fails with ModuleNotFoundError: No module named 'basicsr.metrics'. I have attached the full error output and the list of packages in my environment.

The full error output is as follows:

Traceback (most recent call last):
  File "/data/sunys/program/Frequency_Aug_VAE_MoESR/Frequency_Aug_VAE_MoESR/sr_8x_inf/sr_val_ddim_moe.py", line 18, in <module>
    from basicsr.utils.imresize import imresize
  File "/data/sunys/program/Frequency_Aug_VAE_MoESR/Frequency_Aug_VAE_MoESR/sr_8x_inf/basicsr/__init__.py", line 6, in <module>
    from .metrics import *
ModuleNotFoundError: No module named 'basicsr.metrics'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 25366) of binary: /data/sunys/miniconda3/envs/moe_sr/bin/python
Traceback (most recent call last):
  File "/data/sunys/miniconda3/envs/moe_sr/bin/torchrun", line 33, in <module>
    sys.exit(load_entry_point('torch==1.12.1', 'console_scripts', 'torchrun')())
  File "/data/sunys/miniconda3/envs/moe_sr/lib/python3.9/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 345, in wrapper
    return f(*args, **kwargs)
  File "/data/sunys/miniconda3/envs/moe_sr/lib/python3.9/site-packages/torch/distributed/run.py", line 761, in main
    run(args)
  File "/data/sunys/miniconda3/envs/moe_sr/lib/python3.9/site-packages/torch/distributed/run.py", line 752, in run
    elastic_launch(
  File "/data/sunys/miniconda3/envs/moe_sr/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/data/sunys/miniconda3/envs/moe_sr/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 245, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

sr_val_ddim_moe.py FAILED

Failures:

------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time       : 2023-10-31_10:46:15
  host       : 4cd3bee4acc7
  rank       : 0 (local_rank: 0)
  exitcode   : 1 (pid: 25366)
  error_file :
  traceback  : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

The package list is as follows:

absl-py 2.0.0
accelerate 0.23.0
addict 2.4.0
aiohttp 3.8.6
aiosignal 1.3.1
albumentations 1.3.0
altair 5.1.2
antlr4-python3-runtime 4.8
appdirs 1.4.4
async-timeout 4.0.3
attrs 23.1.0
basicsr 1.4.2
blinker 1.6.3
braceexpand 0.1.7
Brotli 1.0.9
cachetools 5.3.2
certifi 2023.7.22
cffi 1.15.1
charset-normalizer 2.0.4
click 8.1.7
contourpy 1.1.1
cryptography 41.0.3
cycler 0.12.1
diffusers 0.21.4
docker-pycreds 0.4.0
einops 0.3.0
filelock 3.13.0
fonttools 4.43.1
frozenlist 1.4.0
fsspec 2023.10.0
ftfy 6.1.1
future 0.18.3
gitdb 4.0.11
GitPython 3.1.40
google-auth 2.23.3
google-auth-oauthlib 1.0.0
grpcio 1.59.0
huggingface-hub 0.17.3
idna 3.4
imageio 2.31.6
imageio-ffmpeg 0.4.2
importlib-metadata 6.8.0
importlib-resources 6.1.0
install 1.3.5
invisible-watermark 0.2.0
Jinja2 3.1.2
joblib 1.3.2
jsonschema 4.19.1
jsonschema-specifications 2023.7.1
kiwisolver 1.4.5
kornia 0.6.0
lazy-loader 0.3
lmdb 1.4.1
Markdown 3.5
markdown-it-py 3.0.0
MarkupSafe 2.1.3
matplotlib 3.8.0
mdurl 0.1.2
mkl-fft 1.3.1
mkl-random 1.2.2
mkl-service 2.4.0
multidict 6.0.4
networkx 3.2.1
numpy 1.23.1
oauthlib 3.2.2
omegaconf 2.1.1
open-clip-torch 2.0.2
opencv-python 4.6.0.66
opencv-python-headless 4.8.1.78
packaging 23.2
pandas 2.1.2
pathtools 0.1.2
Pillow 10.0.1
pip 20.3.3
platformdirs 3.11.0
protobuf 3.20.3
psutil 5.9.6
pyarrow 13.0.0
pyasn1 0.5.0
pyasn1-modules 0.3.0
pycparser 2.21
pydeck 0.8.1b0
pyDeprecate 0.3.1
Pygments 2.16.1
Pympler 1.0.1
pyOpenSSL 23.2.0
pyparsing 3.1.1
PySocks 1.7.1
python-dateutil 2.8.2
pytorch-lightning 1.4.2
pytz 2023.3.post1
PyWavelets 1.4.1
PyYAML 6.0.1
qudida 0.0.4
referencing 0.30.2
regex 2023.10.3
requests 2.31.0
requests-oauthlib 1.3.1
rich 13.6.0
rpds-py 0.10.6
rsa 4.9
safetensors 0.4.0
scikit-image 0.22.0
scikit-learn 1.3.2
scipy 1.11.3
semver 3.0.2
sentry-sdk 1.32.0
setproctitle 1.3.3
setuptools 68.0.0
six 1.16.0
smmap 5.0.1
streamlit 1.12.1
streamlit-drawable-canvas 0.8.0
taming-transformers 0.0.1 /data/sunys/program/Frequency_Aug_VAE_MoESR/Frequency_Aug_VAE_MoESR/src/taming-transformers
tb-nightly 2.14.0a20230808
tensorboard 2.15.0
tensorboard-data-server 0.7.2
test-tube 0.7.5
threadpoolctl 3.2.0
tifffile 2023.9.26
tokenizers 0.14.1
toml 0.10.2
tomli 2.0.1
toolz 0.12.0
torch 1.12.1
torchmetrics 0.6.0
torchvision 0.13.1
tornado 6.3.3
tqdm 4.66.1
transformers 4.34.1
triton 2.1.0
typing-extensions 4.7.1
tzdata 2023.3
tzlocal 5.2
urllib3 1.26.18
validators 0.22.0
wandb 0.15.12
watchdog 3.0.0
wcwidth 0.2.8
webdataset 0.2.5
werkzeug 3.0.1
wheel 0.41.2
yapf 0.40.2
yarl 1.9.2
zipp 3.17.0

1404971870 commented 1 year ago

I have solved it. The cause was that basicsr was missing the metrics subfolder.
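
For anyone checking the same thing: the traceback above resolves `basicsr` to the copy under `sr_8x_inf/basicsr/` next to the script, not to the pip-installed basicsr 1.4.2, which would explain why installing the package did not help. A minimal sketch for confirming which subfolders are absent from that local copy is below. The function name `missing_subpackages` and the default list of expected names are illustrative assumptions, not part of the repository.

```python
from pathlib import Path


def missing_subpackages(pkg_dir, expected=("metrics",)):
    """Return the names in `expected` that are not subpackages of `pkg_dir`.

    Hypothetical helper: it only inspects the filesystem, so it still works
    when `import basicsr` itself raises ModuleNotFoundError.
    """
    pkg_dir = Path(pkg_dir)
    missing = []
    for name in expected:
        # A regular subpackage is a directory that contains __init__.py.
        if not (pkg_dir / name / "__init__.py").is_file():
            missing.append(name)
    return missing
```

Called as `missing_subpackages("sr_8x_inf/basicsr")` from the repository root, it should return a list containing `"metrics"` if the local copy has the problem described in this issue.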

HannahJHan commented 1 year ago

I have solved it. The cause was that basicsr was missing the metrics subfolder.

Hi, I ran into this problem too. Where did you get the missing metrics subfolder from?

viperyl commented 1 year ago

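I have solved it. The cause was that basicsr was missing the metrics subfolder.

Hi, I ran into this problem too. Where did you get the missing metrics subfolder from?

Go to the BasicSR repo at https://github.com/XPixelGroup/BasicSR/tree/master; the metrics folder is ./basicsr/metrics.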
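
As a rough sketch of that fix, assuming BasicSR has already been cloned locally, the folder can be copied into the project's own `basicsr` directory as follows. The helper name `copy_metrics` and both path arguments are hypothetical, and the upstream master branch may not match the version this project's local copy was taken from.

```python
import shutil
from pathlib import Path


def copy_metrics(basicsr_checkout, local_pkg_dir):
    """Copy basicsr/metrics from a BasicSR checkout into a local basicsr package.

    Hypothetical helper: `basicsr_checkout` is the root of a cloned BasicSR
    repository, `local_pkg_dir` is the project's own basicsr directory
    (sr_8x_inf/basicsr in the traceback from this issue).
    """
    src = Path(basicsr_checkout) / "basicsr" / "metrics"
    dst = Path(local_pkg_dir) / "metrics"
    if not src.is_dir():
        raise FileNotFoundError(f"no metrics folder found at {src}")
    # dirs_exist_ok requires Python 3.8+; existing files in dst are overwritten.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

For example, `copy_metrics("/path/to/BasicSR", "sr_8x_inf/basicsr")`, where `/path/to/BasicSR` is a placeholder for wherever the clone lives.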