facebookresearch / param

PArametrized Recommendation and Ai Model benchmark is a repository for development of numerous uBenchmarks as well as end to end nets for evaluation of training and inference platforms.
MIT License

ImportError: cannot import name 'ExecutionTraceObserver' from 'torch.profiler' #85

Closed tangyuelm closed 1 year ago

tangyuelm commented 1 year ago

Dear Authors,

Thank you for the benchmark. I am starting to use this tool to generate execution traces (ETs) for DNN models written in PyTorch. I installed and ran it with the following commands:

    cd train/compute/python
    nohup python3 setup.py install >setup.out 2>&1 &
    nohup python3 -m pytorch.run_benchmark -c examples/pytorch/configs/alex_net.json -d cuda --eg --cuda-l2-cache on > bench.out 2>&1 &

However, I got the following error:

    ImportError: cannot import name 'ExecutionTraceObserver' from 'torch.profiler'

I am wondering whether there is a problem with my installation or with how I run the benchmark. Would it be possible to provide me with some help? The Python and PyTorch configuration is as follows.

    (rppg-toolbox) [tangyue@v001 python]$ python3 -m torch.utils.collect_env
    Collecting environment information...

PyTorch version: 1.12.1
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A

OS: CentOS Linux release 8.2.2004 (Core) (x86_64)
GCC version: (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5)
Clang version: 9.0.1 (Red Hat 9.0.1-2.module_el8.2.0+309+0c7b6b03)
CMake version: version 3.11.4
Libc version: glibc-2.28

Python version: 3.8.13 (default, Mar 28 2022, 11:38:47) [GCC 7.5.0] (64-bit runtime)
Python platform: Linux-4.18.0-193.28.1.el8_2.x86_64-x86_64-with-glibc2.17
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: Tesla V100-SXM2-32GB
Nvidia driver version: 525.60.13
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.22.0
[pip3] pytorch3d==0.7.1
[pip3] torch==1.12.1
[pip3] torchaudio==0.12.1
[pip3] torchinfo==1.7.1
[pip3] torchsampler==0.1.2
[pip3] torchsummary==1.5.1
[pip3] torchvision==0.13.1
[conda] blas 1.0 mkl
[conda] cudatoolkit 10.2.89 hfd86e86_1
[conda] ffmpeg 4.3 hf484d3e_0 pytorch
[conda] mkl 2021.4.0 h06a4308_640
[conda] mkl-service 2.4.0 py38h7f8727e_0
[conda] mkl_fft 1.3.1 py38hd3c417c_0
[conda] mkl_random 1.2.2 py38h51133e4_0
[conda] numpy 1.22.0 pypi_0 pypi
[conda] pytorch 1.12.1 py3.8_cuda10.2_cudnn7.6.5_0 pytorch
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] pytorch3d 0.7.1 py38_cu102_pyt1121 pytorch3d
[conda] torchaudio 0.12.1 py38_cu102 pytorch
[conda] torchinfo 1.7.1 pypi_0 pypi
[conda] torchsampler 0.1.2 pypi_0 pypi
[conda] torchsummary 1.5.1 pypi_0 pypi
[conda] torchvision 0.13.1 py38_cu102 pytorch

[3]+ Exit 1 nohup python3 -m pytorch.run_benchmark -c examples/pytorch/configs/alex_net.json -d cuda --eg --cuda-l2-cache on > bench.out 2>&1

Thank you.

louisfeng commented 1 year ago

Hi @tangyuelm, I think the issue is due to the refactor (rename) of the observer in a recent PR: https://github.com/pytorch/pytorch/pull/99694

I'd suggest using a nightly build of PyTorch:

    conda install pytorch cudatoolkit=11.3 -c pytorch-nightly

PyTorch 1.12.1 predates this change, so it won't have the renamed observer. We could do a better job with a release process that stays in sync with PyTorch versions. Please let me know if you continue to have this issue with the latest PyTorch version.
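Before re-running the benchmark, it may help to check which observer name, if any, the installed PyTorch exposes. The following is a minimal sketch, not part of PARAM: the helper `first_available` is hypothetical, and `ExecutionGraphObserver` is only an assumed pre-rename name (the thread does not state the old name), so adjust the candidate list as needed.

```python
import importlib
from typing import Optional, Sequence


def first_available(module_name: str, candidates: Sequence[str]) -> Optional[str]:
    """Return the first name in `candidates` that `module_name` exposes, or None.

    Hypothetical helper for illustration; it is not part of PARAM or PyTorch.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing (for example, PyTorch is not installed).
        return None
    for name in candidates:
        if hasattr(module, name):
            return name
    return None


if __name__ == "__main__":
    # "ExecutionTraceObserver" is the name from the error message in this issue.
    # "ExecutionGraphObserver" is an ASSUMED older name; change it if it is wrong.
    found = first_available(
        "torch.profiler",
        ["ExecutionTraceObserver", "ExecutionGraphObserver"],
    )
    if found is None:
        print("torch.profiler exposes no matching observer; a newer PyTorch build may be needed.")
    else:
        print(f"torch.profiler exposes {found}")
```

If the check reports no matching observer, the installed PyTorch most likely predates the change described above, and upgrading as suggested is the likely fix.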