Open iotamudelta opened 6 years ago
@iotamudelta
It seems that simply following the build command doesn't work, even after modifying the paths.
$ ./b.sh
In file included from /home/whchung/pytorch/aten/src/THC/THCTensorMode.cu:1:
/home/whchung/pytorch/aten/src/THC/THC.h:4:10: fatal error: 'THCGeneral.h' file not found
#include "THCGeneral.h"
^~~~~~~~~~~~~~
1 error generated.
Should python3 tools/amd_build/build_pytorch_amd.py be executed first?
Here's what I see:
$ python3 tools/amd_build/build_pytorch_amd.py
error: patch failed: aten/src/THC/generic/THCTensorRandom.cu:504
error: aten/src/THC/generic/THCTensorRandom.cu: patch does not apply
error: patch failed: aten/src/THCUNN/FeatureLPPooling.cu:193
error: aten/src/THCUNN/FeatureLPPooling.cu: patch does not apply
error: patch failed: aten/src/THC/THCDeviceUtils.cuh:52
error: aten/src/THC/THCDeviceUtils.cuh: patch does not apply
error: patch failed: torch/cuda/__init__.py:123
error: torch/cuda/__init__.py: patch does not apply
Traceback (most recent call last):
  File "/home/whchung/pytorch/tools/amd_build/pyHIPIFY/hipify-python.py", line 36, in <module>
    from enum import Enum
ImportError: No module named enum
Since pytorch has even more dependencies, perhaps it'd be a good idea to publish a docker container so the issue can be reproduced more easily?
As you know, Michael is working on a docker container. Let's wait until that is available.
In general, it is hard to come up with a single command that works out of order without first compiling in order once; after that, the single command works to reproduce. E.g., THGeneral.h is created in build/caffe2.
I believe that to solve the ImportError: No module named enum you need to pip install enum34.
I actually installed enum34 with both pip and pip3, but I'm still seeing the error. Since the issue is annoying but not blocking, I'll wait for a docker container for now.
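One detail in the traceback may be worth checking. The message ImportError: No module named enum has no quotes around the module name, which is the Python 2 wording (Python 3 prints No module named 'enum'), and enum has been part of the standard library since Python 3.4. A possible explanation, not confirmed anywhere in this thread, is that hipify-python.py is being launched by a different interpreter than the python3 used on the command line. The sketch below is a hypothetical helper (the name describe_enum_support is made up for illustration) that reports which interpreter is running and whether it can import enum; it is written to run under both Python 2 and Python 3.

```python
from __future__ import print_function

import sys


def describe_enum_support():
    """Return (interpreter path, "major.minor" version, whether enum imports)."""
    try:
        # Standard library on Python 3.4+; provided by enum34 on Python 2.
        import enum  # noqa: F401
        has_enum = True
    except ImportError:
        has_enum = False
    version = "%d.%d" % sys.version_info[:2]
    return sys.executable, version, has_enum


if __name__ == "__main__":
    exe, version, has_enum = describe_enum_support()
    print("interpreter:", exe)
    print("version:", version)
    print("enum importable:", has_enum)
```

Running this file with each interpreter on the machine (for example python, python2, python3) shows which of them lacks enum, and therefore which pip the enum34 install would have to target.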
Docker Image: @whchung docker image here. Simply run python setup.py install.
The excessive memory usage is due to the temporary storage of the bitcode for the kernels inside the /tmp folder hc-kernel-assemble.
Here's an example of the file create / delete events that are going on in the /tmp directory.
For full reproduction: check out pytorch, run python3 tools/amd_build/build_pytorch_amd.py, and build pytorch.
To compile only the file that is problematic (on Ubuntu 18.04 and with my own paths):
Observed behavior: compiling the file takes more than 10 GB of memory.
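To put a number on the observed behavior, one option is to wrap the compile command and read the peak resident set size afterwards. The sketch below is only an illustration under a few assumptions: it relies on the standard resource module (Unix only), the helper name peak_child_rss is made up, and the command to measure is whatever you pass on the command line, since the actual compile command is not reproduced here. Note that RUSAGE_CHILDREN reports the largest single waited-for descendant process rather than a sum over all of them, and that ru_maxrss is in kilobytes on Linux.

```python
import resource
import subprocess
import sys


def peak_child_rss(cmd):
    """Run cmd, wait for it, and return (exit code, peak child ru_maxrss).

    ru_maxrss is reported in kilobytes on Linux and covers the largest
    waited-for descendant of this process, not the sum of all of them.
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss


if __name__ == "__main__":
    if len(sys.argv) > 1:
        # Everything after the script name is treated as the command to run.
        code, peak = peak_child_rss(sys.argv[1:])
        print("exit code:", code)
        print("peak child RSS (ru_maxrss units):", peak)
    else:
        print("usage: python3 this_script.py <command> [args...]")
```

Passing the single-file compile command as the arguments should give a figure that can be compared across compiler versions or machines.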