Open alice890308 opened 1 year ago
Mei,
We will look into this and get back to you shortly.
Chris
From: Mei @.> Sent: Saturday, December 17, 2022 10:59 AM To: utcs-scea/altis @.> Cc: Subscribed @.***> Subject: [utcs-scea/altis] Build Failed on Ubuntu20.04 (Issue #20)
Hi, I'm trying to build Altis on my server and in a Docker container, but both hit the same errors. The descriptions below show only part of the error messages; the complete output is at https://gist.github.com/alice890308/e4e6172f7d5c5f1e1b88d97ed1ed35e4
Environment
Ubuntu: 20.04
CUDA version: 11.8
Docker image: nvidia/cuda:11.8.0-devel-ubuntu20.04
cmake: 3.16
GPU: NVIDIA A100, SM number: 80
Error Messages
First I tried to run ./setup.sh and saw the following result
Then I tried to understand the building process, so I checked here https://github.com/utcs-scea/altis/wiki/Build and applied these steps manually. When running cmake -DCMAKE_CUDA_ARCHITECTURES=80 it shows the following message. But I'm not sure if this is important.
The fatal error occurred when running the last make command.
Reproduce Steps
Run nvidia docker image
sudo nvidia-docker run -it nvidia/cuda:11.8.0-devel-ubuntu20.04 /bin/bash
Install git and cmake
apt-get update
apt-get install git cmake
Clone this repo
git clone https://github.com/utcs-scea/altis.git
Run setup.sh or follow the build steps to build Altis.
Thanks for viewing my issue. Any reply is appreciated.
@alice890308 If you set VERBOSE=1 before the cmake command, what does it show? This way we can see the exact build commands and which files make is expecting. Is it possible to get the complete make log? My guess is that some files are not built because of unspecified SM numbers.
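One way to capture the complete log is to redirect both stdout and stderr of the build into a file. The following is only a minimal sketch: the helper name run_logged and the file name make.log are illustrative and are not part of Altis.

```shell
# Run a command and save its stdout and stderr to a log file.
# The function returns the exit status of the command itself.
# Hypothetical helper, not part of the Altis repository.
run_logged() {
    log_file=$1
    shift
    "$@" > "$log_file" 2>&1
}

# Example (run from the build directory):
#   run_logged make.log make VERBOSE=1
```

The resulting make.log can then be attached to the issue or uploaded as a gist.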
@BDHU Hi! It shows the following messages.
root@23cf7bea18ba:/altis/config/cuda_device_attr_gen# make VERBOSE=1
/usr/local/cuda/bin/nvcc -ccbin g++ -I../../Common -m64 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_80,code=compute_80 -o deviceQuery.o -c deviceQuery.cpp
/usr/local/cuda/bin/nvcc -ccbin g++ -m64 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_80,code=compute_80 -o deviceQuery deviceQuery.o
mkdir -p ../../bin/x86_64/linux/release
cp deviceQuery ../../bin/x86_64/linux/release
Is this the complete make log you are looking for? Or would you like me to check anything else?
@alice890308 Apologies for the late reply. Can you go into the build directory and remove everything inside? Then execute these two commands:
cmake -DCMAKE_CUDA_ARCHITECTURES=$($SCRIPTPATH/config/get_cuda_sm.sh) ..
and
make VERBOSE=1
I've tested your setup with the exact same docker version and encountered no problem. However, I've only tested on SM61, so I suspect something has changed in the SM80 series. The commands above let us see which specific make command causes the failure.
For example, I noticed you failed to build the maxflops object file. This is the first workload built right after libAltisCommon.a is generated. In my setup, the build command is as follows (that's why we need make VERBOSE=1 to show the message):
[ 5%] Building CUDA object src/cuda/level0/maxflops/CMakeFiles/maxflopsLib.dir/MaxFlops.cu.o
cd /workspace/Desktop/altis/build/src/cuda/level0/maxflops && /usr/local/cuda/bin/nvcc -I/workspace/Desktop/altis/src/cuda/common -I/workspace/Desktop/altis/src/cuda/../common -w -gencode arch=compute_61,code=sm_61 -x cu -c /workspace/Desktop/altis/src/cuda/level0/maxflops/MaxFlops.cu -o CMakeFiles/maxflopsLib.dir/MaxFlops.cu.o
This specific line:
cd /workspace/Desktop/altis/build/src/cuda/level0/maxflops && /usr/local/cuda/bin/nvcc -I/workspace/Desktop/altis/src/cuda/common -I/workspace/Desktop/altis/src/cuda/../common -w -gencode arch=compute_61,code=sm_61 -x cu -c /workspace/Desktop/altis/src/cuda/level0/maxflops/MaxFlops.cu -o CMakeFiles/maxflopsLib.dir/MaxFlops.cu.o
is in charge of generating the MaxFlops.cu.o file. You can simply copy and rerun it to reproduce the same error without going through the whole cmake generation process.
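If the verbose output has been saved to a file, the compile command for one source file can be pulled out with grep instead of scrolling through the log. This is a sketch under the assumption that the log came from make VERBOSE=1; the helper name find_compile_cmd and the log file name are illustrative.

```shell
# Print every logged nvcc command line that mentions the given source file.
# Assumes the log was produced by `make VERBOSE=1` and saved to a file.
# Hypothetical helper, not part of the Altis repository.
find_compile_cmd() {
    log_file=$1
    source_name=$2
    grep -F -- "$source_name" "$log_file" | grep -F -- "nvcc"
}

# Example:
#   find_compile_cmd make.log MaxFlops.cu
```

The printed line can be pasted back into the shell to rerun just that one compilation step.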
So in your setup, it might look like this:
[ 5%] Building CUDA object src/cuda/level0/maxflops/CMakeFiles/maxflopsLib.dir/MaxFlops.cu.o
cd /workspace/Desktop/altis/build/src/cuda/level0/maxflops && /usr/local/cuda/bin/nvcc -I/workspace/Desktop/altis/src/cuda/common -I/workspace/Desktop/altis/src/cuda/../common -w -gencode arch=compute_80,code=sm_80 -x cu -c /workspace/Desktop/altis/src/cuda/level0/maxflops/MaxFlops.cu -o CMakeFiles/maxflopsLib.dir/MaxFlops.cu.o
I would first watch for any missing flags or parameters. It's very likely CMake failed to generate some commands.
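A quick way to check one such flag is to test whether a logged command line carries the expected -gencode pair for the target SM. The sketch below only does a string match on the flag format seen in the logs above; the helper name has_gencode is illustrative.

```shell
# Succeed if the command line contains a -gencode flag for the given SM number,
# in the form used in the logs above (arch=compute_NN,code=sm_NN).
# Hypothetical helper, not part of the Altis repository.
has_gencode() {
    cmd_line=$1
    sm=$2
    case "$cmd_line" in
        *"-gencode arch=compute_${sm},code=sm_${sm}"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   has_gencode "$(grep -F MaxFlops.cu make.log | grep -F nvcc)" 80 \
#       || echo "no sm_80 gencode flag on the MaxFlops.cu compile line"
```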