Open pbchekin opened 9 months ago
The problem with this benchmark is that it unconditionally imports torchrec, which in turn unconditionally imports fbgemm. Both libraries seem to target only CUDA (especially fbgemm) and aren't expected to work on any other GPUs.
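One way to avoid the hard dependency would be to guard the import so that the benchmark is skipped on backends where torchrec or fbgemm cannot be loaded. A minimal sketch, assuming skipping is acceptable here; `optional_import` and `HAS_TORCHREC` are hypothetical names, not part of the benchmark suite or of torchrec:

```python
import importlib


def optional_import(name):
    """Return the imported module, or None if importing it fails.

    Hypothetical helper. It catches Exception rather than only ImportError
    because a broken native extension can also surface as OSError or
    AttributeError at import time.
    """
    try:
        return importlib.import_module(name)
    except Exception as exc:
        print(f"skipping {name}: {type(exc).__name__}: {exc}")
        return None


# The benchmark could check this flag and skip itself when it is False.
torchrec = optional_import("torchrec")
HAS_TORCHREC = torchrec is not None
```

With a guard like this, the rest of the suite still runs on non-CUDA devices and only the torchrec-dependent benchmark is reported as skipped.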
The torchrec README suggests installing fbgemm_gpu for CPU using the command pip install fbgemm-gpu --index-url https://download.pytorch.org/whl/nightly/cpu
but with this version I am again getting AttributeError: '_OpNamespace' 'fbgemm' object has no attribute 'jagged_2d_to_dense'.
The problem is caused by an incompatibility between the fbgemm nightly build and our PyTorch version. When the native code is loaded I see the error /home/jovyan/.conda/envs/triton-no-conda-3.10-stonepia/lib/python3.10/site-packages/fbgemm_gpu/fbgemm_gpu_py.so: undefined symbol: _ZNK5torch8autograd4Node4nameEv
so the native library fails to load and there are no native function definitions.
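The mangled name in that message demangles to `torch::autograd::Node::name() const`, which means the fbgemm binary expects a symbol that the installed PyTorch does not export. The loader error can be reproduced directly by opening the shared object with `ctypes`. This is only a sketch: `native_load_error` is a hypothetical helper, the path is a placeholder, and in a real session you would likely `import torch` first so that its libraries are already loaded.

```python
import ctypes


def native_load_error(path):
    """Try to dlopen a shared library.

    Returns the loader's error message as a string, or None if the
    library loaded successfully.
    """
    try:
        ctypes.CDLL(path)
    except OSError as exc:
        return str(exc)
    return None


# Placeholder path: substitute the fbgemm_gpu_py.so from your own environment.
err = native_load_error("/path/to/site-packages/fbgemm_gpu/fbgemm_gpu_py.so")
print(err or "library loaded")
```

If the message mentions an undefined symbol, the missing `fbgemm` operators are a consequence of the failed load rather than a separate problem.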
OK, it is possible to make this benchmark work, but it takes considerable effort:
mkdir build
cd build
cmake -DUSE_SANITIZER=address -DFBGEMM_LIBRARY_TYPE=shared -DPYTHON_EXECUTABLE="$(which python3)" -DFBGEMM_BUILD_DOCS=OFF -DFBGEMM_BUILD_BENCHMARKS=OFF -DCMAKE_INSTALL_PREFIX="${CONDA_PREFIX}" ..
make -j
make install
cd ../fbgemm_gpu
export package_name=fbgemm_gpu_cpu
export python_tag=py310
export ARCH=$(uname -m)
export python_plat_name="manylinux2014_${ARCH}"
python setup.py bdist_wheel --package_variant=cpu --package_name="${package_name}" --python-tag="${python_tag}" --plat-name="${python_plat_name}"
python setup.py install --package_variant=cpu
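The `python_tag` and `python_plat_name` values above are hard-coded for Python 3.10. As a sketch, they could instead be derived from the interpreter that will run the build; `wheel_tags` is a hypothetical helper, and it assumes the `manylinux2014` prefix used above is still the right one for your platform:

```python
import platform
import sys


def wheel_tags():
    """Derive the python tag and platform name from the running interpreter."""
    python_tag = f"py{sys.version_info.major}{sys.version_info.minor}"
    # platform.machine() reports the same architecture string as `uname -m` on Linux.
    plat_name = f"manylinux2014_{platform.machine()}"
    return python_tag, plat_name


python_tag, plat_name = wheel_tags()
print(f"export python_tag={python_tag}")
print(f'export python_plat_name="{plat_name}"')
```

The printed lines can be pasted in place of the corresponding exports, which keeps the wheel tags in sync with the Python version in the active environment.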
It is possible that the C++ library is not required for Python; building just the Python fbgemm_gpu package may be enough.
Bug report on the fbgemm_gpu build: https://github.com/pytorch/FBGEMM/issues/2362
The issue is still reproducible.