Open goelayu opened 3 months ago
I have a similar problem following the build and installation with pip. What is your machine configuration?
cd /content/FlexFlow/ && pip install -r requirements.txt
pip install flexflow
import flexflow.serve as ff
# FlexFlow initialisation
ff.init(
    num_gpus=1,
    memory_per_gpu=14000,
    zero_copy_memory_per_node=8000,
    tensor_parallelism_degree=2,
    pipeline_parallelism_degree=1,
    num_cpus=2,
    profiling=True
)
# Specify the LLM
llm = ff.LLM("meta-llama/Llama-2-7b-hf")
# Specify a list of SSMs (just one in this case)
ssms = []
ssm = ff.SSM("JackFram/llama-68m")
ssms.append(ssm)
# Create the sampling configs
generation_config = ff.GenerationConfig(
    do_sample=False, temperature=0.9, topp=0.8, topk=1
)
# Compile the SSMs for inference and load the weights into memory
for ssm in ssms:
    ssm.compile(generation_config)
# Compile the LLM for inference and load the weights into memory
llm.compile(generation_config,
            max_requests_per_batch=16,
            max_seq_length=256,
            max_tokens_per_batch=128,
            ssms=ssms)
llm.start_server()
result = llm.generate("Here are some travel tips for Tokyo:\n")
#llm.stop_server() # This invocation is optional
Output
/usr/local/lib/python3.10/dist-packages/flexflow/serve/__init__.py in <module>
15 from typing import Optional
16 from ..type import *
---> 17 from flexflow.core import *
18 from .serve import LLM, SSM, GenerationConfig, GenerationResult
19
/usr/local/lib/python3.10/dist-packages/flexflow/core/__init__.py in <module>
32 else:
33 # print("Using cffi flexflow bindings.")
---> 34 from .flexflow_cffi import *
35
36 ff_arg_to_sysarg = {
/usr/local/lib/python3.10/dist-packages/flexflow/core/flexflow_cffi.py in <module>
36 )
37 from flexflow.config import *
---> 38 from .flexflowlib import ffi, flexflow_library
39
40
/usr/local/lib/python3.10/dist-packages/flexflow/core/flexflowlib.py in <module>
18 from .flexflow_cffi_header import flexflow_header
19
---> 20 from legion_cffi import ffi
21 from distutils import sysconfig
22
ModuleNotFoundError: No module named 'legion_cffi'
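The traceback ends at the top-level import of legion_cffi, so one quick way to tell whether an environment is affected, before running the whole script, is to ask Python whether it can locate that module at all. A minimal sketch using only the standard library; the helper name missing_modules is made up for illustration and is not part of FlexFlow:

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that Python cannot locate."""
    missing = []
    for name in names:
        # For a top-level name, find_spec returns None when no finder
        # on sys.path can locate the module; it does not import it.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # legion_cffi is the module named in the traceback above.
    print("not importable:", missing_modules(["legion_cffi", "flexflow"]))
```

If legion_cffi shows up in the list while flexflow does not, the environment matches the failure reported here.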
@hygorjardim Ubuntu 22.04, CUDA 12.3, GPU compute capability 7.0
@goelayu Here is my output from config.linux. Some versions are a little different from yours; I'm running in a Google Colab environment. The cause is still very unclear to me, so I can't help you yet with the UCX version problem...
CUDA_PATH=/usr/local/cuda/lib64/stubs cmake -DCUDA_USE_STATIC_CUDA_RUNTIME=OFF -DLegion_HIJACK_CUDART=OFF -DINFERENCE_TESTS=OFF -DLIBTORCH_PATH=/content/libtorch -DCMAKE_BUILD_TYPE=Release -DFF_CUDA_ARCH=autodetect -DCUDA_PATH=/usr/local/cuda -DCUDNN_PATH=/usr/local/cuda -DFF_HIP_ARCH=all -DFF_USE_PYTHON=ON -DBUILD_LEGION_ONLY=OFF -DFF_USE_NCCL=ON -DNCCL_PATH=/usr/local/cuda -DFF_BUILD_ALL_EXAMPLES=OFF -DFF_BUILD_ALL_INFERENCE_EXAMPLES=ON -DFF_USE_PREBUILT_LEGION=OFF -DFF_USE_PREBUILT_NCCL=OFF -DFF_USE_ALL_PREBUILT_LIBRARIES=OFF -DFF_BUILD_UNIT_TESTS=OFF -DFF_USE_AVX2=OFF -DFF_MAX_DIM=5 -DLEGION_MAX_RETURN_SIZE=262144 -DROCM_PATH=/opt/rocm -DFF_GPU_BACKEND=cuda ../config/../
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- FF_LEGION_NETWORKS:
-- Linux Version: 22.04
-- CPU architecture: x86_64
CMake Warning (dev) at cmake/cuda.cmake:6 (find_package):
Policy CMP0146 is not set: The FindCUDA module is removed. Run "cmake
--help-policy CMP0146" for policy details. Use the cmake_policy command to
set the policy and suppress this warning.
Call Stack (most recent call first):
CMakeLists.txt:157 (include)
This warning is for project developers. Use -Wno-dev to suppress it.
-- Found CUDA: /usr/local/cuda (found version "12.2")
-- No result from nvcc so building for 2.0
-- CUDA Detected CUDA_ARCH : 75
-- CUDA_VERSION: 12.2
-- CUDA root path : /usr/local/cuda
-- CUDA include path : /usr/local/cuda/include
-- CUDA runtime libraries :
-- CUDA driver libraries : /usr/local/cuda/lib64/stubs/libcuda.so
-- CUBLAS libraries : /usr/local/cuda/lib64/libcublas.so
-- CURAND libraries : /usr/local/cuda/lib64/libcurand.so
-- CUDA Arch : 75
-- CUDA_GENCODE: -gencode arch=compute_75,code=sm_75
-- CMAKE_CUDA_COMPILER: /usr/local/cuda/bin/nvcc
-- CUDNN include : /usr/include
-- CUDNN libraries : /usr/lib/x86_64-linux-gnu/libcudnn.so
-- Building Legion from source
-- GASNET ROOT:
-- Found Git: /usr/bin/git (found version "2.34.1")
-- Version string from git: legion-shardrefine-final-142-g24e8c4523
I solved this problem in 3 steps. First I ran
pip install . --verbose
which reported a problem:
ERROR: setuptools==59.5.0 is used in combination with setuptools_scm>=8.x
Your build configuration is incomplete and previously worked by accident!
setuptools_scm requires setuptools>=61
Then I reinstalled setuptools: pip install setuptools==61
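The constraint in that message can also be checked before rebuilding. A rough sketch, assuming only what the error text states (setuptools_scm 8 or newer needs setuptools 61 or newer); the helper names are illustrative and not part of either package:

```python
import re
from importlib.metadata import PackageNotFoundError, version


def major(ver):
    """Leading integer of a version string such as '59.5.0'."""
    match = re.match(r"\d+", ver)
    if match is None:
        raise ValueError(f"no leading number in version {ver!r}")
    return int(match.group())


def scm_conflict(setuptools_ver, scm_ver):
    """True when setuptools_scm 8+ is paired with setuptools older than 61."""
    return major(scm_ver) >= 8 and major(setuptools_ver) < 61


def installed_version(dist_name):
    """Installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    st = installed_version("setuptools")
    scm = installed_version("setuptools_scm")
    if st is None or scm is None:
        print("setuptools or setuptools_scm is not installed:", st, scm)
    elif scm_conflict(st, scm):
        print(f"conflict: setuptools {st} with setuptools_scm {scm}")
    else:
        print(f"ok: setuptools {st} with setuptools_scm {scm}")
```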
This exposed another problem:
File "/tmp/pip-req-build-zu23ayuw/legion_cffi_build.py", line 39, in find_header
raise Exception('Unable to locate header file:' + filename + " in:" + runtime_dir)
Exception: Unable to locate header file:legion.h in:/runtime
I modified ./deps/legion/bindings/python/legion_cffi_build.py, setting root_dir to the Legion source root (the directory whose runtime/ subdirectory contains legion.h):
root_dir = "/home/server/FlexFlow/deps/legion/"  # originally: os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__))))
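A possible explanation for the /runtime in the error message: the commented-out expression walks three directories up from the script's own location, which only lands on the Legion root when the script sits in its in-tree location. The traceback shows pip running a copy under /tmp/pip-req-build-zu23ayuw/, and three levels up from there is the filesystem root. The sketch below is pure path arithmetic on the two paths from this comment; it assumes the build script looks for headers under root_dir/runtime (inferred from the error text) and that the in-tree absolute path is the root_dir above joined with bindings/python:

```python
import posixpath


def three_levels_up(path):
    """Apply dirname three times, as the commented-out expression does."""
    return posixpath.dirname(posixpath.dirname(posixpath.dirname(path)))


# In-tree location (assumed absolute form of ./deps/legion/bindings/python/).
in_tree = "/home/server/FlexFlow/deps/legion/bindings/python/legion_cffi_build.py"
# Location of the copy that pip actually executed, taken from the traceback.
pip_copy = "/tmp/pip-req-build-zu23ayuw/legion_cffi_build.py"

for script in (in_tree, pip_copy):
    root_dir = three_levels_up(script)
    # Assumption: headers are searched under <root_dir>/runtime.
    print(script, "->", posixpath.join(root_dir, "runtime"))
```

The first path resolves to /home/server/FlexFlow/deps/legion/runtime and the second to /runtime, which is why hard-coding root_dir sidesteps the failure.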
With that change the build succeeds. Then add the directory containing legion_cffi.py to PYTHONPATH (single quotes keep $PYTHONPATH from being expanded until ~/.bashrc is sourced):
echo 'export PYTHONPATH=$PYTHONPATH:/usr/lib/python3.8/site-packages/' >> ~/.bashrc
source ~/.bashrc
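To confirm that the PYTHONPATH change points at the right place, it may help to list which entries actually contain legion_cffi.py. A small sketch; the function name dirs_containing is made up for illustration, and the directory shown above may differ on other machines:

```python
import os


def dirs_containing(filename, pythonpath):
    """Return the PYTHONPATH entries that directly contain `filename`."""
    hits = []
    for entry in pythonpath.split(os.pathsep):
        # Skip empty entries produced by a leading or trailing separator.
        if entry and os.path.isfile(os.path.join(entry, filename)):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    found = dirs_containing("legion_cffi.py", os.environ.get("PYTHONPATH", ""))
    print("PYTHONPATH entries with legion_cffi.py:", found)
```

An empty list suggests that the exported directory is not the one the build installed legion_cffi.py into.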
I installed FlexFlow using the pip command:
pip install flexflow
When I import it with
import flexflow.serve as ff
I get the error: ModuleNotFoundError: No module named 'legion_cffi'
I doubt this is expected behavior.
I can get past this error by installing FlexFlow from source, but I then run into a UCX version error when running ff.init. I believe I am using the latest version of UCX, so I am not sure why I get this error. Command used for configuration while building from source:
FF_CUDA_ARCH=70 FF_USE_PYTHON=ON FF_LEGION_NETWORKS=ucx UCX_DIR=/software/ucx-1.15.0/install ./config/config.linux
Any tips on overcoming the ModuleNotFoundError without having to build from source? And any ideas on how to avoid the UCX version error?