flexflow / FlexFlow

FlexFlow Serve: Low-Latency, High-Performance LLM Serving
https://flexflow.readthedocs.io
Apache License 2.0

Error while importing `import flexflow.serve as ff` #1337

Open goelayu opened 3 months ago

goelayu commented 3 months ago

I installed FlexFlow using the pip command: pip install flexflow

When importing it with import flexflow.serve as ff, I get the error ModuleNotFoundError: No module named 'legion_cffi'.

I assume this is not expected behavior?
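
As a quick check (a minimal diagnostic sketch, not part of FlexFlow itself), the following confirms whether the Legion Python bindings that flexflow.core imports are visible in the current environment:

import importlib.util

# Look up both packages without importing them; legion_cffi reported as missing
# here means the installed flexflow package does not ship the Legion bindings.
for mod in ("flexflow", "legion_cffi"):
    spec = importlib.util.find_spec(mod)
    print(mod, "->", spec.origin if spec else "NOT FOUND")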

I am able to work around this error by installing FlexFlow from source; however, I then run into the following error when running the ff.init command:

[0 - 7f59a35cc480]    0.000000 {4}{ucp}: The UCX network module requires UCX 1.14.0 or above
[0 - 7f59a35cc480]    0.000000 {6}{ucp}: internal init failed
[tk-21:2932908] *** Process received signal ***

I believe I am using a recent version of UCX (1.15.0, per UCX_DIR below), so I am not sure why I am getting this error. Command used to configure the build from source: FF_CUDA_ARCH=70 FF_USE_PYTHON=ON FF_LEGION_NETWORKS=ucx UCX_DIR=/software/ucx-1.15.0/install ./config/config.linux

Any tips on overcoming the ModuleNotFoundError without having to build from source? And any ideas on how to avoid the UCX version error?
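
One way to check which UCX library the runtime actually resolves at load time (a minimal sketch, assuming libucp.so, or libucp.so.0, can be found by the loader, e.g. via LD_LIBRARY_PATH pointing at /software/ucx-1.15.0/install/lib):

import ctypes

# ucp_get_version() reports the version of the UCX library the dynamic loader
# picks up, which may differ from the UCX_DIR passed at configure time.
libucp = ctypes.CDLL("libucp.so")
major, minor, release = ctypes.c_uint(), ctypes.c_uint(), ctypes.c_uint()
libucp.ucp_get_version(ctypes.byref(major), ctypes.byref(minor), ctypes.byref(release))
print(f"UCX {major.value}.{minor.value}.{release.value}")

If this prints something older than 1.14.0, an older system-wide UCX is probably shadowing the 1.15.0 install.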

hygorjardim commented 3 months ago

I have a similar problem after building and installing with pip. What is your machine configuration?

cd /content/FlexFlow/ && pip install -r requirements.txt
pip install flexflow

import flexflow.serve as ff

# FlexFlow initialisation
ff.init(
    num_gpus=1,
    memory_per_gpu=14000,
    zero_copy_memory_per_node=8000,
    tensor_parallelism_degree=2,
    pipeline_parallelism_degree=1,
    num_cpus=2,
    profiling=True,
)
# Specify the LLM
llm = ff.LLM("meta-llama/Llama-2-7b-hf")

# Specify a list of SSMs (just one in this case)
ssms=[]
ssm = ff.SSM("JackFram/llama-68m")
ssms.append(ssm)
# Create the sampling configs
generation_config = ff.GenerationConfig(
    do_sample=False, temperature=0.9, topp=0.8, topk=1
)

# Compile the SSMs for inference and load the weights into memory
for ssm in ssms:
    ssm.compile(generation_config)

# Compile the LLM for inference and load the weights into memory
llm.compile(generation_config,
            max_requests_per_batch = 16,
            max_seq_length = 256,
            max_tokens_per_batch = 128,
            ssms=ssms)
llm.start_server()
result = llm.generate("Here are some travel tips for Tokyo:\n")
#llm.stop_server() # This invocation is optional

Output


/usr/local/lib/python3.10/dist-packages/flexflow/serve/__init__.py in <module>
     15 from typing import Optional
     16 from ..type import *
---> 17 from flexflow.core import *
     18 from .serve import LLM, SSM, GenerationConfig, GenerationResult
     19

/usr/local/lib/python3.10/dist-packages/flexflow/core/__init__.py in <module>
     32 else:
     33     # print("Using cffi flexflow bindings.")
---> 34     from .flexflow_cffi import *
     35
     36 ff_arg_to_sysarg = {

/usr/local/lib/python3.10/dist-packages/flexflow/core/flexflow_cffi.py in <module>
     36 )
     37 from flexflow.config import *
---> 38 from .flexflowlib import ffi, flexflow_library
     39
     40

/usr/local/lib/python3.10/dist-packages/flexflow/core/flexflowlib.py in <module>
     18 from .flexflow_cffi_header import flexflow_header
     19
---> 20 from legion_cffi import ffi
     21 from distutils import sysconfig
     22

ModuleNotFoundError: No module named 'legion_cffi'

goelayu commented 3 months ago

@hygorjardim Ubuntu 22.04, Cuda 12.3, GPU compute capability 7.0

hygorjardim commented 3 months ago

@goelayu Here is my output from config.linux; some versions are a little different from yours, since I'm running in a Google Colab environment. The cause is still unclear to me, so I can't yet help you with the UCX version problem...

CUDA_PATH=/usr/local/cuda/lib64/stubs cmake -DCUDA_USE_STATIC_CUDA_RUNTIME=OFF -DLegion_HIJACK_CUDART=OFF -DINFERENCE_TESTS=OFF -DLIBTORCH_PATH=/content/libtorch -DCMAKE_BUILD_TYPE=Release -DFF_CUDA_ARCH=autodetect -DCUDA_PATH=/usr/local/cuda -DCUDNN_PATH=/usr/local/cuda -DFF_HIP_ARCH=all -DFF_USE_PYTHON=ON -DBUILD_LEGION_ONLY=OFF -DFF_USE_NCCL=ON -DNCCL_PATH=/usr/local/cuda -DFF_BUILD_ALL_EXAMPLES=OFF -DFF_BUILD_ALL_INFERENCE_EXAMPLES=ON -DFF_USE_PREBUILT_LEGION=OFF -DFF_USE_PREBUILT_NCCL=OFF -DFF_USE_ALL_PREBUILT_LIBRARIES=OFF -DFF_BUILD_UNIT_TESTS=OFF -DFF_USE_AVX2=OFF -DFF_MAX_DIM=5 -DLEGION_MAX_RETURN_SIZE=262144 -DROCM_PATH=/opt/rocm -DFF_GPU_BACKEND=cuda ../config/../
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- FF_LEGION_NETWORKS: 
-- Linux Version: 22.04
-- CPU architecture: x86_64
CMake Warning (dev) at cmake/cuda.cmake:6 (find_package):
  Policy CMP0146 is not set: The FindCUDA module is removed.  Run "cmake
  --help-policy CMP0146" for policy details.  Use the cmake_policy command to
  set the policy and suppress this warning.

Call Stack (most recent call first):
  CMakeLists.txt:157 (include)
This warning is for project developers.  Use -Wno-dev to suppress it.

-- Found CUDA: /usr/local/cuda (found version "12.2") 
-- No result from nvcc so building for 2.0
-- CUDA Detected CUDA_ARCH : 75
-- CUDA_VERSION: 12.2
-- CUDA root path : /usr/local/cuda
-- CUDA include path : /usr/local/cuda/include
-- CUDA runtime libraries : 
-- CUDA driver libraries : /usr/local/cuda/lib64/stubs/libcuda.so
-- CUBLAS libraries : /usr/local/cuda/lib64/libcublas.so
-- CURAND libraries : /usr/local/cuda/lib64/libcurand.so
-- CUDA Arch : 75
-- CUDA_GENCODE: -gencode arch=compute_75,code=sm_75
-- CMAKE_CUDA_COMPILER: /usr/local/cuda/bin/nvcc
-- CUDNN include : /usr/include
-- CUDNN libraries : /usr/lib/x86_64-linux-gnu/libcudnn.so
-- Building Legion from source
-- GASNET ROOT: 
-- Found Git: /usr/bin/git (found version "2.34.1") 
-- Version string from git: legion-shardrefine-final-142-g24e8c4523

lihuahua123 commented 3 months ago

I solved this problem in 3 steps:

Running pip install . --verbose revealed a problem:

ERROR: setuptools==59.5.0 is used in combination with setuptools_scm>=8.x

  Your build configuration is incomplete and previously worked by accident!
  setuptools_scm requires setuptools>=61
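
To confirm which versions the build environment actually picks up, a quick standard-library check (just a sketch) is:

import importlib.metadata as md  # Python 3.8+

# Print the versions resolved for the two packages named in the error.
for pkg in ("setuptools", "setuptools_scm"):
    try:
        print(pkg, md.version(pkg))
    except md.PackageNotFoundError:
        print(pkg, "not installed")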

Then I reinstalled setuptools (pip install setuptools==61), but found another problem:

File "/tmp/pip-req-build-zu23ayuw/legion_cffi_build.py", line 39, in find_header
          raise Exception('Unable to locate header file:' + filename + " in:" + runtime_dir)
      Exception: Unable to locate header file:legion.h in:/runtime

I modified ./deps/legion/bindings/python/legion_cffi_build.py so that root_dir points to the directory tree containing legion.h:

root_dir = "/home/server/FlexFlow/deps/legion/"#os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__))))
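
A slightly less hard-coded variant of the same change (a sketch; LEGION_ROOT_DIR is a name I made up, not an existing FlexFlow or Legion variable) keeps the original path derivation as a fallback:

import os

# Allow an explicit override via the environment, otherwise fall back to the
# original derivation relative to legion_cffi_build.py.
root_dir = os.environ.get(
    "LEGION_ROOT_DIR",
    os.path.dirname(os.path.dirname(os.path.dirname(os.path.realpath(__file__)))),
)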

After that, the build succeeds. Then add the directory containing legion_cffi.py to the PYTHONPATH:

echo "export PYTHONPATH=$PYTHONPATH:/usr/lib/python3.8/site-packages/" >> ~/.bashrc
source ~/.bashrc
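
If editing ~/.bashrc is inconvenient (e.g. in a notebook), the same effect can be had per process; a minimal sketch, assuming legion_cffi.py ended up in /usr/lib/python3.8/site-packages/ as above:

import sys

# Make the directory containing legion_cffi.py importable for this process only.
sys.path.append("/usr/lib/python3.8/site-packages/")

import flexflow.serve as ff  # should no longer raise ModuleNotFoundError for legion_cffi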