flowersteam / lamorel

Lamorel is a Python library designed for RL practitioners eager to use Large Language Models (LLMs).
MIT License
193 stars, 18 forks

Connection closed by peer [127.0.1.1]: 14734 #27

Closed ewanlee closed 11 months ago

ewanlee commented 11 months ago

Hello! When I run the following command:

python -m lamorel_launcher.launch \
--config-path /home/ewanlee/Codes/lamorel/examples/PPO_LoRA_finetuning/ \
--config-name local_gpu_config \
rl_script_args.path=/home/ewanlee/Codes/lamorel/examples/PPO_LoRA_finetuning/main.py \
rl_script_args.output_dir=/home/ewanlee/Codes/lamorel/examples/PPO_LoRA_finetuning/logs

I have encountered the following issues:

[Four screenshots of the error output; images not preserved]

My local config is:

lamorel_args:
  log_level: info
  allow_subgraph_use_whith_gradient: false
  distributed_setup_args:
    n_rl_processes: 1
    n_llm_processes: 1
  accelerate_args:
    config_file: ../configs/accelerate/default_config.yaml
    machine_rank: 0
    main_process_ip: 127.0.0.1
    num_machines: 1
  llm_args:
    model_type: seq2seq
    # model_path: google/flan-t5-small
    model_path: t5-small
    pretrained: true
    minibatch_size: 192
    pre_encode_inputs: true
    load_in_4bit: false
    parallelism:
      use_gpu: true
      model_parallelism_size: 2
      synchronize_gpus_after_scoring: false
      empty_cuda_cache_after_scoring: false
rl_script_args:
  path: ???
  seed: 1
  # ppo
  ppo_epochs: 4
  lam: 0.99
  gamma: 0.99
  lr: 1e-6
  entropy_coef: 0.01
  value_loss_coef: 0.5
  clip_eps: 0.2
  max_grad_norm: 0.5
  minibatch_size: 8
  # llm
  gradient_batch_size: 1
  gradient_minibatch_size:
  ## LoRA
  use_lora: true
  lora_r: 16
  lora_alpha: 32
  # rl training
  number_envs: 2
  max_ep_len: 100
  epochs: 100
  steps_per_epoch: 10 #256
  save_freq: 10
  output_dir: ???
  loading_path:
  # environment
  task: 'BabyAI-GoToRedBall-v0'
  action_space: [ "turn_left","turn_right","go_forward","pick_up","drop","toggle" ]

My conda environment YAML is as follows:

name: dlp
channels:
  - pytorch
  - nvidia
  - conda-forge
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/free/
dependencies:
  - _libgcc_mutex=0.1=conda_forge
  - _openmp_mutex=4.5=2_kmp_llvm
  - accelerate=0.24.1=pyhd8ed1ab_0
  - aiohttp=3.8.6=py310h2372a71_1
  - aiosignal=1.3.1=pyhd8ed1ab_0
  - async-timeout=4.0.3=pyhd8ed1ab_0
  - attrs=23.1.0=pyh71513ae_1
  - aws-c-auth=0.7.7=h37ad1db_0
  - aws-c-cal=0.6.9=h3b91eb8_1
  - aws-c-common=0.9.8=hd590300_0
  - aws-c-compression=0.2.17=hfd9eb17_6
  - aws-c-event-stream=0.3.2=hae413d4_6
  - aws-c-http=0.7.14=h162056d_1
  - aws-c-io=0.13.35=hc23c90e_8
  - aws-c-mqtt=0.9.9=h1387108_0
  - aws-c-s3=0.3.24=hdb3bed3_1
  - aws-c-sdkutils=0.1.12=hfd9eb17_5
  - aws-checksums=0.1.17=hfd9eb17_5
  - aws-crt-cpp=0.24.7=hd0f6be0_2
  - aws-sdk-cpp=1.11.182=h8beafcf_7
  - blas=2.116=mkl
  - blas-devel=3.9.0=16_linux64_mkl
  - brotli-python=1.1.0=py310hc6cd4ac_1
  - bzip2=1.0.8=hd590300_5
  - c-ares=1.22.1=hd590300_0
  - ca-certificates=2023.11.17=hbcca054_0
  - certifi=2023.11.17=pyhd8ed1ab_0
  - charset-normalizer=3.3.2=pyhd8ed1ab_0
  - click=8.1.7=unix_pyh707e725_0
  - colorama=0.4.6=pyhd8ed1ab_0
  - cuda-cudart=11.8.89=0
  - cuda-cupti=11.8.87=0
  - cuda-libraries=11.8.0=0
  - cuda-nvrtc=11.8.89=0
  - cuda-nvtx=11.8.86=0
  - cuda-runtime=11.8.0=0
  - cudatoolkit=11.3.1=hb98b00a_12
  - dataclasses=0.8=pyhc8e2a94_3
  - dill=0.3.7=pyhd8ed1ab_0
  - ffmpeg=4.3=hf484d3e_0
  - filelock=3.13.1=pyhd8ed1ab_0
  - freetype=2.12.1=h267a509_2
  - frozenlist=1.4.0=py310h2372a71_1
  - fsspec=2023.10.0=pyhca7485f_0
  - gflags=2.2.2=he1b5a44_1004
  - glog=0.6.0=h6f12383_0
  - gmp=6.2.1=h58526e2_0
  - gmpy2=2.1.2=py310h3ec546c_1
  - gnutls=3.6.13=h85f3911_1
  - huggingface_hub=0.17.3=pyhd8ed1ab_0
  - icu=73.2=h59595ed_0
  - idna=3.4=pyhd8ed1ab_0
  - importlib-metadata=6.8.0=pyha770c72_0
  - importlib_metadata=6.8.0=hd8ed1ab_0
  - jinja2=3.1.2=pyhd8ed1ab_1
  - joblib=1.3.2=pyhd8ed1ab_0
  - keyutils=1.6.1=h166bdaf_0
  - krb5=1.21.2=h659d440_0
  - lame=3.100=h166bdaf_1003
  - lcms2=2.15=hb7c19ff_3
  - ld_impl_linux-64=2.40=h41732ed_0
  - lerc=4.0.0=h27087fc_0
  - libabseil=20230802.1=cxx17_h59595ed_0
  - libarrow=14.0.1=h4df1b6a_3_cpu
  - libarrow-acero=14.0.1=h59595ed_3_cpu
  - libarrow-dataset=14.0.1=h59595ed_3_cpu
  - libarrow-flight=14.0.1=h120cb0d_3_cpu
  - libarrow-flight-sql=14.0.1=h61ff412_3_cpu
  - libarrow-gandiva=14.0.1=hacb8726_3_cpu
  - libarrow-substrait=14.0.1=h61ff412_3_cpu
  - libblas=3.9.0=16_linux64_mkl
  - libbrotlicommon=1.1.0=hd590300_1
  - libbrotlidec=1.1.0=hd590300_1
  - libbrotlienc=1.1.0=hd590300_1
  - libcblas=3.9.0=16_linux64_mkl
  - libcrc32c=1.1.2=h9c3ff4c_0
  - libcublas=11.11.3.6=0
  - libcufft=10.9.0.58=0
  - libcufile=1.8.1.2=0
  - libcurand=10.3.4.101=0
  - libcurl=8.4.0=hca28451_0
  - libcusolver=11.4.1.48=0
  - libcusparse=11.7.5.86=0
  - libdeflate=1.19=hd590300_0
  - libedit=3.1.20191231=he28a2e2_2
  - libev=4.33=h516909a_1
  - libevent=2.1.12=hf998b51_1
  - libffi=3.4.2=h7f98852_5
  - libgcc-ng=13.2.0=h807b86a_2
  - libgfortran-ng=13.2.0=h69a702a_2
  - libgfortran5=13.2.0=ha4646dd_2
  - libgomp=13.2.0=h807b86a_2
  - libgoogle-cloud=2.12.0=h5206363_4
  - libgrpc=1.59.3=hd6c4280_0
  - libhwloc=2.9.3=default_h554bfaf_1009
  - libiconv=1.17=h166bdaf_0
  - libjpeg-turbo=3.0.0=hd590300_1
  - liblapack=3.9.0=16_linux64_mkl
  - liblapacke=3.9.0=16_linux64_mkl
  - libllvm15=15.0.7=h5cf9203_3
  - libnghttp2=1.58.0=h47da74e_0
  - libnpp=11.8.0.86=0
  - libnsl=2.0.1=hd590300_0
  - libnuma=2.0.16=h0b41bf4_1
  - libnvjpeg=11.9.0.86=0
  - libparquet=14.0.1=h352af49_3_cpu
  - libpng=1.6.39=h753d276_0
  - libprotobuf=4.24.4=hf27288f_0
  - libre2-11=2023.06.02=h7a70373_0
  - libsqlite=3.44.0=h2797004_0
  - libssh2=1.11.0=h0841786_0
  - libstdcxx-ng=13.2.0=h7e041cc_2
  - libthrift=0.19.0=hb90f79a_1
  - libtiff=4.6.0=ha9c0a0a_2
  - libutf8proc=2.8.0=h166bdaf_0
  - libuuid=2.38.1=h0b41bf4_0
  - libwebp-base=1.3.2=hd590300_0
  - libxcb=1.15=h0b41bf4_0
  - libxml2=2.11.5=h232c23b_1
  - libzlib=1.2.13=hd590300_5
  - llvm-openmp=15.0.7=h0cdce71_0
  - lz4-c=1.9.4=hcb278e6_0
  - markupsafe=2.1.3=py310h2372a71_1
  - mkl=2022.1.0=h84fe81f_915
  - mkl-devel=2022.1.0=ha770c72_916
  - mkl-include=2022.1.0=h84fe81f_915
  - mpc=1.3.1=hfe3b2da_0
  - mpfr=4.2.1=h9458935_0
  - mpmath=1.3.0=pyhd8ed1ab_0
  - multidict=6.0.4=py310h2372a71_1
  - multiprocess=0.70.15=py310h2372a71_1
  - ncurses=6.4=h59595ed_2
  - nettle=3.6=he412f7d_0
  - networkx=3.2.1=pyhd8ed1ab_0
  - numpy=1.23.1=py310h53a5b5f_0
  - openh264=2.1.1=h780b84a_0
  - openjpeg=2.5.0=h488ebb8_3
  - openssl=3.1.4=hd590300_0
  - orc=1.9.0=h4b38347_4
  - packaging=23.2=pyhd8ed1ab_0
  - peft=0.6.2=pyhd8ed1ab_0
  - pillow=10.1.0=py310h01dd4db_0
  - pip=23.3.1=pyhd8ed1ab_0
  - pthread-stubs=0.4=h36c2ea0_1001
  - pyarrow-hotfix=0.5=pyhd8ed1ab_0
  - pysocks=1.7.1=pyha2e5f31_6
  - python=3.10.8=h4a9ceb5_0_cpython
  - python-dateutil=2.8.2=pyhd8ed1ab_0
  - python-tzdata=2023.3=pyhd8ed1ab_0
  - python-xxhash=3.4.1=py310h2372a71_0
  - python_abi=3.10=4_cp310
  - pytorch=2.1.0=py3.10_cuda11.8_cudnn8.7.0_0
  - pytorch-cuda=11.8=h7e8668a_5
  - pytorch-mutex=1.0=cuda
  - pytz=2023.3.post1=pyhd8ed1ab_0
  - pyyaml=6.0.1=py310h2372a71_1
  - rdma-core=28.9=h59595ed_1
  - re2=2023.06.02=h2873b5e_0
  - readline=8.2=h8228510_1
  - regex=2023.10.3=py310h2372a71_0
  - requests=2.31.0=pyhd8ed1ab_0
  - s2n=1.3.56=h06160fa_0
  - sacremoses=0.0.53=pyhd8ed1ab_0
  - setuptools=68.2.2=pyhd8ed1ab_0
  - six=1.16.0=pyh6c4a22f_0
  - snappy=1.1.10=h9fff704_0
  - sympy=1.12=pypyh9d50eac_103
  - tbb=2021.10.0=h00ab1b0_2
  - tk=8.6.13=noxft_h4845f30_101
  - tokenizers=0.14.1=py310h320607d_2
  - torchaudio=2.1.0=py310_cu118
  - torchtriton=2.1.0=py310
  - torchvision=0.16.0=py310_cu118
  - transformers=4.35.2=pyhd8ed1ab_0
  - typing-extensions=4.8.0=hd8ed1ab_0
  - typing_extensions=4.8.0=pyha770c72_0
  - tzdata=2023c=h71feb2d_0
  - ucx=1.15.0=h64cca9d_0
  - urllib3=2.1.0=pyhd8ed1ab_0
  - wheel=0.41.3=pyhd8ed1ab_0
  - xorg-libxau=1.0.11=hd590300_0
  - xorg-libxdmcp=1.1.3=h7f98852_0
  - xxhash=0.8.2=hd590300_0
  - xz=5.2.6=h166bdaf_0
  - yaml=0.2.5=h7f98852_2
  - yarl=1.9.2=py310h2372a71_1
  - zipp=3.17.0=pyhd8ed1ab_0
  - zlib=1.2.13=hd590300_5
  - zstd=1.5.5=hfc55251_0
  - pip:
      - absl-py==2.0.0
      - annotated-types==0.6.0
      - antlr4-python3-runtime==4.9.3
      - anyio==3.7.1
      - appdirs==1.4.4
      - asttokens==2.4.1
      - blosc==1.11.1
      - cachetools==5.3.2
      - cloudpickle==3.0.0
      - contourpy==1.2.0
      - cycler==0.12.1
      - datasets==2.14.6
      - decorator==5.1.1
      - distro==1.8.0
      - docker-pycreds==0.4.0
      - exceptiongroup==1.1.3
      - executing==2.0.1
      - fonttools==4.44.0
      - gitdb==4.0.11
      - gitpython==3.1.40
      - gloo==0.1.2
      - google-auth==2.23.4
      - google-auth-oauthlib==0.4.6
      - grpcio==1.59.2
      - gym==0.26.1
      - gym-notices==0.0.8
      - h11==0.14.0
      - httpcore==1.0.1
      - httpx==0.25.1
      - hydra-core==1.3.2
      - imageio==2.32.0
      - ipython==8.17.2
      - jedi==0.19.1
      - kiwisolver==1.4.5
      - markdown==3.5.1
      - matplotlib==3.8.1
      - matplotlib-inline==0.1.6
      - oauthlib==3.2.2
      - omegaconf==2.3.0
      - openai==1.2.2
      - pandas==2.1.2
      - parso==0.8.3
      - pexpect==4.8.0
      - prettytable==3.9.0
      - prompt-toolkit==3.0.40
      - protobuf==3.20.3
      - psutil==5.9.6
      - ptyprocess==0.7.0
      - pure-eval==0.2.2
      - pyarrow==14.0.1
      - pyasn1==0.5.0
      - pyasn1-modules==0.3.0
      - pydantic==2.4.2
      - pydantic-core==2.10.1
      - pygments==2.16.1
      - pyparsing==3.1.1
      - pypng==0.20220715.0
      - qrcode==7.4.2
      - requests-oauthlib==1.3.1
      - rsa==4.9
      - safetensors==0.4.0
      - scipy==1.11.3
      - sentencepiece==0.1.99
      - sentry-sdk==1.34.0
      - setproctitle==1.3.3
      - smmap==5.0.1
      - sniffio==1.3.0
      - stack-data==0.6.3
      - tensorboard==2.7.0
      - tensorboard-data-server==0.6.1
      - tensorboard-plugin-wit==1.8.0
      - tensorboardx==1.8
      - termcolor==2.3.0
      - tqdm==4.64.0
      - traitlets==5.13.0
      - wandb==0.16.0
      - wcwidth==0.2.9
      - werkzeug==3.0.1
prefix: /home/ewanlee/miniforge3/envs/dlp

Could you please advise me on how to resolve the error?

ClementRomac commented 11 months ago

Hi!

Thank you for opening this issue. First of all, there is not really an error here :) Your script ran to the end (100 PPO updates). The exception occurred while the processes were shutting down: the main function returned and the program exited while the LLM server was still listening and waiting for calls.

It seems I forgot to add lm_server.close() at the end of the main function, which should properly stop all the processes.
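
The shutdown pattern described above can be sketched as follows. This is a minimal illustration, not lamorel's code: `StubLMServer`, its `score` method, and `n_updates` are hypothetical stand-ins, and only the name `lm_server` and the `close()` call come from the reply above. Wrapping the loop in `try`/`finally` is an added choice so that `close()` runs even if training raises.

```python
class StubLMServer:
    """Hypothetical stand-in for the object the example script calls lm_server."""

    def __init__(self):
        self.closed = False

    def score(self, prompt):
        # Placeholder for a request sent to the LLM processes.
        if self.closed:
            raise RuntimeError("server already closed")
        return len(prompt)

    def close(self):
        # Stand-in for telling the LLM processes to stop listening.
        self.closed = True


def main(lm_server, n_updates=3):
    results = []
    try:
        for step in range(n_updates):
            results.append(lm_server.score(f"update {step}"))
    finally:
        # Last action of main: shut the server down so that no process
        # is left waiting for calls when the program exits.
        lm_server.close()
    return results


server = StubLMServer()
outputs = main(server)
```

In the real script, the equivalent change would presumably be a single `lm_server.close()` call as the last statement of the main function.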

ewanlee commented 11 months ago

Oh I see! Thank you so much for your prompt response and thorough explanation! I appreciate your assistance and the time you've taken to help me. Keep up the great work on this project!