coranholmes opened this issue 1 month ago
I managed to make it run, but I am getting the following error:
usage: prepare.py [-h] --save_dir SAVE_DIR [--benchmark BENCHMARK] --task TASK [--subset SUBSET] --tokenizer_path TOKENIZER_PATH [--tokenizer_type TOKENIZER_TYPE] --max_seq_length MAX_SEQ_LENGTH
[--num_samples NUM_SAMPLES] [--random_seed RANDOM_SEED] [--model_template_type MODEL_TEMPLATE_TYPE] [--remove_newline_tab] [--chunk_idx CHUNK_IDX] [--chunk_amount CHUNK_AMOUNT]
prepare.py: error: argument --tokenizer_type: expected one argument
[NeMo W 2024-09-24 15:27:02 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
Predict niah_single_1
from benchmark_root/jamba_13k_v1/synthetic/4096/data/niah_single_1/validation.jsonl
to benchmark_root/jamba_13k_v1/synthetic/4096/pred/niah_single_1.jsonl
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/parts/utils/manifest_utils.py", line 476, in read_manifest
f = open(manifest.get(), 'r', encoding='utf-8')
FileNotFoundError: [Errno 2] No such file or directory: 'benchmark_root/jamba_13k_v1/synthetic/4096/data/niah_single_1/validation.jsonl'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/mnt/afs/xxx/Codes/RULER/scripts/pred/call_api.py", line 333, in <module>
main()
File "/mnt/afs/xxx/Codes/RULER/scripts/pred/call_api.py", line 238, in main
data = read_manifest(task_file)
File "/usr/local/lib/python3.10/dist-packages/nemo/collections/asr/parts/utils/manifest_utils.py", line 478, in read_manifest
raise Exception(f"Manifest file could not be opened: {manifest}")
Exception: Manifest file could not be opened: <class 'nemo.utils.data_utils.DataStoreObject'>: store_path=benchmark_root/jamba_13k_v1/synthetic/4096/data/niah_single_1/validation.jsonl, local_path=benchmark_root/jamba_13k_v1/synthetic/4096/data/niah_single_1/validation.jsonl
usage: prepare.py [-h] --save_dir SAVE_DIR [--benchmark BENCHMARK] --task TASK [--subset SUBSET] --tokenizer_path TOKENIZER_PATH [--tokenizer_type TOKENIZER_TYPE] --max_seq_length MAX_SEQ_LENGTH
[--num_samples NUM_SAMPLES] [--random_seed RANDOM_SEED] [--model_template_type MODEL_TEMPLATE_TYPE] [--remove_newline_tab] [--chunk_idx CHUNK_IDX] [--chunk_amount CHUNK_AMOUNT]
prepare.py: error: argument --tokenizer_type: expected one argument
[NeMo W 2024-09-24 15:27:10 nemo_logging:349] /usr/local/lib/python3.10/dist-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work", RuntimeWarning)
If you set up TOKENIZER_PATH, then you should also set up TOKENIZER_TYPE. You can set TOKENIZER_TYPE=hf. I set a default here for anyone who doesn't have TOKENIZER_PATH.
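The argparse failure above can be reproduced in isolation. A minimal sketch (the flag name comes from the prepare.py usage string; everything else is illustrative) of why an unset TOKENIZER_TYPE makes the shell pass the flag with no value:

```python
import argparse

parser = argparse.ArgumentParser(prog="prepare.py")
parser.add_argument("--tokenizer_type", type=str)

# When TOKENIZER_TYPE is unset in config_models.sh, the shell expands
# "--tokenizer_type $TOKENIZER_TYPE" to just "--tokenizer_type",
# so argparse sees the flag with no value and exits:
try:
    parser.parse_args(["--tokenizer_type"])
except SystemExit:
    print("error: argument --tokenizer_type: expected one argument")

# With the value present, parsing succeeds:
args = parser.parse_args(["--tokenizer_type", "hf"])
print(args.tokenizer_type)  # hf
```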
I removed TOKENIZER_PATH=$MODEL_PATH in config_models.sh and finally managed to evaluate Jamba with RULER. I'll share some tips here. If you are using the image cphsieh/ruler:0.1.0 provided by the author, you need to make several changes to the environment:
undefined symbol: _ZN3c107WarningC1ENS_7variantIJNS0_11UserWarningENS0_18DeprecationWarningEEEERKNS_14SourceLocationENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEb
I managed to run with MODEL_FRAMEWORK="hf" but not vllm; I still haven't been able to figure out the right versions of the different packages to run with vllm. I attach a complete pip list in case anyone needs it.
ruler_pip_list.txt
@coranholmes Can you try vllm with the latest container cphsieh/ruler:0.2.0?
@hsiehjackson, I tried cphsieh/ruler:0.2.0 and got the following error:
ValueError: Fast Mamba kernels are not available. Make sure to they are installed and that the mamba module is on a CUDA device
I tried different versions of transformers, but it didn't help. Any suggestions?
causal-conv1d==1.4.0
mamba-ssm==2.2.2
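For what it's worth, the "Fast Mamba kernels" error fires when transformers cannot import these two packages. A rough availability check (my approximation; the exact internal check in transformers may differ):

```python
import importlib.util

def fast_mamba_kernels_available() -> bool:
    """Rough proxy for the check behind 'Fast Mamba kernels are not available'."""
    # The HF Jamba/Mamba path needs both mamba-ssm and causal-conv1d importable.
    return (importlib.util.find_spec("mamba_ssm") is not None
            and importlib.util.find_spec("causal_conv1d") is not None)

print(fast_mamba_kernels_available())
```

If this returns False inside your container, the pinned versions above were not actually installed into the environment the scripts run in.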
@dawenxi-007 Can you try pulling the Docker image again? I updated it to be compatible with HF and vLLM.
@hsiehjackson I built the Docker image from the latest code repo:
cd docker/ && DOCKER_BUILDKIT=1 docker build -f Dockerfile -t cphsieh/ruler:0.2.0 .
(note the trailing . — docker build requires a build context)
If I set MODEL_FRAMEWORK="hf" in config_models.sh, I still get the following error:
ValueError: Fast Mamba kernels are not available. Make sure to they are installed and that the mamba module is on a CUDA device
If I set MODEL_FRAMEWORK="vllm" in config_models.sh, I get the following error:
ValueError: The number of required GPUs exceeds the total number of available GPUs in the placement group.
It seems vllm mode does not support tensor parallelism (TP).
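This placement-group error usually means vLLM was asked for more GPUs (via tensor parallelism) than are visible to the process, rather than a missing TP feature. A hypothetical sketch of the underlying pre-flight check:

```python
# Sketch of the check implied by the placement-group error: the tensor
# parallelism requested must not exceed the GPUs the process can see
# (which is 0 when docker run was started without --gpus).
def check_parallelism(requested_gpus: int, visible_gpus: int) -> None:
    if requested_gpus > visible_gpus:
        raise ValueError(
            "The number of required GPUs exceeds the total number of "
            "available GPUs in the placement group."
        )

check_parallelism(requested_gpus=2, visible_gpus=4)      # OK
try:
    check_parallelism(requested_gpus=4, visible_gpus=0)  # container without --gpus
except ValueError as err:
    print(err)
```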
@dawenxi-007 Can you check whether you can see GPUs inside the Docker container?
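For reference, GPUs are only visible inside a container when docker run is started with --gpus. A hypothetical invocation (image tag from this thread; the mount path is illustrative):

```shell
docker run --gpus all -it --rm \
    -v "$PWD":/workspace \
    cphsieh/ruler:0.2.0 bash
# inside the container, nvidia-smi should now list the GPUs
```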
@hsiehjackson, yes, I forgot to enable the gpus argument. Now I can see the model loading onto the GPUs, but I get the following OOM error:
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 2.00 GiB. GPU 0 has a total capacity of 39.38 GiB of which 1.07 GiB is free. Process 340734 has 38.30 GiB memory in use. Of the allocated memory 34.95 GiB is allocated by PyTorch, and 2.86 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
Jamba 1.5 Mini is a 52B model; in BF16 the weights alone should occupy about 100 GB, plus some for the KV cache. I have 4xA100s (40G), so I am not sure why it runs out of memory.
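A back-of-the-envelope check of those numbers (52B parameters at 2 bytes each in BF16, against the 39.38 GiB per-GPU capacity the OOM message reports):

```python
params = 52e9
bytes_per_param = 2                      # BF16
weight_gib = params * bytes_per_param / 2**30
print(f"weights: {weight_gib:.0f} GiB")  # ~97 GiB before any KV cache

total_gib = 4 * 39.38                    # 4x A100-40GB, per the OOM message
print(f"capacity: {total_gib:.1f} GiB")
```

So the weights fit only if they are sharded evenly across all four GPUs, and whatever is left over still has to hold activations and the KV cache.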
Do you use HF or vLLM? If you are using vLLM, maybe you can first reduce max_position_embeddings in the config.json, since vLLM will reserve enough memory to run at that length.
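A sketch of that workaround, assuming a local model snapshot whose config.json you can edit (the path and the toy values here are illustrative, not the model's real ones):

```python
import json

# Toy config.json standing in for the one in your local model snapshot:
cfg_path = "config.json"
with open(cfg_path, "w") as f:
    json.dump({"max_position_embeddings": 262144}, f)

# Lower the advertised context length; vLLM sizes its memory budget from it.
with open(cfg_path) as f:
    cfg = json.load(f)
cfg["max_position_embeddings"] = 8192
with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)
print(cfg["max_position_embeddings"])  # 8192
```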
The scripts get stuck at: [nltk_data] Downloading package punkt to /root/nltk_data...
But it works fine when I evaluate llama3-8b-instruct. I am wondering whether there is any setting I need to configure for Jamba? I have already added the model in MODEL_SELECT in template.py.
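If the hang really is the punkt download (e.g. no network access inside the container), pre-fetching the tokenizer data once may help; this is the standard NLTK downloader invocation:

```shell
python -m nltk.downloader punkt
```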