huggingface / optimum-nvidia


FileNotFoundError: [Errno 2] No such file or directory: '/data/Dilip/models/llama-2-7b-chat-hf/build.json' #47

Open dilip467 opened 8 months ago

dilip467 commented 8 months ago

[Screenshots of the error attached: Screenshot (43), Screenshot (44), Screenshot (45)]

Quang-elec44 commented 8 months ago

@dilip467 I got the same error. It seems the engine is only built automatically when you pull the model from the HF Hub. For a local model, you have to build it yourself using examples/text-generation/llama.py. That worked with my model.
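In other words (a minimal sketch, assuming optimum-nvidia's AutoModelForCausalLM, the class that appears in the traceback later in this thread; model ids and paths are illustrative):

from optimum.nvidia import AutoModelForCausalLM

# Hub model id: optimum-nvidia builds (and caches) the TRT engine on first load.
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

# Local checkpoint directory: fails with the build.json FileNotFoundError
# unless an engine has already been built there, e.g. via:
#   python examples/text-generation/llama.py <checkpoint_dir> <engine_dir>
# model = AutoModelForCausalLM.from_pretrained("/data/Dilip/models/llama-2-7b-chat-hf")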

dilip467 commented 8 months ago

@Quang-elec44 I'm getting TensorRT-LLM Llama implementation: error: unrecognized arguments: output. I get this error about the output argument regardless of which path I pass.

Quang-elec44 commented 8 months ago

@dilip467 Here is what I did:

python llama.py TinyLlama/TinyLlama-1.1B-Chat-v0.3 tiny-llama-built --other-args

dilip467 commented 8 months ago

@Quang-elec44 I downloaded the llama-2-7b model and converted it to llama-2-7b-hf format.

python llama.py /data/asr/Dilip/models/llama-2-7b-hf/ /data/asr/Dilip/models/llama_tensorrt_llm

FileNotFoundError: Cannot find safetensors checkpoint for /data/asr/Dilip/models/llama-2-7b-hf/

Quang-elec44 commented 8 months ago

@dilip467 You should load the model with transformers and save it with the save_pretrained method, passing safe_serialization=True. After that you will get the *.safetensors checkpoint.
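Something like this (a minimal sketch; model_dir is the path from your error message, and this uses transformers' AutoModelForCausalLM, not optimum-nvidia's):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "/data/asr/Dilip/models/llama-2-7b-hf"

model = AutoModelForCausalLM.from_pretrained(model_dir)
tokenizer = AutoTokenizer.from_pretrained(model_dir)

# safe_serialization=True writes the weights as *.safetensors,
# which is the checkpoint format llama.py looks for.
model.save_pretrained(model_dir, safe_serialization=True)
tokenizer.save_pretrained(model_dir)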

liangxuZhang commented 8 months ago


@Quang-elec44 After building the TRT engine, my Llama model always generates the same token (55295). Have you encountered this problem? My build command is as follows:

python llama.py Chinese-Alpaca-2-7B/ llama/fp8 --max-batch-size 4 --max-prompt-length 256 --max-new-tokens 256 --fp8

Quang-elec44 commented 8 months ago

@liangxuZhang I'm not able to build the engine with --fp8, so I'm not sure whether that argument causes the problem or not. You can try building the engine without --fp8 and then test again.
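For example, the same build with the quantization flag dropped (the llama/fp16 output directory name is just illustrative):

python llama.py Chinese-Alpaca-2-7B/ llama/fp16 --max-batch-size 4 --max-prompt-length 256 --max-new-tokens 256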

alokkrsahu commented 5 months ago

Hi guys, I'm getting the same error. Please let me know your suggestions on how to resolve it.

model = AutoModelForCausalLM.from_pretrained(

File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/hub_mixin.py", line 157, in from_pretrained
    return cls._from_pretrained(
File "/opt/optimum-nvidia/src/optimum/nvidia/models/base.py", line 60, in _from_pretrained
    return model_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_validators.py", line 118, in _inner_fn
    return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/huggingface_hub/hub_mixin.py", line 157, in from_pretrained
    return cls._from_pretrained(
File "/opt/optimum-nvidia/src/optimum/nvidia/runtime.py", line 157, in _from_pretrained
    with open(engine_folder.joinpath(OPTIMUM_NVIDIA_CONFIG_FILE), "r") as trt_config_f:
FileNotFoundError: [Errno 2] No such file or directory: '/save/nvoptimum/model/build.json'