NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
https://nvidia.github.io/TensorRT-LLM
Apache License 2.0

convert_checkpoint.py: error: unrecognized arguments: --world_size #1110

Open mfournioux opened 7 months ago

mfournioux commented 7 months ago

I am converting Mixtral-8x7B with tensor parallelism using the conversion script from the llama folder:

python convert_checkpoint.py --model_dir ./Mixtral-8x7B-v0.1 \
                             --output_dir ./tllm_checkpoint_mixtral_2gpu \
                             --dtype float16 \
                             --world_size 2 \
                             --tp_size 2

An error appears about the --world_size argument:

convert_checkpoint.py: error: unrecognized arguments: --world_size 2

Should I remove this argument?

Many thanks for your help

byshiue commented 7 months ago

Yes, the --world_size argument has been removed from the script. Please remove it from your command.
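
For reference, a sketch of the corrected invocation with --world_size dropped, assuming the remaining flags from the original command stay unchanged:

python convert_checkpoint.py --model_dir ./Mixtral-8x7B-v0.1 \
                             --output_dir ./tllm_checkpoint_mixtral_2gpu \
                             --dtype float16 \
                             --tp_size 2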

HiddenPeak commented 6 months ago

After removing this argument, how do we configure world size?
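
A likely explanation, not confirmed in this thread: current versions of convert_checkpoint.py derive the world size from the parallelism flags (tensor-parallel size times pipeline-parallel size), so it no longer needs to be set explicitly. Under that assumption, a 2-GPU tensor-parallel flow would look roughly like the sketch below; the trtllm-build and example run.py flags follow the standard TensorRT-LLM examples, and all paths are illustrative:

# Build the 2-way tensor-parallel engine from the converted checkpoint.
trtllm-build --checkpoint_dir ./tllm_checkpoint_mixtral_2gpu \
             --output_dir ./trt_engines/mixtral/tp2 \
             --gemm_plugin float16

# Launch one MPI rank per GPU; world size = tp_size * pp_size = 2.
mpirun -n 2 python3 ../run.py --engine_dir ./trt_engines/mixtral/tp2 \
                              --tokenizer_dir ./Mixtral-8x7B-v0.1 \
                              --max_output_len 64 \
                              --input_text "Hello, my name is"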