huggingface-cli login error when building quantized model

xngli commented 4 weeks ago

Describe the bug

I'm trying to quantize a fine-tuned Phi-3 model using builder.py, and I'm running into the error below.

Judging by the error message, the cause is that the configuration_phi3.py file is missing from my fine-tuned model folder, so Hugging Face tries to download it. However, on lines 1615 and 2473 of builder.py, use_auth_token is set to True, which requires a local Hugging Face login token.

config = AutoConfig.from_pretrained(hf_name, use_auth_token=True, trust_remote_code=True, **extra_kwargs)
model = AutoModelForCausalLM.from_pretrained(self.model_name_or_path, cache_dir=self.cache_dir, use_auth_token=True, trust_remote_code=True, **extra_kwargs)

After setting use_auth_token=False I was able to avoid this error.

To Reproduce

Steps to reproduce the behavior:

1. Download builder.py: curl https://raw.githubusercontent.com/microsoft/onnxruntime-genai/main/src/python/py/models/builder.py -o builder.py
2. Run the quantization: python builder.py -i <fine-tuned-model-folder> -o quantized_model -p int4 -e cpu

Expected behavior

No huggingface_hub token error should be thrown when I'm using a public model like Phi-3. Can we change the default to use_auth_token=False?

Environment

OS: macOS
onnx==1.16.2
onnxruntime==1.18.1

yufenglee commented 3 weeks ago

Thanks! We can optimize the option a little bit.

jambayk commented 2 weeks ago

Also to note:

- use_auth_token has already been deprecated in favor of token.
- I think it would also be helpful if trust_remote_code were configurable too. It is not required for models that are already supported by the transformers package.
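
A minimal sketch of how both points could be combined, assuming a hypothetical helper named build_hf_kwargs (it is not part of builder.py). It only assembles the keyword arguments that would be passed to from_pretrained; token and trust_remote_code are the parameter names accepted by recent transformers releases, and the False default for trust_remote_code is an assumption made for illustration.

```python
from typing import Any, Dict, Optional, Union


def build_hf_kwargs(
    token: Optional[Union[bool, str]] = None,
    trust_remote_code: bool = False,
    **extra_kwargs: Any,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a from_pretrained call (hypothetical helper)."""
    kwargs: Dict[str, Any] = dict(extra_kwargs)
    # Use `token` instead of the deprecated `use_auth_token`.
    # When it is left unset, the key is omitted and no login is required.
    if token is not None:
        kwargs["token"] = token
    # Only opt in to remote code when the caller asks for it.
    kwargs["trust_remote_code"] = trust_remote_code
    return kwargs


if __name__ == "__main__":
    print(build_hf_kwargs())
    print(build_hf_kwargs(token=True, trust_remote_code=True, cache_dir="./cache"))
```

The resulting dictionary could then be unpacked into the existing calls, for example AutoConfig.from_pretrained(hf_name, **kwargs).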

nmoeller commented 2 days ago

Hey :)

Just want to say that I am also very interested in this. I am currently building an AML pipeline that trains a model and then automatically converts it to ONNX. After the conversion, the model is also evaluated in the pipeline.

I am running into the same issue here, so I guess I cannot automate the conversion at the moment.

Would it make sense to make a PR here? I think we just have to check whether a local path is used: if it is, we set use_auth_token=False; otherwise we set use_auth_token=True.
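
A rough sketch of the local-path check proposed above, assuming a hypothetical helper named needs_auth_token (not an existing function in builder.py). It treats any existing directory as a local model and everything else as a Hub model ID.

```python
import os


def needs_auth_token(model_name_or_path: str) -> bool:
    """Return False for an existing local directory, True otherwise (hypothetical helper)."""
    # A local fine-tuned model folder should not force a Hugging Face login.
    return not os.path.isdir(model_name_or_path)


if __name__ == "__main__":
    # The current working directory always exists, so no token is requested.
    print(needs_auth_token(os.getcwd()))
    # A string that is not a local directory is treated as a Hub model ID.
    print(needs_auth_token("no-such-org/no-such-model-123"))
```

As the original report shows, a local folder can still trigger a download when a file such as configuration_phi3.py is missing. That download works without a token for public models, but a private or gated repository would presumably still need one, which is why an explicit override is also useful.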

yufenglee commented 1 day ago

@nmoeller, yes, you're very welcome to make a PR. In addition, it would be great to expose use_auth_token or use_token as an option to support the download case.
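
One possible shape for such an option, as a sketch only: the helper name parse_token_option and the accepted strings are assumptions for illustration, not an existing builder.py interface. It maps a user-supplied string to a value suitable for the token argument of from_pretrained.

```python
from typing import Optional, Union


def parse_token_option(value: Optional[str]) -> Union[bool, str, None]:
    """Map an option string to a value for `token` (hypothetical helper).

    "true" / "1"   -> True  (use the token saved by `huggingface-cli login`)
    "false" / "0"  -> False (never send a token)
    None or ""     -> None  (leave the library default)
    anything else  -> the string itself, treated as the token
    """
    if value is None or value.strip() == "":
        return None
    stripped = value.strip()
    lowered = stripped.lower()
    if lowered in {"true", "1"}:
        return True
    if lowered in {"false", "0"}:
        return False
    return stripped


if __name__ == "__main__":
    for raw in (None, "true", "0", "hf_example_token"):
        print(repr(raw), "->", repr(parse_token_option(raw)))
```

With this shape, automated pipelines could pass "false" for local or public models, while the download case for private models could pass "true" or the token itself.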

microsoft / onnxruntime-genai

huggingface-cli login error when building quantized model #830