Closed louis030195 closed 4 months ago
Try with --auto-convert false
This error happens when trying to convert to safetensors, but that shouldn't be required for non-core models.
This model seems to share its gate_proj, but the modeling code doesn't reflect that: https://huggingface.co/baichuan-inc/baichuan-7B/blob/main/modeling_baichuan.py Not sure if it's intentional.
thanks @Narsil
Tried --auto-convert but it's not in the args?
```
error: unexpected argument '--auto-convert' found

Usage: text-generation-launcher <--model-id
|--revision |--sharded |--num-shard |--quantize |--trust-remote-code|--max-concurrent-requests |--max-best-of |--max-stop-sequences |--max-input-length |--max-total-tokens |--max-batch-size |--waiting-served-ratio |--max-batch-total-tokens |--max-waiting-tokens |--port |--shard-uds-path |--master-addr |--master-port |--huggingface-hub-cache |--weights-cache-override |--disable-custom-kernels|--json-output|--otlp-endpoint |--cors-allow-origin |--watermark-gamma |--watermark-delta |--env>
```
Yeah, facing the same issue: the model gets converted to safetensors and then it messes it up. Not really sure how to figure that out.
Same. When I add --auto-convert false it says the argument isn't found, but when I run without it, it tries to convert the model to safetensors and returns this error:
```
2023-07-03T19:27:56.259279Z  WARN download: text_generation_launcher: No safetensors weights found for model /data/falcon-7b-instruct at revision None. Converting PyTorch weights to safetensors.
Error: DownloadError
2023-07-03T19:29:55.441248Z ERROR text_generation_launcher: Download process was signaled to shutdown with signal 9:
```
Same here with the Falcon model.
I also got the error that --auto-convert is not an argument. Would love to be able to use text-generation-inference on models which can't be converted to safetensors.
Did anyone figure out how to use other architectures?
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
In your README you list the optimised architectures and say
Can you explain where we have to do this? I'm trying to run
baichuan-inc/baichuan-7B