lyogavin / airllm

AirLLM 70B inference with single 4GB GPU

#169: Fixed error when running on CPU and added post-install command to upgrade transformers #170

Closed: NavodPeiris closed this 4 weeks ago

NavodPeiris commented 1 month ago

Addressing issue #169.

Changelog:

  1. Read the README.md file in setup.py with an explicit UTF-8 encoding, since some systems default to a different text encoding (see sketch 1 below).
  2. Added a device-type check so that only CUDA-capable devices use torch.cuda.Stream() and pin_memory(). This resolves errors that occurred during inference on CPU-only devices (sketch 2).
  3. Added a post-install command that upgrades transformers to the latest version, so that the "ValueError: rope_scaling must be a dictionary with two fields" error does not occur (sketch 3).
  4. Added support for non-sharded models on the Hugging Face Hub, i.e. repos that ship a single weights file such as model.safetensors (sketch 4).
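
Sketch 1, a minimal illustration of the setup.py encoding fix (the setup() metadata shown is a placeholder, not AirLLM's exact fields):

```python
from pathlib import Path
from setuptools import setup

# Read the long description with an explicit encoding; without
# encoding="utf-8", the read falls back to the platform default
# (e.g. cp1252 on Windows) and can raise UnicodeDecodeError.
long_description = Path("README.md").read_text(encoding="utf-8")

setup(
    name="airllm",  # illustrative; other metadata omitted
    long_description=long_description,
    long_description_content_type="text/markdown",
)
```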
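Sketch 2, one way to gate the CUDA-only calls. The helper name and surrounding structure are assumptions for illustration; only the device check itself mirrors the fix:

```python
import torch

def move_to_device(tensor: torch.Tensor, device: str) -> torch.Tensor:
    """Illustrative helper: use CUDA-only features (streams, pinned
    host memory) only when the target is a CUDA device."""
    if device.startswith("cuda") and torch.cuda.is_available():
        # torch.cuda.Stream() and Tensor.pin_memory() require CUDA;
        # calling them on a CPU-only setup raises a runtime error.
        stream = torch.cuda.Stream()
        with torch.cuda.stream(stream):
            return tensor.pin_memory().to(device, non_blocking=True)
    # CPU path: a plain copy, no pinning and no stream.
    return tensor.to(device)
```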
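Sketch 3, the common setuptools pattern for running a step after installation by subclassing the install command. The class name and exact pip invocation are illustrative:

```python
import subprocess
import sys

from setuptools import setup
from setuptools.command.install import install


class PostInstallCommand(install):
    """Run the normal install, then upgrade transformers so recent
    rope_scaling configs can be parsed."""

    def run(self):
        install.run(self)
        subprocess.check_call(
            [sys.executable, "-m", "pip", "install", "--upgrade", "transformers"]
        )


setup(
    name="airllm",  # illustrative
    cmdclass={"install": PostInstallCommand},
)
```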
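Sketch 4, falling back to a single weights file when the repo is not sharded. The file names follow Hugging Face conventions (sharded repos ship a model.safetensors.index.json mapping each tensor to its shard); the helper itself and its return shape are assumptions, not the PR's actual code:

```python
import json
import os

def find_weight_files(model_dir: str) -> list[str]:
    """Return the safetensors file(s) for a local model directory."""
    index_path = os.path.join(model_dir, "model.safetensors.index.json")
    if os.path.exists(index_path):
        # Sharded model: collect the shard files listed in the index.
        with open(index_path, encoding="utf-8") as f:
            index = json.load(f)
        shards = sorted(set(index["weight_map"].values()))
        return [os.path.join(model_dir, s) for s in shards]
    # Non-sharded model: a single model.safetensors file.
    single = os.path.join(model_dir, "model.safetensors")
    if os.path.exists(single):
        return [single]
    raise FileNotFoundError(f"No safetensors weights found in {model_dir}")
```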