jina-ai / jerboa

LLM finetuning
Apache License 2.0

fix: set environment variable for cuda #104

Closed. sebastian-weisshaar closed this 1 year ago.

sebastian-weisshaar commented 1 year ago

With the new version of transformers we cannot move the model to a device when quantization is used (https://github.com/huggingface/transformers/blob/66954ea25e342fd451c26ec1c295da0b8692086b/src/transformers/modeling_utils.py#L1897). To solve this we had to pin a specific accelerate version instead of installing from the main branch on GitHub.
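As a sketch, the pin could be expressed as a requirements fragment like the one below. The version number is purely illustrative; the actual pinned version is whatever this PR's dependency diff specifies.

```
# Pin a released accelerate instead of the GitHub main branch.
# 0.20.3 is a hypothetical example version, not the one from this PR.
accelerate==0.20.3
```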

This PR includes: https://github.com/jina-ai/jerboa/pull/102.

THIS CODE DOES NOT RUN ON APPLE SILICON

sebastian-weisshaar commented 1 year ago

> Instead of pinning an older version of Transformers, is there a different/more proper way to handle this? They probably block this in the new version for a reason, so it seems like a bit of a code smell to keep using `to()` on quantized models. Do they not have any guidance on what to do?

We do not call `.to(device)` ourselves; somewhere deep in transformers it is called when loading the model. We tried to work around this with `device_map`, but that did not stop transformers from calling `.to(device)`.
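The PR title mentions setting an environment variable for CUDA, so the workaround presumably steers device placement through the environment rather than an explicit `.to(device)` call. A minimal sketch follows; `CUDA_VISIBLE_DEVICES` is a standard CUDA variable, but its use here is my assumption and the device index is illustrative, not confirmed by the thread.

```python
import os

# Restrict CUDA to a single device before any CUDA-using library is
# initialized. Libraries such as transformers/accelerate then load the
# model onto this device without an explicit model.to(device) call.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # assumption: GPU index 0
```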

We pin an older version of accelerate.