huggingface / safetensors

Simple, safe way to store and distribute tensors
https://huggingface.co/docs/safetensors
Apache License 2.0

pytorch: safetensors library hardcodes using CUDA if only device index is provided #499

Closed dvrogozh closed 2 weeks ago

dvrogozh commented 1 month ago

Related to:

The safetensors library hardcodes a CUDA device whenever only a device index is provided. This causes runtime errors when running Hugging Face models with pipeline(device_map="auto"), as noted in https://github.com/huggingface/transformers/issues/31941 (see that issue for repro steps). The hardcoding happens here: https://github.com/huggingface/safetensors/blob/079781fd0dc455ba0fe851e2b4507c33d0c0d407/bindings/python/src/lib.rs#L296-L297

A possible solution is to return the device that torch.device(N) returns. Note, however, that this will work for non-CUDA devices only after the following PyTorch change is merged:

CC: @faaany @muellerzr @SunMarc @guangyey
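
To illustrate the idea of resolving a bare index through the framework instead of assuming CUDA, here is a minimal sketch in plain Python. It is not the safetensors implementation (the real code lives in the Rust bindings linked above). The function names, the string-based device representation, and the "xpu" backend used as a stand-in for a non-CUDA accelerator are all assumptions made for illustration.

```python
from typing import Callable, Union

DeviceSpec = Union[int, str]


def resolve_hardcoded(device: DeviceSpec) -> str:
    """Mimic the reported behavior: a bare index is always treated as CUDA."""
    if isinstance(device, int):
        return f"cuda:{device}"
    return device


def resolve_with_backend(
    device: DeviceSpec, index_to_device: Callable[[int], object]
) -> str:
    """Proposed behavior: let a resolver decide what a bare index means.

    `index_to_device` is a hypothetical hook; with PyTorch installed,
    `torch.device` could be passed here.
    """
    if isinstance(device, int):
        return str(index_to_device(device))
    return device


def fake_xpu_resolver(index: int) -> str:
    # Hypothetical stand-in for torch.device(N) on a machine whose
    # accelerator is not CUDA; "xpu" is only an example backend name.
    return f"xpu:{index}"


if __name__ == "__main__":
    print(resolve_hardcoded(0))                        # always a CUDA device
    print(resolve_with_backend(0, fake_xpu_resolver))  # backend decides
    print(resolve_with_backend("cpu", fake_xpu_resolver))
```

With PyTorch available, torch.device can be passed as index_to_device. What it returns for a bare index depends on the PyTorch version and on which accelerator is present, so no output is claimed for that case.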

dvrogozh commented 1 month ago

FYI, https://github.com/pytorch/pytorch/pull/129119 got merged, so the solution I outlined should now be possible.

dvrogozh commented 1 month ago

I have implemented a fix for this issue as I see it. Please help review https://github.com/huggingface/safetensors/pull/500.

Narsil commented 2 weeks ago

Closed by #509