huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Can Transformers AutoModel Support to(dml) #30109

Open rheros opened 6 months ago

rheros commented 6 months ago

Feature request

Windows, RX 6800 XT. ROCm is not available in PyTorch on Windows, so I would like to use DirectML for GPU acceleration. Can `AutoModel` support `.to(dml)` with a DirectML device?

Motivation

It would be useful for me to run GLM models on my hardware.

Your contribution

Stable Diffusion already has builds that support DirectML.

amyeroberts commented 6 months ago

Hi @rheros,

To make sure I've understood - is the request to enable using DirectML for acceleration when loading any transformers models with AutoXXX?

cc'ing @muellerzr and @pacman100, as this seems possibly more aligned with accelerate

rheros commented 6 months ago

Hi @amyeroberts

My current situation: I want to deploy and run GLM3 on the Windows platform for inference only, no training. My graphics card is AMD, and PyTorch on Windows currently does not support ROCm, so the model can only run on the CPU, which is very slow. I found Microsoft's DirectML, which enables GPU-accelerated computation on Windows. I previously downloaded a DirectML version of Stable Diffusion, and it runs much faster with DirectML than on the CPU.

I recently downloaded GLM and planned to deploy and run it locally. In its repository's BasicDemo, `web_demo_streamlit.py` has a `get_model` method:

```python
def get_model():
    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_PATH, trust_remote_code=True)
    model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True).eval()
    return tokenizer, model
```

So I tried moving the model to a DirectML device:

```python
import torch_directml

dml = torch_directml.device()
model = AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True).to(dml).eval()
```

Execution reports:

```
expected key in DispatchKeySet(CPU, CUDA, HIP, XLA, MPS, IPU, XPU, HPU, Lazy, Meta) but got: PrivateUse1
```

I have contacted GLM's maintainers and they said they have no plans to support DirectML. Recently I have been studying Transformers and saw this link in the documentation, so I came here to ask. Thank you very much for your kind help~
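For what it's worth, a common workaround pattern (just a sketch, not an officially supported API; `torch_directml.device()` comes from Microsoft's `torch-directml` package, and whether `.to()` succeeds still depends on the model's ops surviving the PrivateUse1 dispatch key) is to probe for DirectML and fall back to the CPU:

```python
def pick_device():
    """Return a DirectML device if torch-directml is installed, else "cpu".

    torch-directml registers the GPU as a PrivateUse1 backend, which is why
    some dispatcher code paths (as in the error above) can reject it.
    """
    try:
        import torch_directml  # Microsoft's DirectML backend for PyTorch
        return torch_directml.device()
    except ImportError:
        return "cpu"
```

Usage would then look like `AutoModel.from_pretrained(MODEL_PATH, trust_remote_code=True).to(pick_device()).eval()`, so the same script degrades gracefully on machines without DirectML.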