Closed · keyboardAnt closed this issue 2 weeks ago
I was able to reproduce the issue even without tokenizing:
import torch
from transformers import AutoModelForCausalLM
from multiprocess import Process, Queue
import os

os.environ["TOKENIZERS_PARALLELISM"] = "false"

model = AutoModelForCausalLM.from_pretrained("gpt2")
tok_ids = torch.tensor([[15205, 541, 305, 919, 278, 351, 12905, 2667, 15399, 714, 307, 281, 220]])

def fwd(model, tok_ids, queue):
    print("Starting process")
    print(f"{os.environ['TOKENIZERS_PARALLELISM']=}")
    print(f"{type(model)=}")
    print(f"{tok_ids=}")
    outs = None  # avoid a NameError below if the forward pass raises
    try:
        outs = model(tok_ids)
    except Exception as e:
        print(f"Error: {e}")
    print(f"{outs=}")
    queue.put(outs)

queue = Queue()
pr = Process(target=fwd, args=(model, tok_ids, queue))
pr.start()
pr.join()
outs = queue.get()
print(outs)
Hi @keyboardAnt 👋 Thank you for opening this issue 🤗
This is a torch-level issue, nothing we can do :) See https://pytorch.org/docs/master/notes/multiprocessing.html
(P.S.: in case you haven't considered it, have a look at input batching. If you don't know what batching is, check our course)
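For illustration, batching looks roughly like this (a minimal sketch; the prompts are arbitrary, and reusing the EOS token for padding is a common convention rather than an official recommendation):

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# gpt2 ships without a pad token, so reuse the EOS token for padding
tokenizer.pad_token = tokenizer.eos_token

# One forward pass over all prompts at once, instead of one call per prompt
prompts = ["Multiprocessing with Hugging Face", "Batching inputs instead"]
batch = tokenizer(prompts, return_tensors="pt", padding=True)
outs = model(**batch)
print(outs.logits.shape)  # (batch_size, padded_seq_len, vocab_size)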
Thanks for the prompt reply @gante!
Are there any official examples of using transformers with torch.multiprocessing? I'm working on something for which batching isn't beneficial, and simply substituting multiprocessing with torch.multiprocessing didn't resolve the issue:
import torch
from transformers import AutoModelForCausalLM
from torch.multiprocessing import Process, Queue
import os

os.environ["TOKENIZERS_PARALLELISM"] = "false"

model = AutoModelForCausalLM.from_pretrained("gpt2")
tok_ids = torch.tensor([[15205, 541, 305, 919, 278, 351, 12905, 2667, 15399, 714, 307, 281, 220]])

def fwd(model, tok_ids, queue):
    print("Starting process")
    print(f"{os.environ['TOKENIZERS_PARALLELISM']=}")
    print(f"{type(model)=}")
    print(f"{tok_ids=}")
    outs = None  # avoid a NameError below if the forward pass raises
    try:
        outs = model(tok_ids)
    except Exception as e:
        print(f"Error: {e}")
    print(f"{outs=}")
    queue.put(outs)

queue = Queue()
pr = Process(target=fwd, args=(model, tok_ids, queue))
pr.start()
pr.join()
outs = queue.get()
print(outs)
Are there any official examples of using transformers with torch.multiprocessing?
Not that I know of :(
Hi @gante,
Thank you for your suggestions and guidance. To clarify, our project requires the ability to preempt (terminate) a forward pass of a transformers model during execution, to free up GPU resources when needed. We considered multiprocessing because running the model in a separate process seemed to offer a straightforward way to kill the process if necessary, thus terminating the model's execution and freeing the GPU.
However, as observed, the transformers model processes tend to get stuck when using multiprocessing. If there are alternative approaches to achieve this preemptive functionality without relying on multiprocessing, we would be very keen to explore them. Could you please provide any insights or guidance on how we might implement such functionality within the Transformers library or with PyTorch?
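For concreteness, this is roughly the pattern we have in mind (an untested sketch; the spawn start method, the timeout value, and loading the model inside the worker are our assumptions, not an established recipe):

import torch
import torch.multiprocessing as mp
from transformers import AutoModelForCausalLM

def fwd(queue):
    # Assumption: loading the model inside the worker, rather than passing it
    # across the process boundary, sidesteps shared-state pitfalls
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    tok_ids = torch.tensor([[15205, 541, 305, 919, 278, 351]])
    outs = model(tok_ids)
    # Send plain Python data to keep the inter-process transfer simple
    queue.put(outs.logits.argmax(-1).tolist())

if __name__ == "__main__":
    mp.set_start_method("spawn")  # generally safer than fork with torch
    queue = mp.Queue()
    pr = mp.Process(target=fwd, args=(queue,))
    pr.start()
    try:
        result = queue.get(timeout=60)  # drain the queue before joining
        print(result)
        pr.join()
    except Exception:
        pr.terminate()  # preempt: kill the worker and free its resources
        pr.join()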
Your expertise and advice would be greatly appreciated as we navigate this challenge.
Thank you!
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@keyboardAnt I have no good alternative suggestions, unfortunately :(
Running model forwards within a process seems to get stuck. I tried setting TOKENIZERS_PARALLELISM to both true and false, but unfortunately neither helped 🥲

System Info
transformers-cli env:

Who can help?
@ArthurZucker @gante
Reproduction
Minimal example:
prints
Expected behavior
Shouldn't get stuck.