BlackSamorez / tensor_parallel

Automatically split your PyTorch models on multiple GPUs for training & inference
MIT License

model.generate() with inputs_embeds #112

Closed ZhaoxuanWu closed 1 year ago

ZhaoxuanWu commented 1 year ago

Hi! A very easy-to-use library.

When I call model.generate(inputs_embeds=...) with inputs_embeds instead of input_ids, the feature does not appear to be implemented:

*** ValueError: You passed `inputs_embeds` to `.generate()`, but the model class TensorParallelPreTrainedModel doesn't have its forwarding implemented. See the GPT2 implementation for an example (https://github.com/huggingface/transformers/pull/21405), and feel free to open a PR with it!

Can we have this feature? Thank you!

BlackSamorez commented 1 year ago

It looks like transformers relies on forward-pass signature inspection to determine whether it's okay to use inputs_embeds. Since TensorParallelPreTrainedModel's forward has the signature (self, *args, **kwargs), the check produces a false negative. I'll try to think of a workaround.

has_inputs_embeds_forwarding = "inputs_embeds" in set(
    inspect.signature(self.prepare_inputs_for_generation).parameters.keys()
)
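The false negative can be reproduced in isolation. The sketch below (class names are illustrative, not the library's) shows how a generic (*args, **kwargs) pass-through hides the inputs_embeds parameter from the signature check quoted above:

```python
import inspect

class PlainModel:
    # Explicit signature: inputs_embeds is visible to inspection.
    def prepare_inputs_for_generation(self, input_ids, inputs_embeds=None, **kwargs):
        pass

class WrapperModel:
    # Generic pass-through signature, analogous to TensorParallelPreTrainedModel's forward.
    def prepare_inputs_for_generation(self, *args, **kwargs):
        pass

def has_inputs_embeds_forwarding(model):
    # Mirrors the transformers check quoted above.
    return "inputs_embeds" in set(
        inspect.signature(model.prepare_inputs_for_generation).parameters.keys()
    )

print(has_inputs_embeds_forwarding(PlainModel()))    # True
print(has_inputs_embeds_forwarding(WrapperModel()))  # False: *args/**kwargs hide the parameter
```

This is why a wrapper that forwards everything through *args/**kwargs fails the check even though the underlying model does accept inputs_embeds.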
BlackSamorez commented 1 year ago

Working on it in #113

ZhaoxuanWu commented 1 year ago

It works like a charm. Many thanks!