bigscience-workshop / petals

🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
https://petals.dev
MIT License

The prompt tuning example (prompt-tuning-sst2) doesn't work #241

Open Gad1001 opened 1 year ago

Gad1001 commented 1 year ago

Hi, I tried to run the notebook example you published (without editing) in Colab, but it doesn't work. I get the following error:

RuntimeError                              Traceback (most recent call last)
[<ipython-input-12-7f7d7fa267e9>](https://localhost:8080/#) in <module>
     17 
     18         model.train()
---> 19         outputs = model(**batch)
     20         loss = outputs.loss
     21         loss.backward()

3 frames
[/usr/local/lib/python3.8/dist-packages/torch/nn/modules/linear.py](https://localhost:8080/#) in forward(self, input)
    112 
    113     def forward(self, input: Tensor) -> Tensor:
--> 114         return F.linear(input, self.weight, self.bias)
    115 
    116     def extra_repr(self) -> str:

RuntimeError: expected scalar type Half but found Float

This is probably caused by an outdated transformers version, but petals 1.1.1 requires transformers==4.25.1.
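
For context, the message means that F.linear received an input and a weight stored in different floating-point types (float16, "Half", versus float32, "Float"). A minimal sketch for checking which dtypes a model's parameters use: parameter_dtypes is a hypothetical helper, and the toy nn.Sequential only stands in for the notebook's model.

```python
import torch
from torch import nn


def parameter_dtypes(module: nn.Module) -> dict:
    """Count how many parameter elements are stored in each dtype.

    Hypothetical helper for illustration; it is not part of Petals or PyTorch.
    """
    counts = {}
    for param in module.parameters():
        counts[param.dtype] = counts.get(param.dtype, 0) + param.numel()
    return counts


# Toy stand-in for a model whose layers are only partly stored in float16.
model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 2))
model[0].half()  # converts the first layer's parameters to float16 in place

print(parameter_dtypes(model))  # {torch.float16: 72, torch.float32: 18}
```

If more than one dtype shows up, an input that matches one layer will not match another, which is the situation the RuntimeError describes.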

justheuristic commented 1 year ago

@artek0chumak can you please check?

justheuristic commented 1 year ago

While @artek0chumak is on his way, you might want a more basic example here: https://github.com/bigscience-workshop/petals/issues/233

Just paste the code in the basic notebook

artek0chumak commented 1 year ago

Hi!

This issue was fixed in the repo, but the fix is not yet in the "pip version".

Please install Petals via: pip install https://github.com/bigscience-workshop/petals/archive/main.zip

... and the issue should be resolved. If not, please tell us.

borzunov commented 1 year ago

I have just shipped the new PyPI release (v1.1.2); the notebooks should now work as is.
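
To confirm which releases are actually installed in the Colab runtime before rerunning the notebook, something like the following sketch can help. It only uses the standard library; installed_version is a hypothetical helper name, and the package list is an assumption based on the versions mentioned in this thread.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is missing.

    Hypothetical helper for illustration only.
    """
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Packages discussed in this thread; adjust the list as needed.
for name in ("petals", "transformers", "torch"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

After upgrading a package in Colab, restarting the runtime ensures the new version is the one that gets imported.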

Gad1001 commented 1 year ago

Hi, the issue has been solved, but now another error appears:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
[<ipython-input-13-c8610e78a6af>](https://localhost:8080/#) in <module>
      8 )
      9 cls_optimizer = AdamW(cls_model.parameters(), lr=LR, weight_decay=WEIGHT_DECAY)
---> 10 cls_criterion = nn.CrossEntoryCriterion()
     11 
     12 lr_scheduler = get_scheduler(

AttributeError: module 'torch.nn' has no attribute 'CrossEntoryCriterion'

justheuristic commented 1 year ago

@borzunov notes that it should be CrossEntropyCriterion - it's a typo.

Looks like this notebook hasn't been updated for a while. I will run it start to end and check that everything works.

Gad1001 commented 1 year ago

I see. I would appreciate it if you could post an update here once the notebook is fixed.

BTW, I thought so too, but nn.CrossEntropyCriterion doesn't exist either.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
[<ipython-input-13-e888bbe81449>](https://localhost:8080/#) in <module>
      8 )
      9 cls_optimizer = AdamW(cls_model.parameters(), lr=LR, weight_decay=WEIGHT_DECAY)
---> 10 cls_criterion = nn.CrossEntropyCriterion()
     11 
     12 lr_scheduler = get_scheduler(

AttributeError: module 'torch.nn' has no attribute 'CrossEntropyCriterion'

Gad1001 commented 1 year ago

BTW, I guess you meant nn.functional.cross_entropy. You should also add cls_model.to(DEVICE) and replace model.transformers.word_embeddings with model.transformer.word_embeddings (with one s).
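
For reference, PyTorch exposes cross-entropy both as a module, nn.CrossEntropyLoss, and as a function, nn.functional.cross_entropy. The sketch below shows that the two agree on the same inputs; the random logits and labels are made-up stand-ins for the classifier outputs in the notebook, and whether the notebook ended up using the module or the functional form is not stated in this thread.

```python
import torch
import torch.nn.functional as F
from torch import nn

torch.manual_seed(0)

# Made-up data: a batch of 4 examples with 2 classes (SST-2 is binary).
logits = torch.randn(4, 2)
labels = torch.tensor([0, 1, 1, 0])  # integer class indices

# Module form: create the criterion once, then call it like a function.
cls_criterion = nn.CrossEntropyLoss()
loss_module = cls_criterion(logits, labels)

# Functional form: a single call with the same arguments.
loss_functional = F.cross_entropy(logits, labels)

# Both compute the same mean negative log-likelihood.
assert torch.allclose(loss_module, loss_functional)
```

The module form is a drop-in replacement for the cls_criterion line in the traceback above, since the training loop already calls cls_criterion(outputs, batch["labels"]).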

justheuristic commented 1 year ago

Hi! We recently updated the notebook to fix both issues you reported in #247. Thanks a lot for pointing them out!

borzunov commented 1 year ago

...and one more fix was merged with #248.

Gad1001 commented 1 year ago

Hi, I still get the following error.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
[<ipython-input-9-babcb7d15e4d>](https://localhost:8080/#) in <module>
     17 
     18         model.train()
---> 19         outputs = model(**batch)
     20         loss = outputs.loss
     21         loss.backward()

3 frames
[/usr/local/lib/python3.8/dist-packages/torch/nn/modules/linear.py](https://localhost:8080/#) in forward(self, input)
    112 
    113     def forward(self, input: Tensor) -> Tensor:
--> 114         return F.linear(input, self.weight, self.bias)
    115 
    116     def extra_repr(self) -> str:

RuntimeError: expected scalar type Half but found Float

If I upgrade transformers to version 4.26.0 (petals 1.1.2 requires transformers==4.25.1), I can run the first half of the Colab notebook. But in the second half of the notebook, called "Beyond soft-prompt tuning", I get the following error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
[<ipython-input-15-d602ecd6510b>](https://localhost:8080/#) in <module>
     21         with torch.no_grad():
     22             embeddings_output = cls_model.word_embeddings(batch["input_ids"])
---> 23         outputs = cls_model(embeddings_output)
     24         loss = cls_criterion(outputs, batch["labels"])
     25         loss.backward()

5 frames
[/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in _call_impl(self, *input, **kwargs)
   1192         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194             return forward_call(*input, **kwargs)
   1195         # Do not call functions when jit is used
   1196         full_backward_hooks, non_full_backward_hooks = [], []

[<ipython-input-11-6f6621435734>](https://localhost:8080/#) in forward(self, embeddings)
     33 
     34     hidden_states = before_layers(embeddings)
---> 35     hidden_states = self.adapter(hidden_states)
     36     hidden_states = after_layers(hidden_states)
     37     pooled_states = torch.mean(hidden_states, dim=1)

[/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in _call_impl(self, *input, **kwargs)
   1192         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194             return forward_call(*input, **kwargs)
   1195         # Do not call functions when jit is used
   1196         full_backward_hooks, non_full_backward_hooks = [], []

[/usr/local/lib/python3.8/dist-packages/torch/nn/modules/container.py](https://localhost:8080/#) in forward(self, input)
    202     def forward(self, input):
    203         for module in self:
--> 204             input = module(input)
    205         return input
    206 

[/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py](https://localhost:8080/#) in _call_impl(self, *input, **kwargs)
   1192         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1193                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1194             return forward_call(*input, **kwargs)
   1195         # Do not call functions when jit is used
   1196         full_backward_hooks, non_full_backward_hooks = [], []

[/usr/local/lib/python3.8/dist-packages/torch/nn/modules/linear.py](https://localhost:8080/#) in forward(self, input)
    112 
    113     def forward(self, input: Tensor) -> Tensor:
--> 114         return F.linear(input, self.weight, self.bias)
    115 
    116     def extra_repr(self) -> str:

RuntimeError: mat1 and mat2 must have the same dtype
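
The last frames show the failure inside the adapter's nn.Linear: the incoming hidden states and the adapter weights have different dtypes. One possible workaround, which is not necessarily the fix applied to the notebook, is to cast the hidden states to the adapter's dtype and cast the result back. CastingAdapter and its sizes are hypothetical; only the self.adapter name follows the traceback.

```python
import torch
from torch import nn


class CastingAdapter(nn.Module):
    """Adapter that accepts hidden states in any float dtype.

    Hypothetical sketch: the layer sizes and the class itself are assumptions.
    """

    def __init__(self, hidden_size: int, bottleneck: int):
        super().__init__()
        self.adapter = nn.Sequential(
            nn.Linear(hidden_size, bottleneck),
            nn.ReLU(),
            nn.Linear(bottleneck, hidden_size),
        )

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Run the adapter in the dtype of its own weights (float32 by default)...
        adapter_dtype = self.adapter[0].weight.dtype
        output = self.adapter(hidden_states.to(adapter_dtype))
        # ...then return the dtype that the following layers expect.
        return output.to(hidden_states.dtype)


# Hidden states in float16, as a half-precision model would produce them.
hidden_states = torch.randn(2, 5, 16).to(torch.float16)
adapter = CastingAdapter(hidden_size=16, bottleneck=4)
output = adapter(hidden_states)
print(output.dtype, output.shape)
```

Keeping the trainable adapter in float32 and casting only at its boundary is a design choice made for this sketch; converting the adapter itself with adapter.to(hidden_states.dtype) is the other obvious option.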

vrosca commented 1 year ago

Any updates on this? I'm running into the exact same problem.

Gad1001 commented 1 year ago

Any updates on this?

mryab commented 1 year ago

Hi @Gad1001 and @vrosca, we've just committed a fix for the SST-2 prompt tuning notebook in https://github.com/bigscience-workshop/petals/pull/343. Can you try rerunning the updated notebook from the main branch?