huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Cannot save TFDebertaV2ForSequenceClassification as SavedModel via saved_model #16484

Closed: maziyarpanahi closed this issue 2 years ago

maziyarpanahi commented 2 years ago

Environment info

Who can help

@LysandreJik

Models:

Information

Model I am using (Bert, XLNet ...): kamalkraj/deberta-v2-xlarge

The problem arises when using:

The tasks I am working on are:

To reproduce

Steps to reproduce the behavior:

  1. Load model via TFDebertaV2ForSequenceClassification
  2. Use saved_model=True to save as TensorFlow SavedModel

Reference: https://huggingface.co/docs/transformers/model_doc/deberta-v2#transformers.TFDebertaV2ForSequenceClassification


from transformers import DebertaV2Tokenizer, TFDebertaV2ForSequenceClassification
import tensorflow as tf

tokenizer = DebertaV2Tokenizer.from_pretrained("kamalkraj/deberta-v2-xlarge")
model = TFDebertaV2ForSequenceClassification.from_pretrained("kamalkraj/deberta-v2-xlarge")

# A forward pass in eager mode works fine
inputs = tokenizer("Hello, my dog is cute", return_tensors="tf")
inputs["labels"] = tf.reshape(tf.constant(1), (-1, 1))  # Batch size 1

outputs = model(inputs)
loss = outputs.loss
logits = outputs.logits

# Exporting a TensorFlow SavedModel via saved_model=True raises the error below
model.save_pretrained("kamalkraj/deberta-v2-xlarge", saved_model=True)
---------------------------------------------------------------------------
OperatorNotAllowedInGraphError            Traceback (most recent call last)
[<ipython-input-4-7b1af514d387>](https://localhost:8080/#) in <module>()
----> 1 model.save_pretrained("kamalkraj/deberta-v2-xlarge", saved_model=True)

3 frames
[/usr/local/lib/python3.7/dist-packages/transformers/modeling_tf_utils.py](https://localhost:8080/#) in save_pretrained(self, save_directory, saved_model, version, push_to_hub, **kwargs)
   1375         if saved_model:
   1376             saved_model_dir = os.path.join(save_directory, "saved_model", str(version))
-> 1377             self.save(saved_model_dir, include_optimizer=False, signatures=self.serving)
   1378             logger.info(f"Saved model created in {saved_model_dir}")
   1379 

[/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py](https://localhost:8080/#) in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

[/usr/lib/python3.7/contextlib.py](https://localhost:8080/#) in __exit__(self, type, value, traceback)
    117         if type is None:
    118             try:
--> 119                 next(self.gen)
    120             except StopIteration:
    121                 return False

[/usr/local/lib/python3.7/dist-packages/transformers/models/deberta_v2/modeling_tf_deberta_v2.py](https://localhost:8080/#) in call(self, inputs, training)
    141 
    142     def call(self, inputs: tf.Tensor, training: tf.Tensor = False):
--> 143         if training and self.drop_prob > 0:
    144             return TFDebertaV2XDropout(inputs, self.drop_prob)
    145         return inputs

OperatorNotAllowedInGraphError: using a `tf.Tensor` as a Python `bool` is not allowed: AutoGraph did convert this function. This might indicate you are trying to use an unsupported feature.
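
For context, the exception is TensorFlow's generic graph-mode restriction on evaluating a symbolic tensor as a Python bool. The exact code path inside the library is different, but a minimal standalone sketch of the same error class (plain TensorFlow, nothing DeBERTa-specific) looks like this:

import tensorflow as tf

# With AutoGraph out of the picture, the Python `if` calls bool() on a
# symbolic tensor during tracing, which raises the same exception class
# as above.
@tf.function(autograph=False)
def truthiness_in_graph_mode(flag):
    if flag:
        return tf.constant(1)
    return tf.constant(0)

truthiness_in_graph_mode(tf.constant(True))  # raises OperatorNotAllowedInGraphError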

Expected behavior

TFDebertaV2ForSequenceClassification models should export to a TensorFlow SavedModel in the same way TFDebertaV2Model models do.

Rocketknight1 commented 2 years ago

I've reproduced this issue - I'll discuss with the team what we can do to support SavedModel saving more broadly.

Rocketknight1 commented 2 years ago

Hi @maziyarpanahi ! I've talked this over with the team: although we do support saving as a SavedModel, it doesn't work for all models, and we're not sure how feasible it will be to update all of them in the near future.

Can we ask what your use case for SavedModel is, compared to just save_pretrained or save_weights? There may be another approach.

maziyarpanahi commented 2 years ago

Hi @Rocketknight1

The use case is to serve the fine-tuned (or already uploaded) model in TensorFlow. The SavedModel format is the only way to avoid going from PyTorch to onnx-tf and then to TensorFlow.
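
To make that concrete, this is roughly the consumer-side code I have in mind once the export works (a sketch only; the signature and tensor names are assumptions that would need to be confirmed against the actual export):

import tensorflow as tf

# save_pretrained writes the graph under <save_directory>/saved_model/<version>
# (version 1 by default), with the model's serving function exported as the
# default signature.
loaded = tf.saved_model.load("kamalkraj/deberta-v2-xlarge/saved_model/1")
serving_fn = loaded.signatures["serving_default"]

# Print the traced input/output structure to confirm the exact tensor names
# before pointing a serving stack (e.g. TF Serving) at this directory.
print(serving_fn.structured_input_signature)
print(serving_fn.structured_outputs)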

There are some architectures that don't have any TF support, which I understand; in those cases I normally either wait or go through ONNX to TF. However, DebertaV2 already supports saved_model for the fill-mask and ForTokenClassification heads, so I thought this could be a bug, given that it only fails for DebertaV2ForSequenceClassification.

Rocketknight1 commented 2 years ago

After some investigation, the cause is the dropout layer being used. The TokenClassification model uses standard Keras Dropout, while the SequenceClassification model uses StableDropout. This difference is present in the original PyTorch models too, although I'm not sure why.

I don't think this is a bug with an easy fix, unfortunately - I'm not the model author, so I don't want to change the dropout type myself. However, you could probably make a local fork of transformers and swap StableDropout for Keras Dropout, which would let you save the model as a SavedModel. I'll talk to the other team members and see what they think!
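
For anyone who wants to experiment without maintaining a full fork, one untested variant of that swap is to rebind the layer class in the DeBERTa-v2 modeling module before the model is built (this assumes the class is exposed as TFDebertaV2StableDropout in that module, and note it replaces StableDropout everywhere in the architecture, not only in the classification head):

import tensorflow as tf
from transformers.models.deberta_v2 import modeling_tf_deberta_v2

# Rebind StableDropout to the standard Keras Dropout before the model is built.
# StableDropout holds no weights, so from_pretrained should still load the
# checkpoint cleanly.
modeling_tf_deberta_v2.TFDebertaV2StableDropout = tf.keras.layers.Dropout

from transformers import TFDebertaV2ForSequenceClassification

model = TFDebertaV2ForSequenceClassification.from_pretrained("kamalkraj/deberta-v2-xlarge")
model.save_pretrained("kamalkraj/deberta-v2-xlarge", saved_model=True)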

maziyarpanahi commented 2 years ago

Thanks @Rocketknight1

This is a great help! I will make that change and try fine-tuning a base model on IMDB to see whether I can save it as a SavedModel, and I'll share the eval stats as well for quality control, just in case.

maziyarpanahi commented 2 years ago

Hi @Rocketknight1

For future reference: after replacing StableDropout with Dropout, the issue with saving as a SavedModel was resolved. Also, evaluating 3-4 models trained on IMDB showed no difference between StableDropout and Dropout, so there is no performance tradeoff.
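
As a quick sanity check of the export itself (a rough sketch, reusing the model and tokenizer from the reproduction script above; the signature and tensor names are assumptions to confirm against the actual export):

import numpy as np
import tensorflow as tf

# Reload the SavedModel written by save_pretrained(..., saved_model=True).
loaded = tf.saved_model.load("kamalkraj/deberta-v2-xlarge/saved_model/1")
serving_fn = loaded.signatures["serving_default"]

# Compare the reloaded graph's logits against the in-memory model's logits on
# the same sentence; they should match up to float noise, since dropout is
# inactive at inference time either way.
enc = tokenizer("Hello, my dog is cute", return_tensors="tf")
eager_logits = model(enc).logits
saved_out = serving_fn(
    input_ids=enc["input_ids"],
    attention_mask=enc["attention_mask"],
    token_type_ids=enc["token_type_ids"],
)
print(np.abs(eager_logits.numpy() - saved_out["logits"].numpy()).max())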

I can prepare a PR if you decide to use Keras Dropout inside TFDebertaV2ForSequenceClassification.

github-actions[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.