JohnSnowLabs / spark-nlp

State of the Art Natural Language Processing
https://sparknlp.org/
Apache License 2.0

Can't Import Fine-Tuned BERT Sentence Embeddings Model #13860

Closed nreamaroon closed 7 months ago

nreamaroon commented 1 year ago

Is there an existing issue for this?

Who can help?

No response

What are you working on?

I am attempting to fine-tune a BERT model with the Transformers library from HuggingFace, then import it into Spark NLP with the loadSavedModel function in BertSentenceEmbeddings.

To fine-tune BERT for language modeling (as a Fill Mask task), I am following the instructions provided by HuggingFace in this notebook.

To import BERT models from HuggingFace for sentence embeddings in SparkNLP, I am following the instructions provided by John Snow Labs in this notebook.

Current Behavior

Currently, this isn't working as expected: BertSentenceEmbeddings.loadSavedModel is unable to import the fine-tuned BERT model - for example, one initialized with TFAutoModelForMaskedLM.from_pretrained('bert-base-cased'). I receive the following error when trying to do so:

IllegalArgumentException: No Operation named [missing_pooled_output_key] in the Graph

However, I do not have any issues using BertSentenceEmbeddings.loadSavedModel to import models from HuggingFace that were not fine-tuned on custom data - for example, when using TFBertModel.from_pretrained('bert-base-cased') without doing any fine-tuning.
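
For reference, a minimal sketch of that working baseline (no fine-tuning), following the JSL import notebook - the paths here are illustrative:

# minimal working baseline (no fine-tuning); paths are illustrative
from transformers import TFBertModel, AutoTokenizer

model = TFBertModel.from_pretrained('bert-base-cased')
model.save_pretrained('./bert-base-cased', saved_model=True)
AutoTokenizer.from_pretrained('bert-base-cased').save_pretrained('./bert-base-cased_tokenizer/')
# this export's serving_default includes pooler_output, which is why
# BertSentenceEmbeddings.loadSavedModel accepts it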

Expected Behavior

The expected behavior is that BertSentenceEmbeddings.loadSavedModel imports fine-tuned BERT models (Fill Mask category) without returning an error.

Steps To Reproduce

Fine-tune BERT using Transformers

Primarily adapted from HuggingFace's instructions in this notebook - since JSL instructs that the BERT model must be fine-tuned on a Fill Mask task.
!pip install -q transformers==4.30.0 tensorflow==2.11.0
MODEL_NAME = 'bert-base-cased'
from datasets import load_dataset

datasets = load_dataset("wikitext", "wikitext-2-raw-v1")
from transformers import AutoTokenizer

# save tokenizer files for later use
AutoTokenizer.from_pretrained(MODEL_NAME).save_pretrained('./{}_tokenizer/'.format(MODEL_NAME))

# load tokenizer
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
def tokenize_function(examples):
    return tokenizer(examples["text"])

tokenized_datasets = datasets.map(
    tokenize_function, batched=True, num_proc=4, remove_columns=["text"]
)
# block_size = tokenizer.model_max_length
block_size = 128
def group_texts(examples):
    # Concatenate all texts.
    concatenated_examples = {k: sum(examples[k], []) for k in examples.keys()}
    total_length = len(concatenated_examples[list(examples.keys())[0]])
    # We drop the small remainder, though you could add padding instead if the model supports it
    # In this, as in all things, we advise you to follow your heart
    total_length = (total_length // block_size) * block_size
    # Split by chunks of max_len.
    result = {
        k: [t[i : i + block_size] for i in range(0, total_length, block_size)]
        for k, t in concatenated_examples.items()
    }
    result["labels"] = result["input_ids"].copy()
    return result
lm_datasets = tokenized_datasets.map(
    group_texts,
    batched=True,
    batch_size=1000,
    num_proc=4,
)
from transformers import TFAutoModelForMaskedLM

model = TFAutoModelForMaskedLM.from_pretrained(MODEL_NAME)
from transformers import create_optimizer, AdamWeightDecay
import tensorflow as tf

optimizer = AdamWeightDecay(learning_rate=2e-5, weight_decay_rate=0.01)

model.compile(optimizer=optimizer, 
              jit_compile=True,
              metrics=["accuracy"]
             )
from transformers import DataCollatorForLanguageModeling

data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm_probability=0.15, return_tensors="np"
)
train_set = model.prepare_tf_dataset(
    lm_datasets["train"],
    shuffle=True,
    batch_size=16,
    collate_fn=data_collator,
)

validation_set = model.prepare_tf_dataset(
    lm_datasets["validation"],
    shuffle=False,
    batch_size=16,
    collate_fn=data_collator,
)
# train model, just setting it to epochs = 1 and steps_per_epoch = 10 to expedite training time

model.fit(train_set,
          epochs = 1, 
          steps_per_epoch = 10, 
          )
# add TF Signature

import tensorflow as tf

# Define TF Signature
@tf.function(
  input_signature=[
      {
          "input_ids": tf.TensorSpec((None, None), tf.int32, name="input_ids"),
          "attention_mask": tf.TensorSpec((None, None), tf.int32, name="attention_mask"),
          "token_type_ids": tf.TensorSpec((None, None), tf.int32, name="token_type_ids"),
      }
  ]
)
def serving_fn(input):
    return model(input)
# save model to local directory

model.save_pretrained("./{}".format(MODEL_NAME), 
                      saved_model=True,
                      signatures={"serving_default": serving_fn})

Import the fine-tuned BERT model into Spark NLP

Adapted from JSL's instructions in this notebook.
!pip install spark-nlp
from sparknlp.annotator import *

import sparknlp

spark = sparknlp.start()
!cp {MODEL_NAME}_tokenizer/vocab.txt {MODEL_NAME}/saved_model/1/assets
sent_bert = BertSentenceEmbeddings.loadSavedModel(
     '{}/saved_model/1'.format(MODEL_NAME),
     spark
 )\
 .setInputCols("sentence")\
 .setOutputCol("bert_sentence")\
 .setCaseSensitive(True)\
 .setDimension(768)\
 .setStorageRef('sent_bert_base_cased') 
---------------------------------------------------------------------------

IllegalArgumentException                  Traceback (most recent call last)

/tmp/ipykernel_9685/280536463.py in <cell line: 1>()
----> 1 sent_bert = BertSentenceEmbeddings.loadSavedModel(
      2      '{}/saved_model/1'.format(MODEL_NAME),
      3      spark
      4  )\
      5  .setInputCols("sentence")\

~/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages/sparknlp/annotator/embeddings/bert_sentence_embeddings.py in loadSavedModel(folder, spark_session)
    197         """
    198         from sparknlp.internal import _BertSentenceLoader
--> 199         jModel = _BertSentenceLoader(folder, spark_session._jsparkSession)._java_obj
    200         return BertSentenceEmbeddings(java_model=jModel)
    201 

~/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages/sparknlp/internal/__init__.py in __init__(self, path, jspark)
     56 class _BertSentenceLoader(ExtendedJavaWrapper):
     57     def __init__(self, path, jspark):
---> 58         super(_BertSentenceLoader, self).__init__(
     59             "com.johnsnowlabs.nlp.embeddings.BertSentenceEmbeddings.loadSavedModel", path, jspark)
     60 

~/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages/sparknlp/internal/extended_java_wrapper.py in __init__(self, java_obj, *args)
     25         super(ExtendedJavaWrapper, self).__init__(java_obj)
     26         self.sc = SparkContext._active_spark_context
---> 27         self._java_obj = self.new_java_obj(java_obj, *args)
     28         self.java_obj = self._java_obj
     29 

~/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages/sparknlp/internal/extended_java_wrapper.py in new_java_obj(self, java_class, *args)
     35 
     36     def new_java_obj(self, java_class, *args):
---> 37         return self._new_java_obj(java_class, *args)
     38 
     39     def new_java_array(self, pylist, java_class):

~/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages/pyspark/ml/wrapper.py in _new_java_obj(java_class, *args)
     84             java_obj = getattr(java_obj, name)
     85         java_args = [_py2java(sc, arg) for arg in args]
---> 86         return java_obj(*java_args)
     87 
     88     @staticmethod

~/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages/py4j/java_gateway.py in __call__(self, *args)
   1319 
   1320         answer = self.gateway_client.send_command(command)
-> 1321         return_value = get_return_value(
   1322             answer, self.gateway_client, self.target_id, self.name)
   1323 

~/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages/pyspark/sql/utils.py in deco(*a, **kw)
    194                 # Hide where the exception came from that shows a non-Pythonic
    195                 # JVM exception message.
--> 196                 raise converted from None
    197             else:
    198                 raise

IllegalArgumentException: No Operation named [missing_pooled_output_key] in the Graph

Spark NLP version and Apache Spark

spark == 3.3.0
sparknlp == 4.4.4

Type of Spark Application

Python Application

Java Version

No response

Java Home Directory

No response

Setup and installation

AWS SageMaker and Databricks

Operating System and Version

No response

Link to your project (if available)

No response

Additional Information

No response

maziyarpanahi commented 1 year ago

Hi @nreamaroon

Could you please show the output of saved_model_cli show --all --dir ...? BertSentenceEmbeddings requires a pooled_output output layer, which seems to be missing from your fine-tuned BERT model.

Since you fine-tuned the model, this output was probably renamed or changed to something else. If you set the output in the signature to pooled_output, or otherwise make sure the model's final output is pooled_output, you can easily import it into Spark NLP. (Everything else is perfectly fine.)
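
For illustration, a hedged sketch (not from this thread) of one way to expose the pooled output from a model fine-tuned with TFAutoModelForMaskedLM - it assumes model and MODEL_NAME from the reproduction steps above, calls the base encoder model.bert directly, and may hit the 'untracked resource' export issue discussed later in this thread depending on the Transformers/TensorFlow versions:

import tensorflow as tf

@tf.function(
  input_signature=[
      {
          "input_ids": tf.TensorSpec((None, None), tf.int32, name="input_ids"),
          "attention_mask": tf.TensorSpec((None, None), tf.int32, name="attention_mask"),
          "token_type_ids": tf.TensorSpec((None, None), tf.int32, name="token_type_ids"),
      }
  ]
)
def serving_fn(input):
    # call the base encoder directly, skipping the MLM head, so the export
    # carries the pooled [CLS] output instead of vocabulary logits
    output = model.bert(input)
    return {
        "last_hidden_state": output.last_hidden_state,
        "pooler_output": output.pooler_output,
    }

model.save_pretrained("./{}".format(MODEL_NAME),
                      saved_model=True,
                      signatures={"serving_default": serving_fn})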

nreamaroon commented 1 year ago

Hi @maziyarpanahi - thanks for following up!

Here's the output of saved_model_cli show --all --dir:

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['__saved_model_init_op']:
  The given SavedModel SignatureDef contains the following input(s):
  The given SavedModel SignatureDef contains the following output(s):
    outputs['__saved_model_init_op'] tensor_info:
        dtype: DT_INVALID
        shape: unknown_rank
        name: NoOp
  Method name is: 

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['attention_mask'] tensor_info:
        dtype: DT_INT32
        shape: (-1, -1)
        name: serving_default_attention_mask:0
    inputs['input_ids'] tensor_info:
        dtype: DT_INT32
        shape: (-1, -1)
        name: serving_default_input_ids:0
    inputs['token_type_ids'] tensor_info:
        dtype: DT_INT32
        shape: (-1, -1)
        name: serving_default_token_type_ids:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['logits'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, -1, 28996)
        name: StatefulPartitionedCall:0
  Method name is: tensorflow/serving/predict
The MetaGraph with tag set ['serve'] contains the following ops: {'Less', 'StaticRegexFullMatch', 'Transpose', 'Assert', 'All', 'VarHandleOp', 'BiasAdd', 'Max', 'Range', 'RestoreV2', 'Sub', 'BatchMatMulV2', 'Rsqrt', 'Identity', 'StatefulPartitionedCall', 'ShardedFilename', 'AssignVariableOp', 'SquaredDifference', 'Prod', 'StridedSlice', 'ExpandDims', 'NoOp', 'Select', 'StopGradient', 'Const', 'SaveV2', 'Cast', 'ResourceGather', 'ConcatV2', 'MergeV2Checkpoints', 'AddV2', 'ReadVariableOp', 'Mul', 'Placeholder', 'Shape', 'StringJoin', 'MatMul', 'Erf', 'Reshape', 'Softmax', 'Pack', 'GatherV2', 'Mean', 'RealDiv'}

Concrete Functions:
  Function Name: '__call__'
    Option #1
      Callable with:
        Argument #1
          DType: dict
          Value: {'token_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='token_type_ids'), 'attention_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='attention_mask'), 'input_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_input_ids')}
        Argument #2
          DType: NoneType
          Value: None
        Argument #3
          DType: NoneType
          Value: None
        Argument #4
          DType: NoneType
          Value: None
        Argument #5
          DType: NoneType
          Value: None
        Argument #6
          DType: NoneType
          Value: None
        Argument #7
          DType: NoneType
          Value: None
        Argument #8
          DType: NoneType
          Value: None
        Argument #9
          DType: NoneType
          Value: None
        Argument #10
          DType: NoneType
          Value: None
        Argument #11
          DType: bool
          Value: False
    Option #2
      Callable with:
        Argument #1
          DType: dict
          Value: {'attention_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_attention_mask'), 'token_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_token_type_ids'), 'input_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_input_ids')}
        Argument #2
          DType: NoneType
          Value: None
        Argument #3
          DType: NoneType
          Value: None
        Argument #4
          DType: NoneType
          Value: None
        Argument #5
          DType: NoneType
          Value: None
        Argument #6
          DType: NoneType
          Value: None
        Argument #7
          DType: NoneType
          Value: None
        Argument #8
          DType: NoneType
          Value: None
        Argument #9
          DType: NoneType
          Value: None
        Argument #10
          DType: NoneType
          Value: None
        Argument #11
          DType: bool
          Value: False
    Option #3
      Callable with:
        Argument #1
          DType: dict
          Value: {'token_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='token_type_ids'), 'attention_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='attention_mask'), 'input_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_input_ids')}
        Argument #2
          DType: NoneType
          Value: None
        Argument #3
          DType: NoneType
          Value: None
        Argument #4
          DType: NoneType
          Value: None
        Argument #5
          DType: NoneType
          Value: None
        Argument #6
          DType: NoneType
          Value: None
        Argument #7
          DType: NoneType
          Value: None
        Argument #8
          DType: NoneType
          Value: None
        Argument #9
          DType: NoneType
          Value: None
        Argument #10
          DType: NoneType
          Value: None
        Argument #11
          DType: bool
          Value: True
    Option #4
      Callable with:
        Argument #1
          DType: dict
          Value: {'attention_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_attention_mask'), 'token_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_token_type_ids'), 'input_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_input_ids')}
        Argument #2
          DType: NoneType
          Value: None
        Argument #3
          DType: NoneType
          Value: None
        Argument #4
          DType: NoneType
          Value: None
        Argument #5
          DType: NoneType
          Value: None
        Argument #6
          DType: NoneType
          Value: None
        Argument #7
          DType: NoneType
          Value: None
        Argument #8
          DType: NoneType
          Value: None
        Argument #9
          DType: NoneType
          Value: None
        Argument #10
          DType: NoneType
          Value: None
        Argument #11
          DType: bool
          Value: True

  Function Name: '_default_save_signature'
    Option #1
      Callable with:
        Argument #1
          DType: dict
          Value: {'attention_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='attention_mask'), 'input_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids'), 'token_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='token_type_ids')}

  Function Name: 'call_and_return_all_conditional_losses'
    Option #1
      Callable with:
        Argument #1
          DType: dict
          Value: {'attention_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_attention_mask'), 'input_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_input_ids'), 'token_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_token_type_ids')}
        Argument #2
          DType: NoneType
          Value: None
        Argument #3
          DType: NoneType
          Value: None
        Argument #4
          DType: NoneType
          Value: None
        Argument #5
          DType: NoneType
          Value: None
        Argument #6
          DType: NoneType
          Value: None
        Argument #7
          DType: NoneType
          Value: None
        Argument #8
          DType: NoneType
          Value: None
        Argument #9
          DType: NoneType
          Value: None
        Argument #10
          DType: NoneType
          Value: None
        Argument #11
          DType: bool
          Value: True
    Option #2
      Callable with:
        Argument #1
          DType: dict
          Value: {'input_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_input_ids'), 'token_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_token_type_ids'), 'attention_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_attention_mask')}
        Argument #2
          DType: NoneType
          Value: None
        Argument #3
          DType: NoneType
          Value: None
        Argument #4
          DType: NoneType
          Value: None
        Argument #5
          DType: NoneType
          Value: None
        Argument #6
          DType: NoneType
          Value: None
        Argument #7
          DType: NoneType
          Value: None
        Argument #8
          DType: NoneType
          Value: None
        Argument #9
          DType: NoneType
          Value: None
        Argument #10
          DType: NoneType
          Value: None
        Argument #11
          DType: bool
          Value: False
    Option #3
      Callable with:
        Argument #1
          DType: dict
          Value: {'input_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_input_ids'), 'token_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='token_type_ids'), 'attention_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='attention_mask')}
        Argument #2
          DType: NoneType
          Value: None
        Argument #3
          DType: NoneType
          Value: None
        Argument #4
          DType: NoneType
          Value: None
        Argument #5
          DType: NoneType
          Value: None
        Argument #6
          DType: NoneType
          Value: None
        Argument #7
          DType: NoneType
          Value: None
        Argument #8
          DType: NoneType
          Value: None
        Argument #9
          DType: NoneType
          Value: None
        Argument #10
          DType: NoneType
          Value: None
        Argument #11
          DType: bool
          Value: True
    Option #4
      Callable with:
        Argument #1
          DType: dict
          Value: {'attention_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='attention_mask'), 'token_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='token_type_ids'), 'input_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids_input_ids')}
        Argument #2
          DType: NoneType
          Value: None
        Argument #3
          DType: NoneType
          Value: None
        Argument #4
          DType: NoneType
          Value: None
        Argument #5
          DType: NoneType
          Value: None
        Argument #6
          DType: NoneType
          Value: None
        Argument #7
          DType: NoneType
          Value: None
        Argument #8
          DType: NoneType
          Value: None
        Argument #9
          DType: NoneType
          Value: None
        Argument #10
          DType: NoneType
          Value: None
        Argument #11
          DType: bool
          Value: False

  Function Name: 'serving'
    Option #1
      Callable with:
        Argument #1
          DType: dict
          Value: {'input_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_ids'), 'attention_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='attention_mask'), 'token_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='token_type_ids')}
maziyarpanahi commented 1 year ago

Thanks for sharing this. I can see that the output is logits, which are representations of scores for 28996 different labels/classes/outcomes:

outputs['logits'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, -1, 28996)
        name: StatefulPartitionedCall:0

It seems the model was fine-tuned for a classification task rather than CLM/MLM, which is strange. I have also followed the same notebook and saved the model to use in Spark NLP. (However, I used MLM, so it is more appropriate to use the model for BertEmbeddings/WordEmbeddings.)

This is how I saved the fine-tuned model:

from transformers import TFBertModel
import tensorflow as tf

# save the fine-tuned model
model.save_pretrained(output_dir)

# making sure I load the saved fine-tuned Torch/TF model with TFBertModel
loaded_model = TFBertModel.from_pretrained(output_dir, from_pt=True)

# save it again - here is where you can use the signature to control the dtype
# but for now you can just re-save it this way and use `saved_model_cli` to see the output
loaded_model.save_pretrained(tf_saved_model, saved_model=True)
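
After this second save, the export can be inspected (a notebook-style check, assuming tf_saved_model is the directory used above; save_pretrained writes the graph under saved_model/1):

!saved_model_cli show --all --dir {tf_saved_model}/saved_model/1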
nreamaroon commented 1 year ago

Thanks for your response.

When I tried your suggestion (saving the fine-tuned BERT model initialized with TFAutoModelForMaskedLM, loading it with TFBertModel, saving it again, and loading it once more with BertSentenceEmbeddings), I get a new error:

IllegalArgumentException: Expects arg[0] to be int64 but int32 is provided

Is there a way to resolve this?

Also, can you provide complete code or a notebook replicating your workflow above so I have a working example? Specifically, fine-tuning a model initialized with TFBertModel on an MLM task.

I've only found guidance on fine-tuning for an MLM task using AutoModelForMaskedLM (reference) or TFAutoModelForMaskedLM (reference). If fine-tuning with a model initialized by TFBertModel is preferred, could you please provide some guidance? I can't seem to get this working on my end.

maziyarpanahi commented 1 year ago

You are welcome. My notebook is an exact duplicate of the notebook you provided; I just wanted to replicate the work and make sure it can be imported into Spark NLP.

Your new error suggests you didn't include this part, which you had the first time:

# add TF Signature

import tensorflow as tf

# Define TF Signature
@tf.function(
  input_signature=[
      {
          "input_ids": tf.TensorSpec((None, None), tf.int32, name="input_ids"),
          "attention_mask": tf.TensorSpec((None, None), tf.int32, name="attention_mask"),
          "token_type_ids": tf.TensorSpec((None, None), tf.int32, name="token_type_ids"),
      }
  ]
)

def serving_fn(input):
    return model(input)

You should apply the tf.function signature not on the first save, but on the very last save of the model, to make sure the input types are int32.

# save it again - here is where you can use the signature to control the dtype
# but for now you can just re-save it this way and use `saved_model_cli` to see the output
loaded_model.save_pretrained(tf_saved_model, saved_model=True)
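
Putting these together, the final save might look like this (a sketch reusing output_dir and tf_saved_model from my earlier snippet; add from_pt=True when the fine-tuned checkpoint is PyTorch):

import tensorflow as tf
from transformers import TFBertModel

# reload the fine-tuned weights as a plain TFBertModel (no MLM head)
loaded_model = TFBertModel.from_pretrained(output_dir)

# define the signature over the *reloaded* model so that the exported
# serving_default takes int32 inputs
@tf.function(
  input_signature=[
      {
          "input_ids": tf.TensorSpec((None, None), tf.int32, name="input_ids"),
          "attention_mask": tf.TensorSpec((None, None), tf.int32, name="attention_mask"),
          "token_type_ids": tf.TensorSpec((None, None), tf.int32, name="token_type_ids"),
      }
  ]
)
def serving_fn(input):
    return loaded_model(input)

loaded_model.save_pretrained(tf_saved_model,
                             saved_model=True,
                             signatures={"serving_default": serving_fn})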

Just out of curiosity, did you check saved_model_cli on the last saved model (the one from loaded_model.save_pretrained) to see if the output has pooled_output and not logits? (If it does, then adding the tf.function should be enough and everything should work.)

nreamaroon commented 1 year ago

Thanks for the clarification.

There are a few issues on my end now after trying this again.

First, loading the model (after the initial save) with:

loaded_model = TFBertModel.from_pretrained('models/{}/'.format(MODEL_NAME), 
                                           from_pt=True
                                          )

With from_pt=True, the model fails to load and I get this error:

File ~/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages/torch/serialization.py:815, in load(f, map_location, pickle_module, weights_only, **pickle_load_args)
    813     except RuntimeError as e:
    814         raise pickle.UnpicklingError(UNSAFE_MESSAGE + str(e)) from None
--> 815 return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)

File ~/anaconda3/envs/tensorflow2_p310/lib/python3.10/site-packages/torch/serialization.py:1033, in _legacy_load(f, map_location, pickle_module, **pickle_load_args)
   1027 if not hasattr(f, 'readinto') and (3, 8, 0) <= sys.version_info < (3, 8, 2):
   1028     raise RuntimeError(
   1029         "torch.load does not work with file-like objects that do not implement readinto on Python 3.8.0 and 3.8.1. "
   1030         f"Received object of type \"{type(f)}\". Please update to Python 3.8.2 or newer to restore this "
   1031         "functionality.")
-> 1033 magic_number = pickle_module.load(f, **pickle_load_args)
   1034 if magic_number != MAGIC_NUMBER:
   1035     raise RuntimeError("Invalid magic number; corrupt file?")

UnpicklingError: invalid load key, 'H'.

I can only load the model without from_pt=True - I'm looking into why I get this error (possibly because the model was saved from TensorFlow, so the directory contains a tf_model.h5 checkpoint rather than a PyTorch pytorch_model.bin for torch to unpickle). My Python version is 3.10.10, so I've just commented out this part for now.

The next issue is with saving the model a second time:

loaded_model.save_pretrained('models/{}_tf/'.format(MODEL_NAME), 
                             saved_model=True,
                             signatures={"serving_default": serving_fn}
                            )

Which yields:

AssertionError: Tried to export a function which references an 'untracked' resource. TensorFlow objects (e.g. tf.Variable) captured by functions must be 'tracked' by assigning them to an attribute of a tracked object or assigned to an attribute of the main object directly. See the information below:
    Function name = b'__inference_signature_wrapper_115202'
    Captured Tensor = <ResourceHandle(name="Resource-3-at-0x5636dece0c70", device="/job:localhost/replica:0/task:0/device:CPU:0", container="Anonymous", type="tensorflow::Var", dtype and shapes : "[ DType enum: 1, Shape: [28996,768] ]")>
    Trackable referencing this tensor = <tf.Variable 'tf_bert_for_masked_lm/bert/embeddings/word_embeddings/weight:0' shape=(28996, 768) dtype=float32>
    Internal Tensor = Tensor("114796:0", shape=(), dtype=resource)

If I exclude signatures={"serving_default": serving_fn}, then the model saves fine without issues. But as you mentioned, without the TF signature, the types are int64. Here is the saved_model_cli output from that model without the TF signature, which does show outputs['pooler_output']:

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['__saved_model_init_op']:
  The given SavedModel SignatureDef contains the following input(s):
  The given SavedModel SignatureDef contains the following output(s):
    outputs['__saved_model_init_op'] tensor_info:
        dtype: DT_INVALID
        shape: unknown_rank
        name: NoOp
  Method name is: 

signature_def['int64_serving']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['attention_mask'] tensor_info:
        dtype: DT_INT64
        shape: (-1, -1)
        name: int64_serving_attention_mask:0
    inputs['input_ids'] tensor_info:
        dtype: DT_INT64
        shape: (-1, -1)
        name: int64_serving_input_ids:0
    inputs['token_type_ids'] tensor_info:
        dtype: DT_INT64
        shape: (-1, -1)
        name: int64_serving_token_type_ids:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['last_hidden_state'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, -1, 768)
        name: StatefulPartitionedCall:0
    outputs['pooler_output'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 768)
        name: StatefulPartitionedCall:1
  Method name is: tensorflow/serving/predict

signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['attention_mask'] tensor_info:
        dtype: DT_INT32
        shape: (-1, -1)
        name: serving_default_attention_mask:0
    inputs['input_ids'] tensor_info:
        dtype: DT_INT32
        shape: (-1, -1)
        name: serving_default_input_ids:0
    inputs['token_type_ids'] tensor_info:
        dtype: DT_INT32
        shape: (-1, -1)
        name: serving_default_token_type_ids:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['last_hidden_state'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, -1, 768)
        name: StatefulPartitionedCall_1:0
    outputs['pooler_output'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 768)
        name: StatefulPartitionedCall_1:1
  Method name is: tensorflow/serving/predict
2023-06-21 19:53:57.377820: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:267] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
Cabir40 commented 1 year ago

Hi all, I prepared a notebook that shows how to port original models and fine-tuned models to Spark NLP. Please check https://colab.research.google.com/drive/1odpDx9uOYrDb4ugShIeoqQIqSzYaPcoC?usp=sharing

github-actions[bot] commented 8 months ago

This issue is stale because it has been open 180 days with no activity. Remove stale label or comment or this will be closed in 5 days