JVM crashes with BERT classification example

Description

I am trying to rework the BERT classification example to work with this sentiment model.

I first converted the above model to the TensorFlow SavedModel format in Python:

```python
from transformers import TFBertForSequenceClassification
model = TFBertForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english", from_pt=False)
model.save_pretrained("distillbert_sentiment", saved_model=True)
```

Expected Behavior

I am trying to make predictions for a simple input locally on my machine with the above model, but get a fatal error.

Error Message

> Task :examples:BertClassification.main()
Loading:     100% |████████████████████████████████████████|
2021-12-06 13:35:42.662276: I external/org_tensorflow/tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-06 13:35:42.709658: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:32] Reading SavedModel from: /Users/attilanagy/temp/distillbert_sentiment/saved_model/1
2021-12-06 13:35:42.793270: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:55] Reading meta graph with tags { serve }
2021-12-06 13:35:42.793288: I external/org_tensorflow/tensorflow/cc/saved_model/reader.cc:93] Reading SavedModel debug info (if present) from: /Users/attilanagy/temp/distillbert_sentiment/saved_model/1
2021-12-06 13:35:43.120488: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:206] Restoring SavedModel bundle.
2021-12-06 13:35:43.861048: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:190] Running initialization op on SavedModel bundle at path: /Users/attilanagy/temp/distillbert_sentiment/saved_model/1
2021-12-06 13:35:44.162946: I external/org_tensorflow/tensorflow/cc/saved_model/loader.cc:277] SavedModel load for tags { serve }; Status: success: OK. Took 1453281 microseconds.
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000000140ee6959, pid=54046, tid=0x0000000000002903
#
# JRE version: OpenJDK Runtime Environment (8.0_312-b07) (build 1.8.0_312-b07)
# Java VM: OpenJDK 64-Bit Server VM (25.312-b07 mixed mode bsd-amd64 compressed oops)
# Problematic frame:
# C  [libtensorflow_cc.2.dylib+0x8229959]  tensorflow::TF_TensorToTensor(TF_Tensor const*, tensorflow::Tensor*)+0x9
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /Users/attilanagy/Work/Karma/djl/hs_err_pid54046.log
#
# If you would like to submit a bug report, please visit:
#   https://github.com/adoptium/adoptium-support/issues
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

> Task :examples:BertClassification.main() FAILED

How to Reproduce?

Here is the modified BertClassification.java file that I am trying to run:

/*
 * Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 *
 * Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file except in compliance
 * with the License. A copy of the License is located at
 *
 * http://aws.amazon.com/apache2.0/
 *
 * or in the "license" file accompanying this file. This file is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES
 * OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions
 * and limitations under the License.
 */
package ai.djl.examples.inference;

import ai.djl.MalformedModelException;
import ai.djl.ModelException;
import ai.djl.inference.Predictor;
import ai.djl.modality.Classifications;
import ai.djl.modality.nlp.DefaultVocabulary;
import ai.djl.modality.nlp.bert.BertFullTokenizer;
import ai.djl.ndarray.NDArray;
import ai.djl.ndarray.NDArrays;
import ai.djl.ndarray.NDList;
import ai.djl.ndarray.NDManager;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ModelNotFoundException;
import ai.djl.repository.zoo.ZooModel;
import ai.djl.training.util.ProgressBar;
import ai.djl.translate.NoBatchifyTranslator;
import ai.djl.translate.TranslateException;
import ai.djl.translate.TranslatorContext;
import java.io.IOException;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * The example is targeted to specific use case for BERT classification. TODO make it generic enough
 * for reference.
 */
public final class BertClassification {

    private static final Logger logger = LoggerFactory.getLogger(BertQaInference.class);

    private BertClassification() {}

    public static void main(String[] args) throws IOException, ModelException, TranslateException {
        List<String> inputs = new ArrayList<>();
        inputs.add("NEGATIVE\tI am very sad");
        inputs.add("NEGATIVE\tLife is miserable!");
        inputs.add("POSITIVE\tThis is a happy day.");

        Classifications[] results = predict(inputs);
        for (int i = 0; i < inputs.size(); i++) {
            logger.info("Prediction for: " + inputs.get(i) + "\n" + results[i].toString());
        }
    }

    public static Classifications[] predict(List<String> inputs)
            throws MalformedModelException, ModelNotFoundException, IOException,
                    TranslateException {
        // refer to
        // https://medium.com/delvify/bert-rest-inference-from-the-fine-tuned-model-499997b32851 and
        // https://github.com/google-research/bert
        // for converting public bert checkpoints to saved model format.
        String modelUrl = "path/to/model/saved_model/";
        String vocabularyPath = "path/to/vocab/vocab.txt";

        Criteria<String[], Classifications[]> criteria =
                Criteria.builder()
                        .setTypes(String[].class, Classifications[].class)
                        .optModelUrls(modelUrl)
                        .optTranslator(new MyTranslator(vocabularyPath, 512))
                        .optEngine("TensorFlow")
                        .optProgress(new ProgressBar())
                        .build();

        try (ZooModel<String[], Classifications[]> model = criteria.loadModel();
                Predictor<String[], Classifications[]> predictor = model.newPredictor()) {
            return predictor.predict(inputs.toArray(new String[0]));
        }
    }

    private static final class MyTranslator
            implements NoBatchifyTranslator<String[], Classifications[]> {
        private final List<String> classes =
                Arrays.asList("NEGATIVE", "POSITIVE");
        private BertFullTokenizer tokenizer;
        private final int maxSequenceLength;
        private final String vocabularyPath;

        MyTranslator(String vocabularyPath, int maxSequenceLength) {
            this.maxSequenceLength = maxSequenceLength;
            this.vocabularyPath = vocabularyPath;
        }

        /** {@inheritDoc} */
        @Override
        public void prepare(TranslatorContext ctx) throws IOException {
            DefaultVocabulary vocabulary =
                    DefaultVocabulary.builder()
                            .addFromTextFile(Paths.get(vocabularyPath))
                            .optUnknownToken("[UNK]")
                            .build();
            tokenizer = new BertFullTokenizer(vocabulary, true);
        }

        /** {@inheritDoc} */
        @Override
        public NDList processInput(TranslatorContext ctx, String[] inputs) {
            NDManager inputManager = ctx.getNDManager();
            List<NDList> tokenizedInputs =
                    Arrays.stream(inputs)
                            .map(s -> tokenizeSingleString(inputManager, s))
                            .collect(Collectors.toList());
            NDList inputList = new NDList();
            inputList.add(stackInputs(tokenizedInputs, 0, "input_ids"));
            inputList.add(stackInputs(tokenizedInputs, 1, "input_mask"));
            inputList.add(stackInputs(tokenizedInputs, 2, "segment_ids"));
            inputList.add(stackInputs(tokenizedInputs, 3, "label_ids"));
            return inputList;
        }

        private NDArray stackInputs(List<NDList> tokenizedInputs, int index, String inputName) {
            NDArray stacked =
                    NDArrays.stack(
                            tokenizedInputs
                                    .stream()
                                    .map(list -> list.get(index).expandDims(0))
                                    .collect(Collectors.toCollection(NDList::new)));
            stacked.setName(inputName);
            return stacked;
        }

        private NDList tokenizeSingleString(NDManager manager, String input) {
            String[] inputs = input.split("\t");
            ConcurrentHashMap<String, Long> labelMap = new ConcurrentHashMap<>();
            for (int i = 0; i < classes.size(); i++) {
                labelMap.put(classes.get(i), (long) i);
            }
            List<String> tokensA = tokenizer.tokenize(inputs[1]);
            if (tokensA.size() > maxSequenceLength - 2) {
                tokensA = tokensA.subList(0, maxSequenceLength - 2);
            }

            List<String> tokens = new ArrayList<>();
            List<Long> segmentIds = new ArrayList<>();
            tokens.add("[CLS]");
            segmentIds.add(0L);
            for (String token : tokensA) {
                tokens.add(token);
                segmentIds.add(0L);
            }
            tokens.add("[SEP]");
            segmentIds.add(0L);
            List<Long> inputIds = new ArrayList<>();
            List<Long> inputMask = new ArrayList<>();

            for (String token : tokens) {
                inputIds.add(tokenizer.getVocabulary().getIndex(token));
                inputMask.add(1L);
            }
            while (inputIds.size() < maxSequenceLength) {
                inputIds.add(0L);
                inputMask.add(0L);
                segmentIds.add(0L);
            }
            Long labelId = labelMap.get(inputs[0]);
            NDList outputList = new NDList();
            outputList.add(manager.create(inputIds.stream().mapToLong(l -> l).toArray()));
            outputList.add(manager.create(inputMask.stream().mapToLong(l -> l).toArray()));
            outputList.add(manager.create(segmentIds.stream().mapToLong(l -> l).toArray()));
            outputList.add(manager.create(labelId));
            return outputList;
        }

        /** {@inheritDoc} */
        @Override
        public Classifications[] processOutput(TranslatorContext ctx, NDList list) {
            NDArray batchOutput = list.singletonOrThrow();
            int numOutputs = (int) batchOutput.getShape().get(0);
            Classifications[] output = new Classifications[numOutputs];

            for (int i = 0; i < numOutputs; i++) {
                output[i] = new Classifications(classes, batchOutput.get(i));
            }
            return output;
        }
    }
}
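For readers skimming the translator, the per-sentence encoding done in `tokenizeSingleString` can be summarized in a few lines. This is only a rough Python sketch of the same steps (truncate, wrap in `[CLS]`/`[SEP]`, map tokens to ids, pad with zeros). The `encode` helper and the toy vocabulary are hypothetical and are not part of DJL or of the real `vocab.txt`.

```python
def encode(tokens, vocab, max_len):
    """Truncate, wrap in [CLS]/[SEP], map tokens to ids, then pad with 0 up to max_len."""
    tokens = tokens[: max_len - 2]            # leave room for [CLS] and [SEP]
    wrapped = ["[CLS]"] + tokens + ["[SEP]"]
    # Unknown tokens fall back to the [UNK] id.
    input_ids = [vocab.get(tok, vocab["[UNK]"]) for tok in wrapped]
    input_mask = [1] * len(input_ids)         # 1 marks a real token
    pad = max_len - len(input_ids)
    input_ids += [0] * pad                    # 0 is used as the padding id
    input_mask += [0] * pad                   # 0 marks padding
    segment_ids = [0] * max_len               # single-sentence input: all zeros
    return input_ids, input_mask, segment_ids


# Hypothetical toy vocabulary, for illustration only.
vocab = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3, "i": 4, "am": 5, "very": 6, "sad": 7}

ids, mask, segs = encode(["i", "am", "very", "sad"], vocab, max_len=8)
print(ids)
print(mask)
print(segs)
```

Each sentence therefore yields three arrays of length `max_len`, which the translator stacks into a batch and names before sending them to the model.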

I am new to working with DJL, so any help is very much appreciated.

Thank you.

I looked into this a bit. First, when I ran your Python script, it printed a number of warnings:

2021-12-07 11:43:55.320808: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Some layers from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english were not used when initializing TFBertForSequenceClassification: ['pre_classifier', 'dropout_19', 'distilbert']
- This IS expected if you are initializing TFBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing TFBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some layers of TFBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english and are newly initialized: ['bert']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
2021-12-07 11:44:07.152173: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
WARNING:absl:Found untraced functions such as embeddings_layer_call_fn, embeddings_layer_call_and_return_conditional_losses, encoder_layer_call_fn, encoder_layer_call_and_return_conditional_losses, pooler_layer_call_fn while saving (showing 5 of 1055). These functions will not be directly callable after loading.
/Users/kimbergz/.pyenv/versions/3.9.7/lib/python3.9/site-packages/keras/saving/saved_model/layer_serialization.py:112: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument.
  return generic_utils.serialize_keras_object(obj)
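The "were not used" and "newly initialized" messages above are what one would expect when the model class and the checkpoint belong to different architectures. One way to check this before exporting is to read the `model_type` field from the checkpoint's `config.json`. The sketch below is illustrative only: the `EXPECTED_CLASS` mapping and the `matching_class` helper are hypothetical names, only the two class families mentioned in this thread are listed, and the sample config is trimmed to the single field that is used.

```python
import json

# Assumed mapping from the config's "model_type" to the matching TF class name.
# Only the two families discussed in this thread are included.
EXPECTED_CLASS = {
    "bert": "TFBertForSequenceClassification",
    "distilbert": "TFDistilBertForSequenceClassification",
}


def matching_class(config_text: str) -> str:
    """Return the class name that matches the checkpoint's model_type."""
    model_type = json.loads(config_text)["model_type"]
    if model_type not in EXPECTED_CLASS:
        raise ValueError(f"no mapping for model_type={model_type!r}")
    return EXPECTED_CLASS[model_type]


# Trimmed, illustrative config.json content; a real file has many more keys.
sample_config = '{"model_type": "distilbert"}'
chosen = matching_class(sample_config)
print(chosen)
```

In practice, the `Auto*` classes in transformers do this dispatch themselves, which is why they are the safer choice when the architecture is not obvious from the model name.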

The warnings suggest that the script loads a BERT class for a DistilBERT checkpoint, so I updated the script to:

```python
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

model = TFAutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

model.save_pretrained("distillbert_sentiment2", saved_model=True)
```

This resolved the layer-mismatch warnings, but the export still prints several others. Here is the output:

2021-12-07 12:11:13.198856: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-12-07 12:11:13.217828: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
All model checkpoint layers were used when initializing TFDistilBertForSequenceClassification.

All the layers of TFDistilBertForSequenceClassification were initialized from the model checkpoint at distilbert-base-uncased-finetuned-sst-2-english.
If your task is similar to the task the model of the checkpoint was trained on, you can already use TFDistilBertForSequenceClassification for predictions without further training.
WARNING:tensorflow:Skipping full serialization of Keras layer <keras.layers.core.dropout.Dropout object at 0x163d4fb20>, because it is not built.
WARNING:tensorflow:Skipping full serialization of Keras layer <keras.layers.core.dropout.Dropout object at 0x15d6ecdf0>, because it is not built.
WARNING:tensorflow:Skipping full serialization of Keras layer <keras.layers.core.dropout.Dropout object at 0x15d6ff1c0>, because it is not built.
WARNING:tensorflow:Skipping full serialization of Keras layer <keras.layers.core.dropout.Dropout object at 0x15d70d550>, because it is not built.
WARNING:tensorflow:Skipping full serialization of Keras layer <keras.layers.core.dropout.Dropout object at 0x15d7198e0>, because it is not built.
WARNING:tensorflow:Skipping full serialization of Keras layer <keras.layers.core.dropout.Dropout object at 0x15d725c70>, because it is not built.
WARNING:absl:Found untraced functions such as embeddings_layer_call_fn, embeddings_layer_call_and_return_conditional_losses, transformer_layer_call_fn, transformer_layer_call_and_return_conditional_losses, add_layer_call_fn while saving (showing 5 of 415). These functions will not be directly callable after loading.
/Users/kimbergz/.pyenv/versions/3.9.7/lib/python3.9/site-packages/keras/saving/saved_model/layer_serialization.py:112: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument.
  return generic_utils.serialize_keras_object(obj)

Running the re-exported model in DJL still segfaults. I think this problem might be similar to https://github.com/tensorflow/tensorflow/issues/47554.
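The crash frame is `TF_TensorToTensor`, which runs while tensors are being handed to the TensorFlow runtime. One thing worth ruling out (a guess, not something confirmed in this thread) is a mismatch between the inputs the translator sends and the inputs the exported serving signature declares. The signature can be printed with `saved_model_cli show --dir <path> --all`. The sketch below only compares two lists of names. The `compare_inputs` helper is hypothetical, and the `expected` list is a placeholder assumption that should be replaced with whatever the CLI reports for the exported model.

```python
def compare_inputs(expected, provided):
    """Return (missing, unexpected) input names, preserving the original order."""
    missing = [name for name in expected if name not in provided]
    unexpected = [name for name in provided if name not in expected]
    return missing, unexpected


# Names assigned by stackInputs() in the translator from the report.
provided = ["input_ids", "input_mask", "segment_ids", "label_ids"]

# ASSUMPTION: placeholder only. Replace with the input names that
# `saved_model_cli show --dir <path> --all` prints for the exported model.
expected = ["input_ids", "attention_mask"]

missing, unexpected = compare_inputs(expected, provided)
print("missing:", missing)
print("unexpected:", unexpected)
```

If the lists differ, the element types listed in the signature are also worth comparing with what the translator creates, since it builds 64-bit integer arrays via `mapToLong`.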

In summary, it seems like TensorFlow is causing a number of problems here. I would recommend switching over to PyTorch, since a PyTorch version of that model is available. I didn't spend much time getting it to work in DJL, but I downloaded it with the following script, and it produced no warnings or errors (unlike the TensorFlow export):

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

# AutoModelForSequenceClassification (no TF prefix) loads the PyTorch weights
model = AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased-finetuned-sst-2-english")

# saved_model=True is a TensorFlow-only option, so it is omitted here
model.save_pretrained("distillbert_sentiment2")
```
