ufal / nametag

NameTag: Named Entity Tagger
Mozilla Public License 2.0

NameTag2 returns code 400 + internal error for specific sentences [nametag2] #20

Closed PROrock closed 2 years ago

PROrock commented 2 years ago

For the sentence Kdy slaví svátek Oto, NameTag2 returns status code 400 and the response: An internal error occurred during processing. This happens on localhost for all output types. Based on the stack trace, the error occurs because wembeddings also throws an error (see below).

Curl command to test it:
curl --location --request GET 'localhost:8001/recognize?data=Kdy slaví svátek Oto&output=vertical'
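
The spaces and accented characters in the data parameter are percent-encoded on the wire; the request line in the NameTag2 log shows the encoded form. A minimal Python sketch of building the same URL, assuming the server from the curl command is listening on localhost:8001 (the helper name `build_recognize_url` is made up for this illustration):

```python
from urllib.parse import quote, urlencode


def build_recognize_url(text, output="vertical", base="http://localhost:8001"):
    """Build a /recognize URL with a percent-encoded query string."""
    # quote_via=quote encodes spaces as %20; the default (quote_plus) would use '+'.
    query = urlencode({"data": text, "output": output}, quote_via=quote)
    return f"{base}/recognize?{query}"


url = build_recognize_url("Kdy slaví svátek Oto")
print(url)
```

The printed query string can be compared directly with the GET line in the NameTag2 log to confirm that the same request is being sent.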

On http://lindat.mff.cuni.cz/services/nametag/ it behaves a bit differently: the sentence Kdy slaví svátek Oto works (i.e. returns some result) for all output modes except vertical.

NameTag2 log:

2022-03-01T14:08:53Z Traceback (most recent call last):
2022-03-01T14:08:53Z   File "nametag2_server.py", line 521, in do_GET
2022-03-01T14:08:53Z     output = model.predict(output)
2022-03-01T14:08:53Z   File "nametag2_server.py", line 174, in predict
2022-03-01T14:08:53Z     self.network.predict("test", dataset, self.args, output, evaluating=False)
2022-03-01T14:08:53Z   File "/srv/nametag/nametag2_network.py", line 387, in predict
2022-03-01T14:08:53Z     batch_dict = dataset.next_batch(args.batch_size, including_charseqs=args.including_charseqs, seq2seq=seq2seq)
2022-03-01T14:08:53Z   File "/srv/nametag/nametag2_dataset.py", line 223, in next_batch
2022-03-01T14:08:53Z     return self._next_batch(batch_perm, including_charseqs, seq2seq)
2022-03-01T14:08:53Z   File "/srv/nametag/nametag2_dataset.py", line 304, in _next_batch
2022-03-01T14:08:53Z     for i, embeddings in enumerate(self._bert.compute_embeddings("bert-base-multilingual-uncased-last4", batch_sentences)):
2022-03-01T14:08:53Z   File "/srv/nametag/wembedding_service/wembeddings/wembeddings.py", line 168, in compute_embeddings
2022-03-01T14:08:53Z     data=json.dumps({"model": model, "sentences": sentences}, ensure_ascii=True).encode("ascii"),
2022-03-01T14:08:53Z   File "/usr/lib/python3.5/urllib/request.py", line 163, in urlopen
2022-03-01T14:08:53Z     return opener.open(url, data, timeout)
2022-03-01T14:08:53Z   File "/usr/lib/python3.5/urllib/request.py", line 472, in open
2022-03-01T14:08:53Z     response = meth(req, response)
2022-03-01T14:08:53Z   File "/usr/lib/python3.5/urllib/request.py", line 582, in http_response
2022-03-01T14:08:53Z     'http', request, response, code, msg, hdrs)
2022-03-01T14:08:53Z   File "/usr/lib/python3.5/urllib/request.py", line 510, in error
2022-03-01T14:08:53Z     return self._call_chain(*args)
2022-03-01T14:08:53Z   File "/usr/lib/python3.5/urllib/request.py", line 444, in _call_chain
2022-03-01T14:08:53Z     result = func(*args)
2022-03-01T14:08:53Z   File "/usr/lib/python3.5/urllib/request.py", line 590, in http_error_default
2022-03-01T14:08:53Z     raise HTTPError(req.full_url, code, msg, hdrs, fp)
2022-03-01T14:08:53Z urllib.error.HTTPError: HTTP Error 400: Bad Request
2022-03-01T14:08:53Z 10.88.0.9 - - [01/Mar/2022 14:08:53] "GET /recognize?data=Kdy%20slav%C3%AD%20sv%C3%A1tek%20Oto&output=vertical HTTP/1.1" 400 -

Wembeddings error:

2022-03-01T14:08:53Z Traceback (most recent call last):
2022-03-01T14:08:53Z   File "/srv/wembeddings/wembeddings/wembeddings_server.py", line 67, in do_POST
2022-03-01T14:08:53Z     sentences_embeddings = request.server._wembeddings.compute_embeddings(model, sentences)
2022-03-01T14:08:53Z   File "/srv/wembeddings/wembeddings/wembeddings.py", line 143, in compute_embeddings
2022-03-01T14:08:53Z     embeddings_with_parts = model.compute_embeddings(np_subwords, np_segments).numpy()
2022-03-01T14:08:53Z   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1655, in __call__
2022-03-01T14:08:53Z     return self._call_impl(args, kwargs)
2022-03-01T14:08:53Z   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1665, in _call_impl
2022-03-01T14:08:53Z     cancellation_manager)
2022-03-01T14:08:53Z   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1745, in _call_with_structured_signature
2022-03-01T14:08:53Z     return self._filtered_call(args, kwargs, cancellation_manager)
2022-03-01T14:08:53Z   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1848, in _filtered_call
2022-03-01T14:08:53Z     cancellation_manager=cancellation_manager)
2022-03-01T14:08:53Z   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1924, in _call_flat
2022-03-01T14:08:53Z     ctx, args, cancellation_manager=cancellation_manager))
2022-03-01T14:08:53Z   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 550, in call
2022-03-01T14:08:53Z     ctx=ctx)
2022-03-01T14:08:53Z   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
2022-03-01T14:08:53Z     inputs, attrs, num_outputs)
2022-03-01T14:08:53Z tensorflow.python.framework.errors_impl.InvalidArgumentError:  indices[1,3] = -1 is not in [0, 105879)
2022-03-01T14:08:53Z     [[node tf_bert_model/bert/embeddings/Gather (defined at usr/local/lib/python3.6/dist-packages/transformers/models/bert/modeling_tf_bert.py:190) ]] [Op:__inference_compute_embeddings_7904]
2022-03-01T14:08:53Z 
2022-03-01T14:08:53Z Errors may have originated from an input operation.
2022-03-01T14:08:53Z Input Source operations connected to node tf_bert_model/bert/embeddings/Gather:
2022-03-01T14:08:53Z  subwords (defined at srv/wembeddings/wembeddings/wembeddings.py:68)
2022-03-01T14:08:53Z 
2022-03-01T14:08:53Z Function call stack:
2022-03-01T14:08:53Z compute_embeddings
2022-03-01T14:08:53Z 
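
The InvalidArgumentError above reports that the embedding lookup received the index -1 at position [1, 3], while valid indices lie in [0, 105879). A minimal sketch of a check that lists such positions before the lookup runs; the function name and the example batch are hypothetical and not taken from wembeddings, and the -1 values only stand in for whatever produced the invalid id:

```python
def find_invalid_indices(batch, vocab_size):
    """Return (row, column, value) for every id outside [0, vocab_size)."""
    return [
        (row, col, value)
        for row, ids in enumerate(batch)
        for col, value in enumerate(ids)
        if not 0 <= value < vocab_size
    ]


# Hypothetical subword ids for a two-sentence batch; all values are invented.
batch = [
    [101, 2001, 2002, 2003, 2004, 102],
    [101, 2005, 102, -1, -1, -1],
]
invalid = find_invalid_indices(batch, vocab_size=105879)
print(invalid)
```

Running a check like this on the array passed to the model would show whether the bad index comes from the input ids themselves rather than from inside the model.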

Wembeddings full log (with errors from TensorFlow at the beginning):

2022-03-01T14:07:49Z OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
2022-03-01T14:07:50Z 2022-03-01 14:07:50.930771: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2022-03-01T14:07:50Z 2022-03-01 14:07:50.930932: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
2022-03-01T14:07:53Z Starting WEmbeddings server on port 8000.
2022-03-01T14:07:53Z To stop it gracefully, either send SIGINT (Ctrl+C) or SIGUSR1.
2022-03-01T14:08:29Z 2022-03-01 14:08:29.779715: W tensorflow/stream_executor/platform/default/dso_loader.cc:59] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-03-01T14:08:29Z 2022-03-01 14:08:29.779901: W tensorflow/stream_executor/cuda/cuda_driver.cc:312] failed call to cuInit: UNKNOWN ERROR (303)
2022-03-01T14:08:29Z 2022-03-01 14:08:29.780000: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (4e281229591c): /proc/driver/nvidia/version does not exist
2022-03-01T14:08:29Z 2022-03-01 14:08:29.780756: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
2022-03-01T14:08:29Z To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-01T14:08:29Z 2022-03-01 14:08:29.794271: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2591965000 Hz
2022-03-01T14:08:29Z 2022-03-01 14:08:29.795785: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fa6bc155490 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-03-01T14:08:29Z 2022-03-01 14:08:29.796253: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2022-03-01T14:08:42Z Some layers from the model checkpoint at bert-base-multilingual-uncased were not used when initializing TFBertModel: ['mlm___cls', 'nsp___cls']
2022-03-01T14:08:42Z - This IS expected if you are initializing TFBertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
2022-03-01T14:08:42Z - This IS NOT expected if you are initializing TFBertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
2022-03-01T14:08:42Z All the layers of TFBertModel were initialized from the model checkpoint at bert-base-multilingual-uncased.
2022-03-01T14:08:42Z If your task is similar to the task the model of the checkpoint was trained on, you can already use TFBertModel for predictions without further training.
2022-03-01T14:08:52Z WARNING:tensorflow:From /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py:574: calling map_fn_v2 (from tensorflow.python.ops.map_fn) with dtype is deprecated and will be removed in a future version.
2022-03-01T14:08:52Z Instructions for updating:
2022-03-01T14:08:52Z Use fn_output_signature instead
2022-03-01T14:08:53Z Traceback (most recent call last):
2022-03-01T14:08:53Z   File "/srv/wembeddings/wembeddings/wembeddings_server.py", line 67, in do_POST
2022-03-01T14:08:53Z     sentences_embeddings = request.server._wembeddings.compute_embeddings(model, sentences)
2022-03-01T14:08:53Z   File "/srv/wembeddings/wembeddings/wembeddings.py", line 143, in compute_embeddings
2022-03-01T14:08:53Z     embeddings_with_parts = model.compute_embeddings(np_subwords, np_segments).numpy()
2022-03-01T14:08:53Z   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1655, in __call__
2022-03-01T14:08:53Z     return self._call_impl(args, kwargs)
2022-03-01T14:08:53Z   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1665, in _call_impl
2022-03-01T14:08:53Z     cancellation_manager)
2022-03-01T14:08:53Z   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1745, in _call_with_structured_signature
2022-03-01T14:08:53Z     return self._filtered_call(args, kwargs, cancellation_manager)
2022-03-01T14:08:53Z   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1848, in _filtered_call
2022-03-01T14:08:53Z     cancellation_manager=cancellation_manager)
2022-03-01T14:08:53Z   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 1924, in _call_flat
2022-03-01T14:08:53Z     ctx, args, cancellation_manager=cancellation_manager))
2022-03-01T14:08:53Z   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/function.py", line 550, in call
2022-03-01T14:08:53Z     ctx=ctx)
2022-03-01T14:08:53Z   File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/execute.py", line 60, in quick_execute
2022-03-01T14:08:53Z     inputs, attrs, num_outputs)
2022-03-01T14:08:53Z tensorflow.python.framework.errors_impl.InvalidArgumentError:  indices[1,3] = -1 is not in [0, 105879)
2022-03-01T14:08:53Z     [[node tf_bert_model/bert/embeddings/Gather (defined at usr/local/lib/python3.6/dist-packages/transformers/models/bert/modeling_tf_bert.py:190) ]] [Op:__inference_compute_embeddings_7904]
2022-03-01T14:08:53Z 
2022-03-01T14:08:53Z Errors may have originated from an input operation.
2022-03-01T14:08:53Z Input Source operations connected to node tf_bert_model/bert/embeddings/Gather:
2022-03-01T14:08:53Z  subwords (defined at srv/wembeddings/wembeddings/wembeddings.py:68)
2022-03-01T14:08:53Z 
2022-03-01T14:08:53Z Function call stack:
2022-03-01T14:08:53Z compute_embeddings
2022-03-01T14:08:53Z 
2022-03-01T14:08:53Z 10.88.0.10 - - [01/Mar/2022 14:08:53] "POST /wembeddings HTTP/1.1" 400 -

My captured payload sent to wembeddings (before encoding to ascii bytes): {"model": "bert-base-multilingual-uncased-last4", "sentences": [["Kdy", "slav\u00ed", "sv\u00e1tek"], ["Oto"]]}
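
To reproduce the failure against wembeddings alone, without NameTag2 in between, the payload can be rebuilt and posted directly. A minimal sketch: the serialization mirrors the json.dumps(...).encode("ascii") call visible in the NameTag2 traceback, port 8000 and the /wembeddings path come from the wembeddings log, and the localhost host and both function names are assumptions made for this illustration:

```python
import json
import urllib.request


def build_payload(model, sentences):
    """Serialize the request body the way the traceback shows it being built."""
    body = {"model": model, "sentences": sentences}
    return json.dumps(body, ensure_ascii=True).encode("ascii")


def post_to_wembeddings(payload, url="http://localhost:8000/wembeddings"):
    """POST the payload and return the raw response bytes (host is assumed)."""
    # Passing data= makes urllib issue a POST request.
    request = urllib.request.Request(url, data=payload)
    # urlopen raises urllib.error.HTTPError on a 400 response.
    with urllib.request.urlopen(request) as response:
        return response.read()


payload = build_payload(
    "bert-base-multilingual-uncased-last4",
    [["Kdy", "slaví", "svátek"], ["Oto"]],
)
print(payload.decode("ascii"))
```

Calling `post_to_wembeddings(payload)` with the server running should then trigger the same HTTP 400 on an affected version; the response is returned as raw bytes because its format is not described in this thread.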

My environment for NameTag2:
OS: macOS 11.6.4
Container engine: Podman 3.4.4
Using the Dockerfile in branch nametag2: https://github.com/ufal/nametag/blob/nametag2/Dockerfile
Python version in the image: Python 3.5.2 (probably affects 3.5 and lower)

My environment for Wembeddings:
OS: macOS 11.6.4
Container engine: Podman 3.4.4
Using the Dockerfile on the master branch: https://github.com/ufal/wembedding_service/blob/master/Dockerfile

I discovered two more sentences which fail in the same way with the same error, but on http://lindat.mff.cuni.cz/services/nametag/ they work normally; I'm not sure why. The sentences are:

Note: I'm Czech, so we can continue in Czech if you would prefer that :-)

foxik commented 2 years ago

Hi,

the problem has been fixed in https://github.com/ufal/wembedding_service/commit/e55a6d9d75deb17d9433e92ad2adf0c68c71e2c5.

Cheers!

PROrock commented 2 years ago

Thanks for the explanation and mainly for the fix! 👍