neuralmagic / deepsparse

Sparsity-aware deep learning inference runtime for CPUs
https://neuralmagic.com/deepsparse/

Broken Transformers QA Inference Pipeline #825

Closed dbogunowicz closed 1 year ago

dbogunowicz commented 1 year ago

Describe the bug

Transformers QA pipeline fails on a simple inference task.

Expected behavior

The inference pipeline for Question Answering should work without raising any errors.

Environment

Python version: 3.8
DeepSparse version: current main

To Reproduce

from deepsparse import Pipeline

task = "question-answering"
dense_qa_pipeline = Pipeline.create(
    task=task,
    model_path="zoo:nlp/question_answering/distilbert-none/pytorch/huggingface/squad/base-none",
    # or model_path = "zoo:nlp/question_answering/bert-base/pytorch/huggingface/squad/base-none",
    # was checking whether the problem is not model-dependent
)

question = "DeepSparse is sparsity-aware inference runtime offering GPU-class performance on CPUs and APIs to integrate ML into your application"
q_context = "What is DeepSparse?"

dense_qa_pipeline(question=question, context=q_context)

Errors

None of PyTorch, TensorFlow >= 2.0, or Flax have been found. Models won't be available and only tokenizers, configuration and file/data utilities can be used.
DeepSparse, Copyright 2021-present / Neuralmagic, Inc. version: 1.3.0.20221217 COMMUNITY | (d5bf112b) (release) (optimized) (system=avx2, binary=avx2)
Traceback (most recent call last):
  File "/usr/lib/python3.8/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "/home/ubuntu/.pycharm_helpers/pydev/_pydev_bundle/pydev_umd.py", line 198, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/home/ubuntu/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/ubuntu/damian/deepsparse_copy/hehe.py", line 14, in <module>
    dense_output = dense_qa_pipeline(question=question, context=q_context)
  File "/home/ubuntu/damian/deepsparse_copy/src/deepsparse/pipeline.py", line 217, in __call__
    engine_inputs: List[numpy.ndarray] = self.process_inputs(pipeline_inputs)
  File "/home/ubuntu/damian/deepsparse_copy/src/deepsparse/transformers/pipelines/question_answering.py", line 261, in process_inputs
    {
  File "/home/ubuntu/damian/deepsparse_copy/src/deepsparse/transformers/pipelines/question_answering.py", line 262, in <dictcomp>
    key: numpy.array(tokenized_example[key][span])
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (128,) + inhomogeneous part.

Additional context

The error occurs in https://github.com/neuralmagic/deepsparse/blob/main/src/deepsparse/transformers/pipelines/question_answering.py#L260.

It is raised while evaluating this dictionary comprehension:

{
    key: numpy.array(tokenized_example[key][span])
    for key in tokenized_example.keys()
    if key not in self.onnx_input_names
}

Here:

self.onnx_input_names = ['input_ids', 'attention_mask', 'token_type_ids']
tokenized_example.keys() = ['input_ids', 'token_type_ids', 'attention_mask', 'special_tokens_mask', 'offset_mapping', 'overflow_to_sample_mapping', 'example_id']
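As a quick check, the keys that survive the filter can be computed from the two lists reported above. This is only a sketch using plain Python lists copied from this report; the names onnx_input_names, tokenized_keys and extra_keys are local illustrations, not pipeline attributes.

```python
# Values copied from the report above.
onnx_input_names = ["input_ids", "attention_mask", "token_type_ids"]
tokenized_keys = [
    "input_ids",
    "token_type_ids",
    "attention_mask",
    "special_tokens_mask",
    "offset_mapping",
    "overflow_to_sample_mapping",
    "example_id",
]

# Same filter as the comprehension: keep keys that are NOT ONNX inputs.
extra_keys = [key for key in tokenized_keys if key not in onnx_input_names]
print(extra_keys)
```

Every key left in extra_keys is passed through numpy.array(...), so any one of them holding non-rectangular data is enough to break the comprehension.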

As a result, we iterate over the keys of tokenized_example that are not in self.onnx_input_names. One of these keys, offset_mapping, is the culprit:

[tokenized_example[key][0] for key in ['offset_mapping']]

results in:

[image: output of the expression above]

Calling numpy.array(...) on this data structure raises the error in question.
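The failure can be sketched outside the pipeline. The snippet below assumes that the offset mapping mixes (start, end) tuples with None entries, which is common in Hugging Face-style QA preprocessing but is not confirmed by this report, since the screenshot did not survive. The offset_mapping value and both helper functions are hypothetical. Recent NumPy releases raise ValueError for such ragged input, while older ones only warn, so the first helper reports what the installed version does instead of assuming it. Building an object array element by element is one possible workaround and is not necessarily the fix adopted in the linked PR.

```python
import numpy as np

# Hypothetical stand-in for one span's offset_mapping: (start, end) pairs
# for some tokens and None for the others (an assumption, not the real data).
offset_mapping = [(0, 5), None, (6, 9)]


def converts_cleanly(values):
    """Return True if numpy.array(values) does not raise ValueError."""
    try:
        np.array(values)
    except ValueError:
        return False
    return True


def to_object_array(values):
    """Build a 1-D object array that stores each element unchanged."""
    out = np.empty(len(values), dtype=object)
    for i, value in enumerate(values):
        out[i] = value
    return out


safe = to_object_array(offset_mapping)
print(converts_cleanly(offset_mapping), safe.shape)
```

Filling the array one element at a time keeps the result one-dimensional even when every entry happens to be a tuple of the same length, which a plain numpy.array(values, dtype=object) call would turn into a 2-D array.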

Interestingly, when @mwitiderrick tried to reproduce the error in a Colab notebook (using the last release rather than main), the problem did not occur: https://colab.research.google.com/drive/1aIrITYxgcR-5VmL4vm8P-6H4rvCBAeaX?usp=sharing However, it reappeared (on the last release) when he ran the transformers QA pipeline in HF/Gradio: https://huggingface.co/spaces/neuralmagic/question-answering/blob/main/app.py

jeanniefinks commented 1 year ago

Hi @dbogunowicz Could you let us know if this is still a known issue? Thank you. -Jeannie / Neural Magic

dbogunowicz commented 1 year ago

Hey, @jeanniefinks. Yeah, this issue is still valid. We have managed to suppress it by pinning the numpy version across all the repos. However, once we decide to upgrade numpy, the problem will come back. There is an open PR to fix it, but sadly I have not had a chance to work on it for a while: https://github.com/neuralmagic/deepsparse/pull/827

jeanniefinks commented 1 year ago

This has been addressed so I'll be closing this thread. Thank you!