huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Indices element out of bounds from inclusive range #28411

Closed nogifeet closed 10 months ago

nogifeet commented 10 months ago

System Info

Hello, we are using the TrOCR model exported to ONNX. We notice a problem with the large checkpoints, both printed and handwritten, when we run inference using the onnxruntime Java library.

Dataset: IAM handwritten (Lines)

Different behaviours are observed on CPU and GPU:

CPU: we may get an error like the one below.

      Status Message: Non-zero status code returned while running the Gather node. Name:'Gather_346' Status Message: 
      indices element out of data bounds, idx=514 must be within the inclusive range [-514,513]
            at ai.onnxruntime.OrtSession.run(Native Method)
            at ai.onnxruntime.OrtSession.run(OrtSession.java:301)
            at ai.onnxruntime.OrtSession.run(OrtSession.java:242)

GPU: We notice that the end token is not generated and the decoder keeps repeating the tokens after a point.

This is the main problem: usually the Gather_346 and Gather_320 operators fail and throw an out-of-data-bounds error.
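The range in the error message, [-514,513], implies that the tensor being gathered from has 514 entries along the gather axis, so idx=514 is exactly one past the last valid entry. A minimal Python sketch of that bounds check follows. `gather_rows` is a hypothetical helper written for illustration, not an onnxruntime API, and which tensor Gather_346 actually reads (for example, a position-embedding table) is an assumption that the trace alone does not confirm.

```python
def gather_rows(table, indices):
    """Mimic a Gather along axis 0, including an inclusive bounds check."""
    n = len(table)
    lo, hi = -n, n - 1  # valid inclusive index range for an axis of size n
    for idx in indices:
        if idx < lo or idx > hi:
            raise IndexError(
                f"indices element out of data bounds, idx={idx} "
                f"must be within the inclusive range [{lo},{hi}]"
            )
    return [table[idx] for idx in indices]


# Assumed size: 514 rows, which is what the range [-514,513] implies.
table = [[0.0] * 4 for _ in range(514)]

gather_rows(table, [0, 513, -514])  # all three indices are in range

try:
    gather_rows(table, [514])  # one past the last row
except IndexError as err:
    message = str(err)
    print(message)
```

If the failing Gather does read a fixed-size table indexed by decoding step, the error would appear only once generation runs long enough to step past the table's end, which would be consistent with the failure showing up when the end token is never produced.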

We have also noticed different behaviour when we turn caching on or off. Note that we don't face this problem on the base or small checkpoints, only on the "large" checkpoints. We would like to understand whether this is an onnxruntime issue or a Hugging Face one; please let me know.

A similar issue was raised on the onnxruntime issue tracker: https://github.com/microsoft/onnxruntime/issues/2080

Who can help?

No response

Reproduction

  1. Export the large TrOCR checkpoints to ONNX.
  2. Run inference on a sample line image from the IAM dataset (attached).
  3. Use image e04-083-00.
  4. Do not set a max_length limit; you will notice that the end token is never generated and the tokens repeat.
  5. Current Output: The edges of the transoms should be bevelled to be edges to the edges of the
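Step 4 removes the length cap, so a decoder that never emits the end token keeps running until something else fails. A minimal Python sketch of a capped greedy loop is shown here for comparison. It assumes a hypothetical `step_fn` that wraps one decoder call (for instance a single `OrtSession.run` in the Java setup) and returns the next token id; the token ids and the cap of 16 are placeholders, not values from the TrOCR checkpoints.

```python
from typing import Callable, List


def greedy_decode(
    step_fn: Callable[[List[int]], int],
    start_id: int,
    eos_id: int,
    max_length: int,
) -> List[int]:
    """Greedy loop that stops on EOS or once max_length tokens exist."""
    tokens = [start_id]
    while len(tokens) < max_length:
        next_id = step_fn(tokens)  # one decoder step (hypothetical wrapper)
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens


def never_stops(tokens: List[int]) -> int:
    """Stand-in for a decoder that keeps repeating a token and never emits EOS."""
    return 7


# Placeholder ids and cap, chosen only for illustration.
out = greedy_decode(never_stops, start_id=2, eos_id=0, max_length=16)
print(len(out))
```

A cap like this only bounds the loop; it does not explain why the end token is missing on the large checkpoints, which is the open question in this issue.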

Expected behavior

Current Output: The edges of the transoms should be bevelled to be edges to the edges of the

Expected Output: The edges of the transoms should be bevelled to

ArthurZucker commented 10 months ago

Hey! Thanks for opening an issue, but without an isolated reproducible snippet and a traceback showing that this is an issue in transformers, I am not sure how we can help! 🤗

linglongxian commented 2 months ago

@nogifeet hi, did you solve this problem? I hit a similar problem when running trocr_onnx with onnxruntime (C++, CPU). Can you share your solution for reference? Thanks.

nogifeet commented 2 months ago

@linglongxian
https://github.com/huggingface/transformers/issues/28550