Issue with Partial Detection of Pages by Nougat OCR

The Nougat OCR model does not fully transcribe some pages. In many cases, only about half of the content on a page appears in the output and the rest is skipped. Other pages are transcribed correctly.

Steps to Reproduce:

1. Convert the PDF into images (one image per page).
2. Process each image individually with the Nougat OCR model.
3. Observe that some pages are only partially detected, while others are processed correctly.

(This is the notebook I'm following for inference.)
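For reference, the steps above can be sketched roughly as follows. This is not the notebook's exact code. It assumes the Hugging Face `transformers` port of Nougat with the `facebook/nougat-base` checkpoint, and `pdf2image` (which needs poppler) for rendering. The helper names `pdf_to_images`, `load_nougat`, `ocr_page`, and `ocr_pages` are my own and only illustrative.

```python
def pdf_to_images(pdf_path, dpi=200):
    """Step 1: render every PDF page to a PIL image."""
    # Assumption: pdf2image and poppler are installed.
    from pdf2image import convert_from_path
    return convert_from_path(str(pdf_path), dpi=dpi)


def load_nougat(checkpoint="facebook/nougat-base"):
    """Load the processor and model (checkpoint name is an assumption)."""
    from transformers import NougatProcessor, VisionEncoderDecoderModel
    processor = NougatProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
    model.eval()
    return processor, model


def ocr_page(image, processor, model):
    """Step 2: run Nougat on a single page image and return the markup."""
    import torch
    pixel_values = processor(
        images=image.convert("RGB"), return_tensors="pt"
    ).pixel_values
    with torch.no_grad():
        outputs = model.generate(
            pixel_values.to(model.device),
            min_length=1,
            # Cap generation at the decoder's maximum sequence length.
            max_length=model.config.decoder.max_position_embeddings,
            bad_words_ids=[[processor.tokenizer.unk_token_id]],
        )
    text = processor.batch_decode(outputs, skip_special_tokens=True)[0]
    # post_process_generation needs the nltk and Levenshtein packages.
    return processor.post_process_generation(text, fix_markdown=False)


def ocr_pages(images, ocr_fn):
    """Apply ocr_fn to each page; keys are 1-based page numbers."""
    return {number: ocr_fn(image) for number, image in enumerate(images, start=1)}
```

Usage would look something like `processor, model = load_nougat()` followed by `ocr_pages(pdf_to_images("paper.pdf"), lambda im: ocr_page(im, processor, model))`, where `paper.pdf` is a placeholder path.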

Example Results:
First Example: For this page (Answers Snippets to Papers_page-0008):

 ```
 ## Answers (LC2020 HL, P2):
 1. \(0\); \(A\), \(B\) and \(C\) are collinear [0, 4, 7, 11, 15]
 2. \(33\cdot 435^{\circ}\)[0, 4, 7, 11, 15]
 3. \(9\)[0, 4, 7, 11, 15]
 4. \(x^{2}+y^{2}+4x-21=0\), \(x^{2}+y^{2}-8x-9=0\)[0, 4, 7, 11, 15]
 5. \(6\cdot 44\) m [0, 4, 7, 11, 15]
 6. \(k=9\)[0, 4, 7, 11, 15]
 7. \(\frac{5\pi}{3}\), \
 ```
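The output above stops mid-expression, with a lone trailing backslash. To find the affected pages without reading each one, a rough heuristic like the following may help. It is only a sketch and assumes that truncated output tends to leave either an unclosed `\(` or a dangling backslash; `looks_truncated` is a name I made up.

```python
def looks_truncated(text):
    """Heuristic: True if the OCR output appears to stop mid-expression."""
    stripped = text.rstrip()
    # Unclosed inline math: more "\(" openers than "\)" closers.
    unbalanced_math = stripped.count("\\(") > stripped.count("\\)")
    # A single trailing backslash (a doubled one is a LaTeX line break).
    dangling_backslash = stripped.endswith("\\") and not stripped.endswith("\\\\")
    return unbalanced_math or dangling_backslash
```

This will not catch every case: output that is cut off cleanly at the end of a list item has balanced delimiters and passes the check, so a clean result here does not prove the page is complete.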

Second Example: For this page:
```
 ## Answers (LC 2019 HL, P2):
 1. (i) \(\frac{48}{95}\) [**0, 4, 7, 10**], (ii) \(\frac{88}{969}\) [**0, 4, 5, 8, 10**]
 2. 1400 [**0, 4, 7, 10**]
 3. Show [**0, 4, 7, 10**]
 4. (i) \(mx-y-6m=0\) [**0, 2, 5**], (ii) \(P\bigg{(}\frac{18m+25}{3m+4}\), \(\frac{m}{3m+4}\bigg{)}\) [**0, 4, 7, 11, 15**]
```
Expected Behavior: The OCR model should consistently detect all parts of each page, rather than only detecting part of the content.

Question: Is there any preprocessing that needs to be done to ensure complete page detection? Or are there specific parameters that should be adjusted in Nougat OCR to improve the results?

facebookresearch/nougat, issue #244