Closed fattynoparents closed 7 months ago
Hi, thanks for your interest. The reason you are seeing all 0 is likely that there are NaN values in the output when using float16. We chose bfloat16 because it has the same value range as float32, and the current public model was trained in float32. Where the NaN values come from exactly is not yet clear to me, but to work around the problem, simply disable automatic mixed precision (AMP) by setting either MODEL.AMP_TEST.PRECISION float32 or MODEL.AMP_TEST.ENABLED False.
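To illustrate why float16 can produce NaNs where bfloat16 and float32 do not: float16 tops out around 65504, so activations that are perfectly representable in float32 overflow to inf, and operations like inf - inf then yield NaN. A minimal sketch with NumPy (NumPy has no native bfloat16, so only float16 vs. float32 is shown):

```python
import numpy as np

# float32 comfortably holds 1e5, but float16's max finite value is ~65504,
# so the same number overflows to inf when cast down.
x = np.float32(1e5)
x16 = np.float16(x)
print(np.isinf(x16))   # True: overflowed to inf in float16

# Once an inf appears, common ops turn it into NaN, which propagates
# through the network and can zero out every detection.
print(np.isnan(x16 - x16))  # True: inf - inf = nan
```

bfloat16 trades precision (fewer mantissa bits) for the same 8-bit exponent as float32, so it keeps float32's range and avoids this overflow path entirely.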
This should give you the expected output on the GPU, with a minor performance hit in speed and memory usage compared to float16, but almost certainly still a massive speedup over running on the CPU.
Let me know if this issue persists with float32; otherwise, please change the title so others with the same problem can more easily find this issue. I will also update the docs when I have time.
Thanks a lot for the quick reply, setting it to float32 has solved the issue.
Hi, when I try to run the very first round of image recognition using test models suggested in the README, it works fine (even if slow) with my CPU. Here's some of the output of the correctly working code:
Textlines to match are greater than 0, which leads to correct segmentation and, consequently, correct work of loghi-htr.
However, when I try to run it using my GPU (Nvidia GeForce GTX 1650 Ti), the Laypa part fails to recognize lines for the segmentation. Here's some of the output:
Textlines to match are always 0. The only thing I had to change in the na-pipeline.sh file for the Laypa recognition is this -
MODEL.AMP_TEST.PRECISION float16
because otherwise I got an error that my system doesn't support bfloat16 and a suggestion to switch to float16. Could someone please help me with that? Thanks a lot in advance.
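Putting the maintainer's workaround next to the change described above, the relevant override in na-pipeline.sh would look roughly like this (only the two config keys are from the thread; the exact command line surrounding them in the script may differ):

```shell
# In na-pipeline.sh, instead of the float16 override that triggers NaNs:
#   MODEL.AMP_TEST.PRECISION float16
# use full precision for the AMP test pass:
MODEL.AMP_TEST.PRECISION float32
# or disable automatic mixed precision at inference entirely:
MODEL.AMP_TEST.ENABLED False
```

Either option avoids the float16 overflow path on GPUs (like the GTX 1650 Ti) that lack bfloat16 support, at the cost of somewhat higher memory use and slightly slower inference.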