Open nhoffman opened 2 months ago
This is a really interesting edge case. I think the challenge is the "mostly regular text with some inverted". Some ideas:
Thanks a lot for the suggestions. I'd love to give the fine-tuning approach a shot, but I'm not sure where to start. I know it's a big topic, but can you suggest:
a) a general resource describing how I would go about fine-tuning the text detection model (e.g., an overview of the process, how many examples you think might be sufficient, and whether I should provide examples cropped to the white-on-black text or examples in context);
b) in the context of this project, where the model is specified (I assume it downloads a model from Hugging Face, but I can't find where this configuration is located), and how I would update the configuration to refer to the fine-tuned model.
I'd certainly be happy to document the process for anyone else with a need for something similar.
Thanks a lot for any help!
Hi there - I am looking into parsing laboratory test results (unfortunately, results are often received as PDFs), and performance seems to be great except in one very specific context: a report that I'm looking at contains a critical element with white text on a black background. In this case the text is either not detected or read incorrectly. I'm a bit limited in what I can share, so this is lacking context, but here is an example of a failure to detect text:
Incorrect results:
Any suggestions on settings or pre-processing strategies that might help?
Thanks a lot!
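
For anyone landing here with the same problem, one pre-processing idea that might be worth trying is to invert only the dark regions of the page before running OCR, so the white-on-black element becomes ordinary dark-on-light text while the rest of the page is left alone. The snippet below is a minimal sketch of that idea, not something taken from this project: the function name `invert_dark_tiles`, the tile size, and the brightness threshold are all hypothetical, and it works on a plain list-of-rows grayscale image (values 0 to 255) so that it has no dependencies. In practice the same logic would probably be applied through an imaging library after rendering the PDF page to an image.

```python
def invert_dark_tiles(gray, tile=32, threshold=128):
    """Return a copy of `gray` with every mostly-dark tile inverted.

    `gray` is a list of rows of integers in 0..255 (a grayscale image).
    `tile` and `threshold` are illustrative defaults, not tuned values.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    out = [row[:] for row in gray]  # leave the input untouched

    for top in range(0, height, tile):
        for left in range(0, width, tile):
            rows = range(top, min(top + tile, height))
            cols = range(left, min(left + tile, width))

            # Mean brightness of the tile; a low mean suggests a dark
            # background, i.e. possible light-on-dark text.
            total = sum(gray[r][c] for r in rows for c in cols)
            count = len(rows) * len(cols)

            if total / count < threshold:
                for r in rows:
                    for c in cols:
                        out[r][c] = 255 - gray[r][c]
    return out


# Tiny example: a 4x4 "page" whose top-left 2x2 tile is a black box
# containing one white "text" pixel; everything else is white paper.
page = [
    [0, 255, 255, 255],
    [0, 0, 255, 255],
    [255, 255, 255, 255],
    [255, 255, 255, 255],
]
fixed = invert_dark_tiles(page, tile=2)
```

A fixed tile grid is the simplest possible choice and will cut through boxes that do not line up with it; whether this helps at all depends on how the detection model handles the resulting edges, so it is best treated as a quick experiment to run before investing in fine-tuning.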