The core of dhSegment is pixel-wise class prediction.
What can be done with these predictions is almost always task-dependent. We provide some common post-processing functions (https://dhsegment.readthedocs.io/en/latest/reference/post_processing.html), but you will most likely have to adapt them to what you want.
For instance, generating non-overlapping rectangles from the prediction map is a problem without a single clear solution.
As far as confidence is concerned, each pixel carries a probability value, so you can leverage that.
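To illustrate the point about per-pixel probabilities, here is a minimal sketch of turning them into a region-level confidence score. The `probs` array (H x W x n_classes softmax output) and the function name are assumptions for illustration, not dhSegment's actual API:

```python
# Sketch: score a candidate region by the mean predicted probability of its
# class. `probs` is a stand-in for the network's per-pixel softmax output
# (H x W x n_classes); this is an assumed layout, not dhSegment's exact format.
import numpy as np

def region_confidence(probs: np.ndarray, mask: np.ndarray, class_idx: int) -> float:
    """Mean probability of `class_idx` over the pixels selected by `mask`."""
    selected = probs[..., class_idx][mask.astype(bool)]
    if selected.size == 0:
        return 0.0
    return float(selected.mean())

# Toy example: a 4x4 map with 2 classes, class 1 at probability 0.8 everywhere.
probs = np.zeros((4, 4, 2))
probs[..., 0] = 0.2
probs[..., 1] = 0.8
mask = np.zeros((4, 4), dtype=np.uint8)
mask[1:3, 1:3] = 1
print(region_confidence(probs, mask, 1))  # -> 0.8
```

A mean is the simplest aggregate; depending on how noisy the maps are, a minimum or a low percentile over the region may be a more conservative confidence measure.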
Thanks, Benoit.
I improved the classification by working with simpler labeled regions. Previously I labeled "beginning of article", but this turned out to be complex and noisy because other, simpler labeled areas overlapped it visually. Using only simple, visually distinct annotations like {title, author}, plus post-processing that requires both to be present on one page, I have something that could work. If the content were all in the same font and font size this might not work, but I don't see that (at least not yet).
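The page-level rule described above (accept a page only when both a title and an author region were detected) can be sketched as follows; the `detections` structure is a made-up stand-in, not a dhSegment API:

```python
# Sketch of the post-processing rule: a page counts as an article start only
# if both a "title" and an "author" region were detected on it.
# The list-of-dicts detection format is hypothetical.
def is_article_start(detections: list) -> bool:
    """detections: e.g. [{"label": "title"}, {"label": "author"}, ...]"""
    labels = {d["label"] for d in detections}
    return {"title", "author"} <= labels

print(is_article_start([{"label": "title"}, {"label": "author"}]))  # -> True
print(is_article_start([{"label": "title"}]))                        # -> False
```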
Glad to hear you managed to improve your pipeline.
I am attempting CLASSIFICATION now, not MULTILABEL (issue https://github.com/dhlab-epfl/dhSegment/issues/29 was helpful in pointing out that mutually exclusive areas call for classification, not multilabel; this is clear in retrospect ;^)
Now I need to extract rectangles, and here I have hit a big gap in dhSegment. The demo.py code shows how to generate the rectangle corresponding to a skewed page, but it handles only one class. I modified demo.py to identify rectangles for each label; with multiple classes, there can be spurious, overlapping rectangles.
How can I extract clean, per-class rectangles? The end result I want is one or more JPEGs associated with a particular class label, plus their coordinates within the input image.
Perhaps the labels plane in the prediction result offers some help here? demo.py does not use the labels plane.
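One possible direction, sketched below under assumptions: if the labels plane is an H x W array of integer class ids (argmax over the class probabilities, with 0 as background), you can take one bounding box per class and crop the corresponding patches. The array layout, the `min_area` threshold, and the function names are all assumptions, not dhSegment's documented output:

```python
# Sketch: derive per-class bounding boxes from an integer label map
# (H x W, 0 = background) and crop the matching image patches.
# Assumed layout; not dhSegment's exact prediction format.
import numpy as np

def bounding_boxes(labels: np.ndarray, min_area: int = 50) -> dict:
    """Return {class_id: (x, y, w, h)} for every non-background class.

    Note: this takes one box per class. If a class can appear in several
    disjoint regions on a page, split the mask first (e.g. with OpenCV's
    cv2.connectedComponents) and take one box per component.
    """
    results = {}
    for class_id in np.unique(labels):
        if class_id == 0:  # skip background
            continue
        ys, xs = np.nonzero(labels == class_id)
        if ys.size < min_area:  # drop tiny, likely spurious regions
            continue
        x, y = int(xs.min()), int(ys.min())
        w, h = int(xs.max()) - x + 1, int(ys.max()) - y + 1
        results[int(class_id)] = (x, y, w, h)
    return results

def crop_regions(image: np.ndarray, boxes: dict):
    """Yield (class_id, (x, y, w, h), crop); crops can then be saved as JPEGs."""
    for class_id, (x, y, w, h) in boxes.items():
        yield class_id, (x, y, w, h), image[y:y + h, x:x + w]

# Toy label map: class 1 occupies a 20x20 block.
labels = np.zeros((100, 100), dtype=np.int32)
labels[10:30, 10:30] = 1
print(bounding_boxes(labels))  # -> {1: (10, 10, 20, 20)}
```

The area threshold also gives a crude way to suppress the spurious rectangles mentioned above; combining it with a per-region confidence score from the probability map would filter more reliably.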