lmmx / page-dewarp

Document image dewarping library using a cubic sheet model
MIT License

Sorting by contour info score #6

Closed Anphisa closed 2 years ago

Anphisa commented 2 years ago

I ran into a problem when using page-dewarp on an image that had too few spans from contours. During span detection from line detection, line 65 in spans.py raises an error: TypeError: '<' not supported between instances of 'ContourInfo' and 'ContourInfo' (Test image used: https://imgur.com/Pa92Hks)

With this fix, the code runs and produces good results, so I believe I'm sorting by the score that you refer to in the comment on that same line 65.
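For readers without the diff at hand, here is a minimal sketch of what sorting by score alone might look like. The ContourInfo class below is a hypothetical stand-in, and the (score, ContourInfo, ContourInfo) tuple layout and the candidate_edges contents are assumptions for illustration, not the actual code in spans.py.

```python
class ContourInfo:
    """Hypothetical stand-in for the real class: it defines no ordering methods."""

    def __init__(self, name):
        self.name = name


a, b, c = ContourInfo("a"), ContourInfo("b"), ContourInfo("c")

# Assumed layout: (score, ContourInfo, ContourInfo), with two tied scores
candidate_edges = [(3.0, a, b), (2.0, b, c), (2.0, a, c)]

# Sort on the score only, so the ContourInfo objects are never compared
candidate_edges.sort(key=lambda edge: edge[0])

scores = [edge[0] for edge in candidate_edges]
names = [(edge[1].name, edge[2].name) for edge in candidate_edges]
```

Because Python's sort is stable, edges with equal scores keep their original relative order, so ties are resolved by input order rather than by anything in the contours themselves.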

lmmx commented 2 years ago

Thank you for your contribution @Anphisa !

lmmx commented 2 years ago

On review, I expect this arises from a tie in the first element, which then has to be broken by the 2nd element (a page_dewarp.contours.ContourInfo object which, as you note, cannot be compared). Indeed, I see this happens:

(Pdb) for c in sorted(candidate_edges, key=lambda c:c[0]): print(c[0])
2.0
2.0
2.0
2.0
2.0
2.0
2.0
2.472606408678394
2.4746120559791636
3.0
3.0
3.0
3.0
...
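The tie mechanism can be reproduced in isolation. The sketch below uses a hypothetical stand-in class rather than the real page_dewarp.contours.ContourInfo, and two-element tuples for brevity; it only illustrates how Python compares tuples element by element.

```python
class ContourInfo:
    """Hypothetical stand-in with no ordering methods defined."""


x, y = ContourInfo(), ContourInfo()

# Different scores: the comparison is decided by the first element alone
distinct_ok = (2.0, x) < (3.0, y)

# Tied scores: Python moves on to comparing x < y, which is not supported
try:
    (2.0, x) < (2.0, y)
    tie_error = None
except TypeError as exc:
    tie_error = str(exc)
```

This would explain why the error only shows up on images where at least two candidate edges happen to share a score.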

In future it might be worth using info from those objects to break the tie in a non-arbitrary way. I'll add a note for this, but for now I'm going with your approach. (Non-arbitrary tie-breaking could be done by implementing comparison operators on the ContourInfo class.)
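As a rough sketch of that idea, a class that defines __lt__ lets a plain tuple sort break score ties by itself. Everything here is assumed for illustration: the stand-in class, the left_x attribute used as the tie-breaker, and the tuple layout are hypothetical, and which real attribute would make a sensible tie-breaker is left open.

```python
class ContourInfo:
    """Hypothetical stand-in, not the real page_dewarp class."""

    def __init__(self, name, left_x):
        self.name = name
        self.left_x = left_x  # assumed tie-break attribute, for illustration only

    def __lt__(self, other):
        # list.sort() and tuple comparison only need '<' to order elements
        if not isinstance(other, ContourInfo):
            return NotImplemented
        return self.left_x < other.left_x


a = ContourInfo("a", 40)
b = ContourInfo("b", 10)
c = ContourInfo("c", 25)

# Assumed layout: (score, ContourInfo, ContourInfo), with three tied scores
candidate_edges = [(2.0, a, c), (3.0, b, a), (2.0, b, c), (2.0, c, a)]

# No key needed: tied scores fall through to ContourInfo.__lt__
candidate_edges.sort()

order = [(score, left.name, right.name) for score, left, right in candidate_edges]
```

With this in place the order among tied edges depends on the contours rather than on input order, at the cost of giving ContourInfo an ordering that any other code comparing these objects would also pick up.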

lmmx commented 2 years ago

See e502980