ymy-k / Hi-SAM

[arXiv preprint] Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation
Apache License 2.0

Poor textline detection quality. #7

decadance-dance closed 3 months ago

decadance-dance commented 6 months ago

Hi, first of all, nice job. I found that the quality of the text-line detection model is poor. More precisely, many lines are simply not segmented.

To reproduce:

python demo_text_detection.py --checkpoint pretrained_checkpoint/line_detection_ctw1500.pth --model-type vit_h --input demo/1.jpg --output demo/ --dataset ctw1500

Images: 1 2

Results: 1 2

ymy-k commented 6 months ago

If you use Hi-SAM-L trained on HierText and produce the text-line-level results in AMG mode, you will get this result. You used the wrong model, one trained on CTW1500. There is a massive gap between CTW1500 and the image you provided.
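A minimal sketch of how the checkpoint and model type could be kept in sync when switching models. The checkpoint paths, model types, and command-line flags are the ones that appear in the commands quoted in this thread; the helper name `build_demo_command` and the lookup table are hypothetical, and the assumption that each checkpoint must be paired with exactly one `--model-type` is not confirmed by the thread.

```python
import shlex

# Checkpoint -> model type, as paired in the commands quoted in this thread.
# Assumption: each checkpoint only works with the model type it is paired with.
CHECKPOINT_MODEL_TYPES = {
    "pretrained_checkpoint/line_detection_ctw1500.pth": "vit_h",
    "pretrained_checkpoint/hi_sam_l.pth": "vit_l",
}


def build_demo_command(checkpoint, input_path, output_dir, dataset):
    """Build the argv list for demo_text_detection.py (hypothetical helper)."""
    # Raises KeyError for a checkpoint that is not in the table above.
    model_type = CHECKPOINT_MODEL_TYPES[checkpoint]
    return [
        "python", "demo_text_detection.py",
        "--checkpoint", checkpoint,
        "--model-type", model_type,
        "--input", input_path,
        "--output", output_dir,
        "--dataset", dataset,
    ]


cmd = build_demo_command(
    "pretrained_checkpoint/hi_sam_l.pth", "demo/1.jpg", "demo/", "ctw1500"
)
# Print a copy-pasteable shell command.
print(shlex.join(cmd))
```

The list returned by the helper can also be passed directly to `subprocess.run` instead of being printed.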

decadance-dance commented 6 months ago

Hi @ymy-k, thanks for the explanation. I'm confused about how to run it properly, though. Now I run:

python demo_text_detection.py --checkpoint pretrained_checkpoint/hi_sam_l.pth --model-type vit_l --input demo/1.jpg --output demo/ --dataset ctw1500

Result: 1

This is better than the previous one, but not as good as yours.

I noticed that you mentioned the AMG mode, but I don't understand how to run it. It seems that the demo_amg.py script is incomplete.

What command did you run to get your result?

ymy-k commented 5 months ago

I used demo_amg.py. If you want to use demo_text_detection.py, try setting fg_points_num = 1500 and score_thresh = 0.5, as in demo_amg.py.
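To illustrate what the two settings above plausibly control, here is a toy sketch in plain Python. It assumes that `fg_points_num` caps how many foreground (text) pixels are sampled as point prompts and that `score_thresh` discards candidate masks with a low predicted score. This is an assumption about the behaviour, not the Hi-SAM implementation; the function names and the toy data are hypothetical.

```python
import random


def sample_foreground_points(mask, max_points, seed=0):
    """Return up to max_points (x, y) coordinates where the mask is truthy."""
    coords = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    ]
    if len(coords) <= max_points:
        return coords
    # Subsample without replacement when there are more pixels than the cap.
    return random.Random(seed).sample(coords, max_points)


def filter_by_score(candidates, score_thresh):
    """Keep (mask_id, score) pairs whose score is strictly above the threshold."""
    # Assumption: strict comparison; the real code may use >= instead.
    return [c for c in candidates if c[1] > score_thresh]


# Values suggested in the thread.
fg_points_num = 1500
score_thresh = 0.5

# Toy binary text mask with four foreground pixels.
toy_mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]
points = sample_foreground_points(toy_mask, fg_points_num)
kept = filter_by_score([("a", 0.91), ("b", 0.42), ("c", 0.5)], score_thresh)
```

Under this reading, a larger `fg_points_num` gives more prompts per image, so fewer text lines are missed, at the cost of more computation.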