google-research-datasets / hiertext

The HierText dataset contains ~12k images from the Open Images dataset v6 with a large number of text entities. We provide word-, line-, and paragraph-level annotations.
Creative Commons Attribution Share Alike 4.0 International

Line based model results with different number of queries #16

Closed by Asafgendler 1 year ago

Asafgendler commented 1 year ago

Hello,

Could you publish the validation-set results of the line-based model for different numbers of queries (128, 256, 384)?

Jyouhou commented 1 year ago

There you go:

| model | line P | line R | line F | line T | line PQ | paragraph P | paragraph R | paragraph F | paragraph T | paragraph PQ |
| -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- |
| q=128 | 0.7523718577 | 0.7221369139 | 0.7369444006 | 0.7910638841 | 0.5829700999 | 0.7214038677 | 0.6085080791 | 0.6601641302 | 0.7906753779 | 0.5219755231 |
| q=256 | 0.7513597881 | 0.7563951481 | 0.75386906 | 0.788932645 | 0.5947519115 | 0.7338229663 | 0.6083403567 | 0.6652158011 | 0.7879685613 | 0.5241691378 |
| q=384 | 0.7832696391 | 0.7685009939 | 0.7758150378 | 0.7952970247 | 0.6170033913 | 0.7679124437 | 0.6089638043 | 0.6792634905 | 0.7963967498 | 0.5409632361 |

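
For readers who want to sanity-check the figures above, here is a minimal Python sketch. It assumes that F is the harmonic mean of P and R, that T is a tightness score, and that PQ is the product F × T. These relations are an assumption based on the column names and on the numbers themselves, not something stated in this thread. The function names are hypothetical and are not part of the HierText evaluation code.

```python
# Scores copied from the comment above: (P, R, F, T, PQ) per query count.
SCORES = {
    "line": {
        128: (0.7523718577, 0.7221369139, 0.7369444006, 0.7910638841, 0.5829700999),
        256: (0.7513597881, 0.7563951481, 0.75386906, 0.788932645, 0.5947519115),
        384: (0.7832696391, 0.7685009939, 0.7758150378, 0.7952970247, 0.6170033913),
    },
    "paragraph": {
        128: (0.7214038677, 0.6085080791, 0.6601641302, 0.7906753779, 0.5219755231),
        256: (0.7338229663, 0.6083403567, 0.6652158011, 0.7879685613, 0.5241691378),
        384: (0.7679124437, 0.6089638043, 0.6792634905, 0.7963967498, 0.5409632361),
    },
}


def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def panoptic_quality(f, tightness):
    """Assumed relation (not confirmed in this thread): PQ = F * T."""
    return f * tightness


def max_deviation(scores):
    """Largest absolute gap between the reported F / PQ and the recomputed values."""
    worst = 0.0
    for level in scores.values():
        for p, r, f, t, pq in level.values():
            worst = max(
                worst,
                abs(f_score(p, r) - f),
                abs(panoptic_quality(f, t) - pq),
            )
    return worst


if __name__ == "__main__":
    print(f"max deviation: {max_deviation(SCORES):.2e}")
```

A small maximum deviation indicates that the reported columns are consistent with each other under the assumed relations. It says nothing about why these figures differ from the scores shipped in the repo.
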
Asafgendler commented 1 year ago

Thanks,

Are these on the validation set? I ask because the results for the 384-query model differ from those in the sample_eval_scores.txt file.

Jyouhou commented 1 year ago

The scores in this repo are from the open-source model.

When we converted the internal code and model to open source, there was some loss. The figures I just gave you are from our early research.

Asafgendler commented 1 year ago

OK, thanks