Open AllenDun opened 6 months ago
The Hugging Face demo has a slightly different implementation: its MultiScaleDeformableAttention operation is the PyTorch version linked here.
As a result, inference results can differ substantially on some images.
You may need to adjust the Score Threshold to get better output.
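To illustrate what changing the Score Threshold does, here is a minimal sketch. It is not the demo's actual code: the `Detection` type, the `filter_by_score` helper, and the sample scores are all hypothetical and only show how a higher or lower threshold changes which boxes are kept.

```python
from typing import List, NamedTuple, Tuple


class Detection(NamedTuple):
    """Hypothetical container for one predicted box (not the demo's own type)."""
    label: str
    score: float  # model confidence in [0, 1]
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def filter_by_score(detections: List[Detection], threshold: float) -> List[Detection]:
    """Keep only detections whose confidence is at least `threshold`."""
    return [d for d in detections if d.score >= threshold]


# Made-up predictions, purely for illustration.
detections = [
    Detection("cat", 0.82, (10.0, 20.0, 110.0, 220.0)),
    Detection("dog", 0.41, (150.0, 30.0, 300.0, 240.0)),
    Detection("chair", 0.23, (5.0, 5.0, 60.0, 90.0)),
    Detection("person", 0.12, (200.0, 10.0, 260.0, 120.0)),
]

# Sweep a few thresholds to see how many boxes survive at each setting.
for threshold in (0.2, 0.3, 0.5):
    kept = filter_by_score(detections, threshold)
    print(threshold, [d.label for d in kept])
```

A lower threshold keeps more (possibly noisy) boxes, while a higher one keeps only the most confident predictions, so it is worth trying a few values on the image that looks wrong.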
The performance of the online demo does not seem good (I just picked an ordinary image from the internet). Is there something wrong?