shenyunhang / APE

[CVPR 2024] Aligning and Prompting Everything All at Once for Universal Visual Perception
https://arxiv.org/abs/2312.02153
Apache License 2.0
459 stars 28 forks

online demo correct? #17

Open AllenDun opened 6 months ago

AllenDun commented 6 months ago

The performance of the online demo does not seem good (I just picked an ordinary image from the internet). Is there something wrong?

shenyunhang commented 6 months ago

The Hugging Face demo has a slightly different implementation, in which the MultiScaleDeformableAttention operation is the PyTorch version here. As a result, the inference output can differ considerably on some images. You may need to adjust the Score Threshold to get better output.
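To illustrate what adjusting the Score Threshold does, here is a minimal sketch of confidence filtering. It is not APE's or the demo's actual code: the `Detection` class, the `filter_by_score` helper, and the sample predictions are all hypothetical, and the demo may apply its threshold differently.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical container for one predicted box (not APE's real output type)."""
    label: str
    score: float  # model confidence in [0, 1]
    box: tuple    # (x1, y1, x2, y2) in pixels


def filter_by_score(detections, score_threshold):
    """Keep only detections whose confidence is at least score_threshold."""
    return [d for d in detections if d.score >= score_threshold]


# Made-up predictions, for illustration only.
detections = [
    Detection("person", 0.82, (10, 20, 110, 220)),
    Detection("dog", 0.35, (150, 80, 260, 200)),
    Detection("dog", 0.08, (300, 40, 340, 90)),
]

# A lower threshold keeps more (possibly noisy) boxes; a higher one keeps fewer.
for threshold in (0.1, 0.3, 0.5):
    kept = filter_by_score(detections, threshold)
    print(threshold, [(d.label, d.score) for d in kept])
```

If the demo misses objects on your image, lowering the threshold may surface them; if it shows many spurious boxes, raising it may help.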