About segmentation outputs

OpenGVLab / VisionLLM

VisionLLM Series

https://arxiv.org/abs/2305.11175

Apache License 2.0

About segmentation outputs #6

Open kahnchana opened 1 year ago

kahnchana commented 1 year ago

Are the segmentation outputs (coordinates) predicted directly by the network as floating-point numbers under the next-token prediction loss? This part is quite unclear in the paper.

Or are they regressed (using the bin tokens) from anchor points?
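
To make the second option concrete, here is a minimal sketch of what regressing from anchor points via bin tokens could look like. It only illustrates the question and is not taken from the VisionLLM code or paper: the function names, the number of bins (513), and the offset range are all assumptions.

```python
# Hypothetical illustration: encode a coordinate as a discrete bin index
# relative to an anchor point, then decode it back.
# num_bins and max_offset are assumed values, not VisionLLM's settings.

def offset_to_bin(coord, anchor, num_bins=513, max_offset=1.0):
    """Quantize the offset (coord - anchor) into an integer bin index."""
    # Clamp the offset to the representable range [-max_offset, max_offset].
    offset = min(max(coord - anchor, -max_offset), max_offset)
    # Rescale to [0, num_bins - 1] and round to the nearest bin.
    return round((offset + max_offset) / (2 * max_offset) * (num_bins - 1))


def bin_to_coord(bin_index, anchor, num_bins=513, max_offset=1.0):
    """Decode a bin index back into a continuous coordinate."""
    offset = bin_index / (num_bins - 1) * 2 * max_offset - max_offset
    return anchor + offset


if __name__ == "__main__":
    anchor_x = 0.5   # normalized anchor position (assumed)
    target_x = 0.75  # normalized polygon vertex coordinate (assumed)
    token_id = offset_to_bin(target_x, anchor_x)
    print(token_id, bin_to_coord(token_id, anchor_x))
```

In this variant the language model would emit `token_id` as an ordinary vocabulary token trained with cross-entropy, whereas the first option would have it emit the digits of the floating-point value itself.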