Closed by nkaenzig 1 month ago
We've found in experiments that also passing the original image to the decoder, alongside the embeddings, can boost performance significantly, since fine-grained information about object outlines might be lost in the embedding space produced by vision foundation models (FMs).
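To make the idea concrete, here is a minimal PyTorch sketch of a decoder that takes the original image in addition to the FM embeddings. This is not the actual implementation: the class name `ImageAwareDecoder`, the layer sizes, and the fusion by channel concatenation are all assumptions chosen for illustration, and the embeddings are assumed to already be reshaped into a `(B, C, h, w)` patch grid.

```python
import torch
import torch.nn.functional as F
from torch import nn


class ImageAwareDecoder(nn.Module):
    """Hypothetical decoder that fuses FM patch embeddings with the raw image."""

    def __init__(self, embed_dim: int, num_classes: int,
                 image_channels: int = 3, hidden_dim: int = 32):
        super().__init__()
        # 1x1 conv to project the coarse patch embeddings to a smaller width.
        self.embed_proj = nn.Conv2d(embed_dim, hidden_dim, kernel_size=1)
        # Shallow conv stem on the raw pixels, meant to keep outline detail.
        self.image_stem = nn.Sequential(
            nn.Conv2d(image_channels, hidden_dim, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Fuse both streams and predict per-pixel class logits.
        self.head = nn.Sequential(
            nn.Conv2d(2 * hidden_dim, hidden_dim, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden_dim, num_classes, kernel_size=1),
        )

    def forward(self, patch_embeddings: torch.Tensor, image: torch.Tensor) -> torch.Tensor:
        # patch_embeddings: (B, embed_dim, h, w); image: (B, image_channels, H, W)
        feats = self.embed_proj(patch_embeddings)
        # Upsample the embedding grid to the image resolution before fusing.
        feats = F.interpolate(feats, size=image.shape[-2:],
                              mode="bilinear", align_corners=False)
        fused = torch.cat([feats, self.image_stem(image)], dim=1)
        return self.head(fused)


# Example with random tensors standing in for real FM outputs and images.
decoder = ImageAwareDecoder(embed_dim=384, num_classes=5)
embeddings = torch.randn(2, 384, 4, 4)   # e.g. a 4x4 patch grid
image = torch.randn(2, 3, 64, 64)
logits = decoder(embeddings, image)      # shape: (2, 5, 64, 64)
```

Concatenating at full image resolution is only one possible design; a real decoder might instead inject image features at several scales or use a lighter stem to limit memory use.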