loseevaya opened this issue 1 year ago (status: Open)
Nice work! When selecting tokens from the encoder output, the output dimension of class_embedding is 91, which includes a "no object" category. Will selecting tokens this way affect the results?
We use focal loss, so there is no "no object" class. You can also view it as multiple independent binary classifications, one per class.
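To illustrate the "multiple binary classifications" view, here is a minimal sketch in plain Python. It is not the repository's code: the function name `select_topk_tokens` and the toy logits are hypothetical, and it assumes each token is scored by its highest per-class sigmoid probability. A token with no object simply has low scores for every class, so no dedicated background column is needed.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function; maps a logit to an independent per-class probability."""
    return 1.0 / (1.0 + math.exp(-x))


def select_topk_tokens(class_logits, k):
    """Return the indices of the k tokens with the highest per-class score.

    class_logits: one list of logits per token, one logit per class.
    Hypothetical helper for illustration only.
    """
    # Each class is its own binary decision, so a token's score is the
    # probability of its most confident class (no softmax, no background).
    scores = [max(sigmoid(z) for z in token) for token in class_logits]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Toy example: 4 encoder tokens, 3 classes (made-up numbers).
logits = [
    [-4.0, -3.5, -5.0],  # every class is unlikely: effectively "no object"
    [2.0, -1.0, -3.0],
    [-2.0, 0.5, -1.0],
    [-3.0, -2.0, 3.0],
]
print(select_topk_tokens(logits, k=2))  # → [3, 1]
```

In this sketch the first token is never selected ahead of the others, because all of its per-class probabilities are low; that is how "no object" is expressed without a separate class.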