Open jihwanp opened 2 years ago
Hi,
Great question! I haven't experimented with it much (I always prefer sigmoid CE for a large vocabulary). In my preliminary runs, federated loss gave a smaller improvement with softmax CE than with sigmoid CE. However, I have heard from a labmate that federated loss can perform similarly with softmax CE under proper settings. I'll get back to you later.
Best, Xingyi
Hi, is there any ablation that uses softmax CE with federated loss? CenterNet2 and Detic demonstrated that federated loss is crucial for long-tail distributed datasets like LVIS, but this sampling strategy has only been used with sigmoid CE. I found that you've implemented federated loss for the softmax CE version as well, so I wonder how large the gap between sigmoid CE and softmax CE is. Thanks
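
For readers following the thread, here is a minimal pure-Python sketch of the two variants being compared. It is not the code from this repo. The function names (`sample_fed_classes`, `fed_sigmoid_ce`, `fed_softmax_ce`), the square-root frequency weighting, and the choice to apply the subset to softmax by normalizing only over the sampled classes are all assumptions made for illustration. The background class is left out for brevity.

```python
import math
import random


def sample_fed_classes(gt_classes, class_freq, num_sample, rng):
    """Return the ground-truth classes plus sampled negatives (hypothetical helper).

    Negatives are drawn without replacement, weighted by sqrt(frequency).
    Assumes every entry of class_freq is positive.
    """
    chosen = set(gt_classes)
    candidates = [c for c in range(len(class_freq)) if c not in chosen]
    weights = [class_freq[c] ** 0.5 for c in candidates]
    while len(chosen) < num_sample and candidates:
        idx = rng.choices(range(len(candidates)), weights=weights, k=1)[0]
        chosen.add(candidates.pop(idx))
        weights.pop(idx)
    return sorted(chosen)


def fed_sigmoid_ce(logits, gt_class, fed_classes):
    """Sum of per-class binary cross-entropy, restricted to the sampled classes."""
    loss = 0.0
    for c in fed_classes:
        target = 1.0 if c == gt_class else 0.0
        x = logits[c]
        # Numerically stable BCE-with-logits for a single class.
        loss += max(x, 0.0) - x * target + math.log1p(math.exp(-abs(x)))
    return loss


def fed_softmax_ce(logits, gt_class, fed_classes):
    """Softmax cross-entropy normalized only over the sampled classes.

    Assumes gt_class is contained in fed_classes.
    """
    sub = [logits[c] for c in fed_classes]
    m = max(sub)
    log_z = m + math.log(sum(math.exp(v - m) for v in sub))
    return log_z - logits[gt_class]


if __name__ == "__main__":
    rng = random.Random(0)
    class_freq = [10, 1, 5, 100, 3]       # made-up per-class image counts
    logits = [0.2, -1.0, 1.5, 0.3, -0.4]  # made-up scores for one box
    fed = sample_fed_classes([2], class_freq, num_sample=3, rng=rng)
    print(fed, fed_sigmoid_ce(logits, 2, fed), fed_softmax_ce(logits, 2, fed))
```

In the sigmoid version, classes outside the sampled subset contribute nothing to the loss. In this softmax sketch they are dropped from the normalizer instead, which is only one of several ways the subset could be applied to a softmax.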