MarginLoss different from the paper

Hi, I greatly appreciate your code and your research work! I tried your code and found that it significantly improved open-set accuracy on multiple datasets, but I am confused by a difference between the MarginLoss in the codebase and the MarginLoss in the paper. The code uses "x_m = x - self.m * self.s", where self.m = 0.2 and self.s = 10. As far as I can tell, this accelerates learning of the seen classes instead of slowing it down to wait for the unseen classes to cluster during the beginning epochs. From my understanding it should be "x_m = x + self.m * self.s". Am I right about this point?
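To make the question concrete, here is a minimal sketch in plain Python of what each sign does to the cross-entropy loss for a single sample. It is not the repository's implementation: the helper name `cross_entropy_with_margin` and the example logits are hypothetical, and it assumes the margin is applied only to the logit of the target (seen) class. Only m = 0.2 and s = 10 come from the code quoted above.

```python
import math

def cross_entropy_with_margin(logits, target, margin):
    """Cross-entropy after shifting only the target-class logit by `margin`.

    Returns the loss and its gradient with respect to the target logit.
    Hypothetical helper for illustration, not the repository's MarginLoss.
    """
    shifted = list(logits)
    shifted[target] += margin
    peak = max(shifted)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in shifted]
    p_target = exps[target] / sum(exps)
    loss = -math.log(p_target)
    grad_target = p_target - 1.0  # d(loss)/d(logit[target]) for softmax cross-entropy
    return loss, grad_target

m, s = 0.2, 10           # values quoted from the code
logits = [3.0, 1.0, 0.0]  # assumed example logits, target class is index 0
target = 0

loss_minus, grad_minus = cross_entropy_with_margin(logits, target, -m * s)  # x - m * s
loss_plain, grad_plain = cross_entropy_with_margin(logits, target, 0.0)     # no margin
loss_plus, grad_plus = cross_entropy_with_margin(logits, target, +m * s)    # x + m * s

for name, loss, grad in [("x - m*s", loss_minus, grad_minus),
                         ("no margin", loss_plain, grad_plain),
                         ("x + m*s", loss_plus, grad_plus)]:
    print(f"{name:>10}: loss={loss:.4f}, grad on target logit={grad:.4f}")
```

Under these assumptions, subtracting the margin gives a larger loss and a larger gradient magnitude on the target logit than the unmodified logits, while adding the margin gives smaller ones. This is the behavior I am referring to above.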

snap-stanford / orca, issue #12