WilliamKRobert closed this issue 3 years ago
Hey, generally speaking, I would expect supervised methods to do better on a test set drawn from a distribution similar to that of the labeled data. This also seems to be the case in the paper by Dai et al., who compare against supervised methods.
The goal of our work is to have something that works in the absence of labels, because as you scale up the instances, obtaining labels becomes very expensive for combinatorial optimization problems.
Thanks for your clarification!
Hi, thank you for sharing your wonderful work! The formulation of the loss function described in your paper looks great. Assuming I do have ground-truth labels at training time for the maximum clique problem, does your loss function outperform the more common cross-entropy loss? Should I consider using your loss function even if I have ground-truth labels?