adamoyoung closed this 4 years ago
Corrected the loss function to make it consistent with the paper: it now computes the log of the sum of the probabilities over the node orderings, instead of the log of their product.
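To illustrate the difference, here is a minimal sketch, not the code from this change. It assumes the model yields one log-probability per node ordering; the function names and the example values are hypothetical. Summing log-probabilities gives the log of the product, whereas the log of the sum needs a log-sum-exp, computed here in a numerically stable way.

```python
import math


def log_sum_of_probs(log_probs):
    """Return log(sum_i p_i) given log p_i, using the stable log-sum-exp trick."""
    m = max(log_probs)
    if m == float("-inf"):
        # Every ordering has probability zero.
        return m
    return m + math.log(sum(math.exp(lp - m) for lp in log_probs))


def log_product_of_probs(log_probs):
    """Return log(prod_i p_i), which is what a plain sum of log-probs computes."""
    return sum(log_probs)


# Hypothetical log-probabilities of one graph under three node orderings.
ordering_log_probs = [math.log(0.1), math.log(0.2), math.log(0.3)]

# Negative log of the summed probabilities: -log(0.1 + 0.2 + 0.3)
loss_sum = -log_sum_of_probs(ordering_log_probs)

# Negative log of the product: -log(0.1 * 0.2 * 0.3)
loss_prod = -log_product_of_probs(ordering_log_probs)
```

In a PyTorch codebase the same quantity would typically be obtained with `torch.logsumexp` over the orderings dimension rather than a hand-written helper; whether this repository does so is not stated here.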