Closed: lchiaying closed this issue 3 years ago.
This behavior is intended -- negative examples are generated either by picking two random graphs, or by taking a positive example and perturbing it, e.g. by adding edges, so that it is likely no longer a subgraph. These manipulations result in two graphs that are unlikely (but not guaranteed) to be subgraph isomorphic, and as long as most negative examples are correct, the model is still able to learn an accurate embedding space in practice. It could be good practice to confirm that most examples are true negatives on a given dataset by running exact subgraph isomorphism tests on the generated examples; testing on the imbalanced data sources can also help validate the model, since those examples are generated with exact subgraph isomorphism tests.
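For what it's worth, a check along these lines would give the false-negative rate over a batch of generated negatives. This is just a sketch: the function name and the (target, query) pair format are placeholders, not the repo's API.

```python
from networkx.algorithms import isomorphism

def false_negative_rate(neg_pairs):
    """neg_pairs: list of (target, query) networkx graphs labeled as negatives."""
    n_false = 0
    for target, query in neg_pairs:
        # subgraph_is_isomorphic() asks whether query is isomorphic to a
        # (node-induced) subgraph of target.
        if isomorphism.GraphMatcher(target, query).subgraph_is_isomorphic():
            n_false += 1  # labeled negative, but query actually is a subgraph
    return n_false / max(len(neg_pairs), 1)
```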
I have been noticing empirically that the percentage of incorrect negative pairs is not small at all; in fact, all of the hard negative examples are actually subgraphs. I think line 151 is the culprit: the condition is never satisfied, so the perturbation is never executed. Shouldn't it be
if use_hard_neg or train
instead of
if use_hard_neg and train
?
A related question: adding edges doesn't guarantee that the new graph isn't a subgraph, since it could still be isomorphic to a subgraph induced by a different set of nodes, right?
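For example, here is a toy case I put together with networkx that shows what I mean (the graphs are made up, not from the dataset):

```python
# Perturbing a sampled positive by adding an edge does not always break
# subgraph isomorphism, because the perturbed query can match a different
# set of target nodes.
import networkx as nx
from networkx.algorithms import isomorphism

target = nx.Graph([(0, 1), (1, 2), (2, 0), (2, 3), (3, 4)])  # contains a triangle
query = nx.path_graph(3)        # sampled positive: a 3-node path, a subgraph of target

query_perturbed = query.copy()
query_perturbed.add_edge(0, 2)  # "hard negative" perturbation: the path becomes a triangle

print(isomorphism.GraphMatcher(target, query_perturbed).subgraph_is_isomorphic())
# True -- the perturbed query still matches the triangle {0, 1, 2} in the target
```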
Line 151 is intended as is: hard negatives are only used in training, not in val/test. I'm wondering how you are evaluating whether the two graphs are subgraph isomorphic? In the node-anchored setting, the query may be a subgraph of the target when the anchor is ignored, but not when the anchor mapping is enforced. When running python3 -m subgraph_matching.train --node_anchored (i.e. default settings with the synthetic dataset) on my end, I'm getting use_hard_neg equal to True for some negative pairs (only a random subset are chosen to be hard negatives), and for those negatives, computing subgraph isomorphism between graph and neigh returns False 80-90% of the time. Let me know if I missed something or if you're trying a different setting/evaluation!
Yes, the strategies are heuristics and may not guarantee that the new graph is not a subgraph, so I agree that testing empirically is a good check.
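If it helps, for the node-anchored setting the exact check should only count matches that align the two anchors. Something like the sketch below would do it; the function name and the enumeration approach are mine rather than code from the repo, and enumerating matches can be slow on large graphs.

```python
from networkx.algorithms import isomorphism

def anchored_subgraph_is_isomorphic(target, query, target_anchor, query_anchor):
    """Exact check that only accepts matches aligning the two anchor nodes."""
    matcher = isomorphism.GraphMatcher(target, query)
    # subgraph_isomorphisms_iter() yields dicts mapping matched target nodes to
    # query nodes; accept only mappings that send the target anchor to the
    # query anchor.
    return any(mapping.get(target_anchor) == query_anchor
               for mapping in matcher.subgraph_isomorphisms_iter())
```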
Thanks for clarifying that hard negatives are intended to be used only for training. I was running the test.py module through the debugger and inspecting test_pts, which is what led me to the questions I asked above.
My first attempt at test.py gave me about 60% of negatives being false negatives (GraphMatcher.subgraph_is_isomorphic returns True). I realized this was because, even in test mode, Line 181 chose half the indices to be hard negatives, Line 184 set each of those initial query graphs to be exactly the target graph, and then sample_subgraph sampled a subgraph from that initial query graph but never executed the code that perturbs it into a non-subgraph of the target graph. That accounts for 50% of the negatives being false negatives; the easy negatives in the other half account for the remaining ~10%.
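To spell out the flow I stepped through, here is my paraphrase of it. This is not the literal data.py code and the helper names are approximate; I am only trying to show why test-mode hard negatives never get perturbed.

```python
import random

def sample_negative(target, train, sample_subgraph, random_graph, perturb):
    # Roughly half of the negatives are flagged as hard negatives (cf. line 181).
    use_hard_neg = random.random() < 0.5
    if use_hard_neg:
        query = target          # hard negative: start from the target itself (cf. line 184)
    else:
        query = random_graph()  # easy negative: start from an unrelated random graph
    query = sample_subgraph(query)
    if use_hard_neg and train:  # the condition discussed above (line 151)
        query = perturb(query)  # the perturbation into a likely non-subgraph only
                                # runs during training, so in test mode the hard
                                # "negative" stays a true subgraph of the target
    return target, query
```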
Ah gotcha, thanks for investigating! Hope to better document the behavior of hard negatives and revamp that section of code to make it more clear.
When running "syn" dataset, I have been getting negative pairs in test_pts that are actually positive pairs. If I'm not mistaken, in line 165 to 169 of data.py, is where the sampled negative query are actually checked whether it forms a negative pair with the corresponding target. However, it seems that the "if" condition in line 165 is never triggered, so the sampled query is never actually checked to ensure it is not a subgraph of target graph.
Is that really the case, or am I missing something?
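For reference, this is the kind of guard I was expecting to see applied to each sampled negative; it is just an illustration of the idea, not the actual code at those lines, and the sampler argument is a placeholder.

```python
from networkx.algorithms import isomorphism

def sample_verified_negative(target, sample_candidate, max_tries=20):
    """Resample until an exact test confirms the candidate is NOT a subgraph of target.

    sample_candidate: a callable returning a candidate query graph (placeholder here).
    """
    for _ in range(max_tries):
        query = sample_candidate()
        if not isomorphism.GraphMatcher(target, query).subgraph_is_isomorphic():
            return query  # verified true negative
    raise RuntimeError("could not sample a non-subgraph query for this target")
```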