PonderLY / PC-GNN

(WWW 2021) Source code of PC-GNN

On the inconsistency of sampling methods between training stage and inference stage #10

Open Chenrj24 opened 1 year ago

Chenrj24 commented 1 year ago

Hello, author. I noticed that during training you oversample fraud nodes by selecting nodes whose label matches the fraud label. This step requires the node labels. At inference time, however, node labels are not available, so fraud nodes cannot be oversampled in the same way. The sampling procedure at inference is therefore inconsistent with the one used in training, and the model sees a more idealized input during training than it does at inference. Doesn't this conflict with standard machine learning practice, and could it make the trained model unreliable?
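To make the concern concrete, here is a minimal sketch of the mismatch as I understand it. It is not the repository's code: `sample_neighbors`, `fraud_pool`, and the toy labels are hypothetical names chosen only for illustration, and the real sampler may work differently.

```python
import random

def sample_neighbors(node, neighbors, labels=None, fraud_pool=(), k=2, rng=None):
    """Hypothetical neighbor sampler, NOT the PC-GNN implementation.

    Training (labels given): if `node` is labeled fraud (1), append up to `k`
    extra fraud-labeled nodes drawn from `fraud_pool` (oversampling).
    Inference (labels is None): return the original neighbors unchanged.
    """
    rng = rng or random.Random(0)
    sampled = list(neighbors)
    if labels is not None and labels.get(node) == 1:
        # This branch is only reachable when ground-truth labels are known.
        candidates = [n for n in fraud_pool if n != node and n not in sampled]
        sampled += rng.sample(candidates, min(k, len(candidates)))
    return sampled

# Toy data (assumed): 1 = fraud, 0 = benign.
labels = {0: 1, 1: 0, 2: 1, 3: 1, 4: 0}
fraud_pool = [n for n, y in labels.items() if y == 1]
neighbors = [1, 4]

train_view = sample_neighbors(0, neighbors, labels, fraud_pool)  # labels known
infer_view = sample_neighbors(0, neighbors)                      # labels unknown
```

In this sketch, `train_view` contains the two extra fraud-labeled nodes while `infer_view` is just the original neighbors, so the same node is aggregated over a different neighborhood in the two stages.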

AnonymousDataCodeHub commented 9 months ago

I have the same question. Because of this, I don't think the comparison in the paper is fair when testing at the inference stage.