Question about result

Zoro1092000 commented 2 years ago

Hi, thanks for your repo. I have one question. When I run the model on the Chord, Leet, Debru, Kadem, and C2 datasets, the results match those described in the README file, but when I run it on the P2P dataset the model yields >= 99.1% (higher than the 98.692% F1-score you reported). I didn't change your settings; I just removed the fill value and changed the versions of some libraries so the code could run. In short, why does the model give significantly better results when I run it again? Thanks!

jzhou316 commented 2 years ago

Hi, thanks for your observation! The library does come with some small randomness, resulting from non-deterministic behavior in dependencies such as PyTorch (https://pytorch.org/docs/stable/notes/randomness.html) and pytorch_scatter (https://github.com/rusty1s/pytorch_scatter/issues/226). Depending on the platform and environment, this randomness can only be limited, not removed entirely. My guess is that there might also be some changes between the package versions that lead to the small differences. Also, do you observe the same number across different runs, or are there small variations between runs as well?
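One way to answer the run-to-run question is to repeat the evaluation several times and compare the spread of the scores with the reported number. The following is only a minimal sketch, not part of the repository: the helper names `summarize_runs` and `within_run_variation` are hypothetical, the scores in `example_runs` are placeholders rather than measured results, and the two-standard-deviation threshold is an arbitrary assumption.

```python
import statistics


def summarize_runs(f1_scores):
    """Return (mean, sample std, min, max) of F1 scores from repeated runs."""
    if len(f1_scores) < 2:
        raise ValueError("need at least two runs to estimate variation")
    return (
        statistics.mean(f1_scores),
        statistics.stdev(f1_scores),
        min(f1_scores),
        max(f1_scores),
    )


def within_run_variation(reported, f1_scores, num_std=2.0):
    """Check whether `reported` lies within num_std sample standard
    deviations of the mean of the repeated runs (threshold is an assumption)."""
    mean, std, _, _ = summarize_runs(f1_scores)
    return abs(reported - mean) <= num_std * std


# Hypothetical placeholder scores, NOT measured results; replace them with
# the F1 scores (in percent) from your own repeated runs.
example_runs = [99.10, 99.15, 99.05, 99.12]

mean, std, lowest, highest = summarize_runs(example_runs)
print(f"mean={mean:.3f} std={std:.3f} min={lowest:.2f} max={highest:.2f}")

# 98.692 is the F1-score quoted in the question above.
print("reported score within run-to-run spread:",
      within_run_variation(98.692, example_runs))
```

If the reported score falls well outside the spread of your own runs, seed-level randomness alone is an unlikely explanation, and a change in package versions becomes the more plausible cause.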

Zoro1092000 commented 2 years ago

I guess the same as you. Thanks a lot!

harvardnlp / botnet-detection

Question about result #24