williamleif / GraphSAGE

Representation learning on large graphs using stochastic graph convolutions.

problem concerning ppi_eval.py #55

Open shuaishuaij opened 5 years ago

shuaishuaij commented 5 years ago

Hi, I downloaded your code and tried the unsupervised training. When I run ppi_eval.py, strictly following the instructions, it outputs F1 scores far different from the scores reported in your paper. For example, some of them look like this:

F1 score 0.6524257784214338
F1 score 0.7677407675597393
F1 score 0.7941708906589428
F1 score 0.7668356263577119
F1 score 0.8696596669080376
F1 score 0.8081100651701665
F1 score 0.7624909485879797

The only thing I changed in the code is replacing dict.iteritems() with dict.items(), which I don't think should be the real problem. Is something wrong on my side? Are the scores in your paper "Micro F1" or "Macro F1"?

default parameters + mean aggregator + unsupervised training on the PPI dataset (not the toy version)
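For reference, the dict.iteritems() change is just the usual Python 2 to Python 3 port and shouldn't change any numbers; a minimal sketch of what I mean (degrees is a made-up example dict, not a variable from the repo):

```python
degrees = {"node_a": 3, "node_b": 5}   # hypothetical example dict

# Python 2: iteritems() returned a lazy iterator over (key, value) pairs.
# for node, deg in degrees.iteritems():
#     print(node, deg)

# Python 3: iteritems() is gone; items() returns a view that behaves the same in a loop.
for node, deg in degrees.items():
    print(node, deg)
```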

RexYing commented 5 years ago

Does that mean you obtained a higher F1 than what we reported in the paper? We only got ~0.6. I'm not too sure what's going on; it could be due to a different train/test split. We use Micro F1.

That said, some follow-up papers (FastGCN, GAT, etc.) seem to get significantly higher results on PPI (0.9).

michibertoldi commented 5 years ago


How did you succeed in running the unsupervised training? When I try, I get "Killed" after "Done loading training data". Thanks in advance.

shuaishuaij commented 5 years ago


Any more error messages? I think the "Killed" problem may not come from the model. Did you change any hyper-parameters?
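A bare "Killed" with no Python traceback usually means the OS out-of-memory killer stopped the process, since the full (non-toy) PPI graph plus the random-walk pairs can take a lot of RAM. If that is what's happening, a rough standard-library check of peak memory around the data loading might look like this (load_ppi_data is a placeholder for whatever loading call you use, not a function from this repo):

```python
# Rough peak-memory check around data loading (Unix only; not code from this repo).
import resource

def print_peak_rss(tag):
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    print("{}: peak RSS ~{:.0f} MB (Linux units)".format(tag, peak / 1024.0))

print_peak_rss("before loading")
data = load_ppi_data()   # placeholder for your actual data-loading call
print_peak_rss("after loading")
```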

Haicang commented 4 years ago


From the history of that file, I see they used one line of code to compute the micro F1 score. Compared with implementations of other graph embedding methods, I think that earlier code computes the metric the way people usually do.

But I'm not sure how I should compute the F1 score for PPI, which is a multi-label task.
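For what it's worth, here is a minimal sketch of how a single micro F1 is typically computed for a multi-label target like PPI with scikit-learn (an illustration, not the exact line from the repo's history; the y_true/y_pred arrays are made up):

```python
import numpy as np
from sklearn.metrics import f1_score

# Made-up multi-label indicator matrices: one row per node, one column per label.
y_true = np.array([[1, 0, 1],
                   [0, 1, 1],
                   [1, 1, 0]])
y_pred = np.array([[1, 0, 0],
                   [0, 1, 1],
                   [1, 0, 0]])

# Micro averaging pools every (node, label) decision into one confusion matrix,
# yielding a single score for the whole test set.
print(f1_score(y_true, y_pred, average="micro"))
```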

zeou1 commented 3 years ago


I am also getting much higher F1 scores for each class, and I am unsure how to calculate the single F1 score for PPI as presented in Table 1 of the paper.
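In case it helps, the per-class scores and the single number in Table 1 come from different averaging choices; a small sketch of the difference (with made-up predictions, assuming scikit-learn is used for the metric):

```python
import numpy as np
from sklearn.metrics import f1_score

# Made-up multi-label predictions: rows are nodes, columns are classes.
y_true = np.array([[1, 0, 1],
                   [0, 1, 1],
                   [1, 1, 0],
                   [1, 0, 0]])
y_pred = np.array([[1, 0, 0],
                   [0, 1, 1],
                   [1, 0, 0],
                   [1, 1, 0]])

per_class = f1_score(y_true, y_pred, average=None)        # one F1 per class
single_micro = f1_score(y_true, y_pred, average="micro")  # one pooled score

print("per-class:", per_class)
print("micro:", single_micro)
# Note: the micro score is not the plain mean of the per-class scores;
# it pools all true/false positives and negatives across classes first.
```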