wyy-code / PipEA

Paper for 2024

iterative strategy #2

Closed JUN000317 closed 6 days ago

JUN000317 commented 1 month ago

Hello, after using your iterative strategy, the performance of my model has decreased, and I am unable to achieve the results mentioned in the paper. Could you please help me identify where the issue might be? I would greatly appreciate your assistance. Wishing you good health and success in your work.

wyy-code commented 1 month ago

Thank you for your interest in our work and for your efforts in replicating the results. I have re-run the code from the GitHub repository on an A6000 GPU and observed that the H@10 metric did drop in some models on the 100K dataset. This accuracy drop might be due to over-smoothing of the embeddings, as discussed in the paper. When the overall number of seeds is high, multiple iterations of the training strategy can lead to redundant propagation of seed priors. Meanwhile, the seed setting and hardware variations can significantly impact results, which may explain the issue.
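To illustrate the over-smoothing effect mentioned above, here is a toy numpy sketch (not the PipEA code): repeatedly applying a row-normalized propagation matrix pulls every node's embedding toward a common vector, so the per-dimension variance across nodes collapses.

```python
import numpy as np

# Toy demonstration of over-smoothing: many rounds of propagation
# with a row-normalized adjacency matrix drive all node embeddings
# toward a common vector. All names and values here are illustrative.
rng = np.random.default_rng(0)

n = 50
A = (rng.random((n, n)) < 0.2).astype(float)   # random undirected graph
A = np.maximum(A, A.T) + np.eye(n)             # symmetrize, add self-loops
P = A / A.sum(axis=1, keepdims=True)           # row-normalized propagation matrix

X = rng.standard_normal((n, 16))               # initial node embeddings

def row_variance(X):
    """Mean per-dimension variance across nodes; shrinks as rows converge."""
    return X.var(axis=0).mean()

v0 = row_variance(X)
for _ in range(20):                            # 20 propagation rounds
    X = P @ X
v20 = row_variance(X)

print(v0, v20)  # variance after propagation is far smaller
```

The same intuition applies to repeated seed-prior propagation: each round averages in more neighborhood information, and past some point the embeddings lose the discriminative detail alignment depends on.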

A potential solution is to exclude the potential isomorphism propagation from the iterative training strategy and instead apply Pip after training the model and obtaining the embeddings. Additionally, the encoder choice is crucial; PipEA may not be suitable for certain models.
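A minimal sketch of that suggestion, assuming Pip can be approximated as a one-shot blend of the trained embeddings with their propagated versions (the function name, `alpha` parameter, and propagation matrices below are illustrative, not the repository's API):

```python
import numpy as np

def apply_pip_post_hoc(emb1, emb2, P1, P2, alpha=0.5):
    """Hypothetical post-hoc Pip: propagate each graph's trained
    embeddings once with its row-normalized propagation matrix and
    blend with the originals, instead of iterating during training."""
    emb1_p = alpha * emb1 + (1 - alpha) * P1 @ emb1
    emb2_p = alpha * emb2 + (1 - alpha) * P2 @ emb2
    return emb1_p, emb2_p

# Toy usage with stand-in propagation matrices and random embeddings.
rng = np.random.default_rng(1)
n, d = 6, 4
P1 = np.full((n, n), 1.0 / n)
P2 = np.full((n, n), 1.0 / n)
e1 = rng.standard_normal((n, d))
e2 = rng.standard_normal((n, d))

r1, r2 = apply_pip_post_hoc(e1, e2, P1, P2)
sim = r1 @ r2.T   # cross-graph alignment scores after refinement
```

Applying the propagation once, after training, keeps the seed prior from being re-propagated every iteration, which is the redundancy suspected above.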

JUN000317 commented 1 month ago

Thank you for taking the time out of your busy schedule to respond. To clarify my situation: I attempted to replicate the results from your paper on the DW15K and ENDE15K datasets, but during the second iteration the performance metrics such as @., @., and MRR all declined. For example, on DW15K the @.*** metric was 75% without the iterative strategy but dropped to 69% after applying it. Could you please advise on what might be causing this issue?

Additionally, in your previous reply you suggested excluding the potential isomorphism propagation from the iterative training strategy and instead applying Pip after training the model and obtaining the embeddings. I will try that solution.

Once again, thank you for your help. Your insights have been invaluable to me. I wish you good health and continued success in your work.

Best regards,


wyy-code commented 2 weeks ago

Apologies for the delayed response, and thank you for your efforts in testing. I suspect the following could be contributing factors:

  1. Convergence: The model may not be converging well in the early iterations, which can leave large prediction errors that the iterative strategy then propagates. I recommend increasing the number of epochs in the first round to ensure a better fit.

  2. Irrelevant information from the other graph: As I mentioned before, massive Pip can cause non-starved entities in the other graph to accumulate too much irrelevant cross-graph neighbor information. Separating the Pip step from the iterative strategy and applying it afterwards may mitigate this issue.

  3. Refinement strategy convergence: Since train_pair is updated continuously under the iterative strategy, the initial values fed to the refinement strategy also keep changing. Because this initial distribution shifts across iterations, the refinement epoch count set in the file may be insufficient for convergence. You might want to increase the k parameter in refina() to allow more refinement iterations.
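To make point 3 concrete, here is a generic sketch of what a larger k buys, assuming refina() implements a neighbor-consistency matrix-refinement loop of this general shape (the function below is a hypothetical stand-in, not the repository's implementation):

```python
import numpy as np

def refine_alignment(M, A1, A2, k=10, eps=1e-8):
    """Hypothetical matrix-refinement loop: each of the k iterations
    reinforces alignment scores whose neighbors also align well, then
    row-normalizes. When the initial M keeps changing between training
    iterations, a larger k gives the loop more room to converge."""
    for _ in range(k):
        M = M * (A1 @ M @ A2.T + eps)                  # neighbor-consistency update
        M = M / (M.sum(axis=1, keepdims=True) + eps)   # row-normalize scores
    return M

# Toy usage: align two identical 4-node path graphs from a
# completely uninformative initial alignment matrix.
A = np.zeros((4, 4))
for i in range(3):
    A[i, i + 1] = A[i + 1, i] = 1.0

M0 = np.full((4, 4), 0.25)
M = refine_alignment(M0, A, A, k=20)
```

If doubling k changes the final alignment matrix noticeably, the refinement had not yet converged at the smaller setting, which matches the diagnosis above.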

I hope this answer helps!

Best regards,

JUN000317 commented 1 week ago

Thank you for your reply. I tried the method you provided, and it was very helpful to me. Thank you very much!
