wgc-research opened 5 months ago
I also encountered the same problem. I tried gpt-3.5-turbo-1106 / gpt-3.5-turbo-0125 and found issues with the returned results in many cases. For example, GPT-3.5 replies:
- It seems that the provided relations do not directly contribute to the given questions. If you have any other specific relations or questions in mind, please feel free to share them, and I'd be happy to assist further.
- It seems that the knowledge triplets for the question about Helen Keller's school are missing. Please provide the relevant knowledge triplets for Helen Keller's school, and I will be happy to help answer the question.
But if I use gpt-4-1106-preview / gpt-4-0125-preview, the results are much better, though I still can't reproduce the original metrics on WebQSP and CWQ.
Hi, it is expected that no score is shown for relation_paths, because we have not implemented the code to print the score in the main file; you can print it out manually. Regarding the gap between your reproduction results and those of the original paper, please see this reply: https://github.com/IDEA-FinAI/ToG/issues/12#issuecomment-1961312846
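For anyone who wants to print the scores manually, here is a minimal sketch. It assumes the model reply follows a "{relation (Score: x)}" format like the one used in ToG's pruning prompts; the helper name is hypothetical, and you will need to adjust the regex to whatever format your actual prompt requests:

```python
import re

def extract_relation_scores(response: str) -> dict:
    """Parse '{relation (Score: 0.6)}'-style entries from an LLM reply.

    Hypothetical helper: the exact output format depends on the prompt
    used in ToG's relation pruning step; adapt the regex accordingly.
    Returns an empty dict when nothing matches (e.g. "I'm sorry" replies).
    """
    pattern = re.compile(r"\{?\s*([\w./]+)\s*\(Score:\s*([0-9.]+)\)\}?")
    return {rel: float(score) for rel, score in pattern.findall(response)}

reply = ("1. {people.person.education (Score: 0.6)}\n"
         "2. {people.person.places_lived (Score: 0.4)}")
print(extract_relation_scores(reply))
```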
How can I get the relation score? In my reproduction, when the prompt in the code is given to GPT-3.5, it only replies with something like "I'm sorry" without any useful information. Has the prompt design been updated?
I've had the same problem: the majority of responses are "I'm sorry" or "there is no specific information available to answer the question...". No final answer is generated.
Hi, did you install and index Freebase or Wikidata correctly? It seems like there are a lot of unknown-name entities in the KG. Which model and which dataset did you run?
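One way to sanity-check the Freebase setup is to look up the English name of a well-known entity directly against the SPARQL endpoint. This is a hedged sketch, not code from the repo: it assumes a local Virtuoso endpoint at http://localhost:8890/sparql (adjust to your SPARQLPATH) and that m.0d05w3 is China's Freebase MID:

```python
import json
import urllib.parse
import urllib.request

SPARQLPATH = "http://localhost:8890/sparql"  # assumed local Virtuoso endpoint

def build_name_query(mid: str) -> str:
    """Build a SPARQL query for the English name of a Freebase MID.

    If this returns no rows for well-known entities, the dump is
    probably not loaded or indexed correctly, which would explain the
    unknown-name entities mentioned above.
    """
    return (
        "PREFIX ns: <http://rdf.freebase.com/ns/> "
        "SELECT ?name WHERE { "
        f"ns:{mid} ns:type.object.name ?name . "
        "FILTER (lang(?name) = \"en\") }"
    )

def run_query(query: str) -> list:
    """Send the query to the endpoint and return the JSON result bindings."""
    url = SPARQLPATH + "?" + urllib.parse.urlencode(
        {"query": query, "format": "application/sparql-results+json"})
    with urllib.request.urlopen(url, timeout=10) as resp:
        data = json.load(resp)
    return data["results"]["bindings"]

# Usage (requires the endpoint to be up):
#   rows = run_query(build_name_query("m.0d05w3"))  # m.0d05w3 assumed = China
#   print(rows)  # an empty list suggests the dump is not loaded correctly
```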
Hi, I'm having the same problem with the GrailQA dataset. I'm positive I've installed and started Freebase correctly. The problem may be that there is only one in-context example in the publicly available code (one-shot, though it is reported as 5-shot in the paper). Could you please provide the full in-context examples? Thank you very much! @GasolSun36
Thanks for the great work! I have a problem when I run the function relation_search_prune in main_freebase.py: for many questions, GPT-3.5 doesn't give any feedback about the scores of relation_paths, so the run often finishes with "depth 1 still does not find the answer." ToG then stops at depth 1. Is this a common situation?