IDEA-FinAI / ToG

This is the official GitHub repo of Think-on-Graph. If you are interested in our work or would like to join our research team in Shenzhen, please feel free to contact us by email (xuchengjin@idea.edu.cn).

results file for datasets #12

Open alomrani opened 5 months ago

alomrani commented 5 months ago

Hi there,

After running your code, I am getting 67 EM for WebQSP with gpt-3.5-turbo compared to 76 EM reported in the paper. I was wondering if you can share your results file for comparison.

Thanks, Mohammad

liyichen-cly commented 5 months ago

Hi,

I am experiencing a similar issue, with my results hovering around 0.69 for WebQSP and 0.37 for CWQ. I would greatly appreciate it if the authors could provide some insight into the challenges of reproducing the results.

Best regards, Liyi

zh-qifan commented 4 months ago

Hi,

I am also facing the same issue. I ran the CWQ experiment twice and got around 37% accuracy with gpt-3.5, compared to the 57.1% reported in the paper. Could you please provide some suggestions for reproducing the paper's results?

Best, Qifan

GasolSun36 commented 4 months ago

Hi, sorry for the late reply. We did not save the previous results, but here are some tips for reproducing the results in the paper:

  1. The current version of the eval.py file has some problems, and we will fix them as soon as possible.
  2. The ChatGPT model we used is gpt-3.5-turbo-0613, and performance may fluctuate slightly with the currently updated model.
  3. The CWQ test set is a file with aliases that we built; it will be updated later.
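
To make the first and third tips concrete, here is a minimal sketch of what an alias-aware exact-match (EM) check might look like. It is not the repository's eval.py: the function names (`normalize`, `exact_match`, `em_score`), the normalization rules, and the alias-table format (a dict from gold answer to a list of aliases) are all assumptions for illustration only.

```python
from __future__ import annotations

import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str],
                aliases: dict[str, list[str]]) -> bool:
    """True if the prediction equals any gold answer or any of its aliases.

    `aliases` is a hypothetical table: gold answer -> list of alternative names.
    """
    accepted = set()
    for gold in gold_answers:
        accepted.add(normalize(gold))
        for alias in aliases.get(gold, []):
            accepted.add(normalize(alias))
    return normalize(prediction) in accepted


def em_score(predictions: list[str], gold: list[list[str]],
             aliases: dict[str, list[str]]) -> float:
    """Fraction of questions whose prediction matches a gold answer or alias."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, g, aliases) for p, g in zip(predictions, gold))
    return hits / len(predictions)
```

With an alias table such as `{"United States of America": ["USA"]}`, a prediction of "the USA" counts as a hit, whereas scoring against the gold string alone would mark it wrong. Whether aliases are included is one way two EM numbers on the same predictions can differ.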

liyichen-cly commented 4 months ago

Thank you very much for your reply! I have already corrected the retrieval code and adjusted the version of ChatGPT. However, my experimental results did not improve much and are similar to the previous ones. I hope the alias file can be provided and the eval file can be corrected for reproduction as soon as possible.

Best, Liyi

willer-lu commented 3 months ago

Which version of GPT-4 did you use?

GasolSun36 commented 3 months ago

Hi,

We use gpt-4-0613 for all experimental settings.

yindahu87 commented 1 month ago

Hi, I have run into some difficulties while reproducing the code. Could you give me some pointers? Thanks.

youngsasa2021 commented 1 month ago

Hi, I have also run into some problems while reproducing the results. Could we discuss them together? Thank you very much.