teacherpeterpan / Logic-LLM

The project page for "LOGIC-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning"
MIT License
244 stars · 38 forks

About the reproduction. #4

Open jumptoliujj opened 10 months ago

jumptoliujj commented 10 months ago

We ran experiments on PrOntoQA and FOLIO, but the accuracy is only about 51%–53%. We ran logic_program.py and self_refinement.py, then logic_inference.py and evaluation.py. Is anything wrong with these steps? Please correct me if so.

RutaTang commented 7 months ago

Hi @jumptoliujj,

Did you figure it out? I am facing the same issue, and I am not sure where I went wrong.

jumptoliujj commented 7 months ago

> Hi @jumptoliujj,
>
> Did you figure it out? I am facing the same issue, and I am not sure where I went wrong.

Just use text-davinci-003 instead of gpt-3.5-turbo, and you will get results similar to those reported in the paper.
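For context (an assumption based on the OpenAI API, not on this repository's code): text-davinci-003 is a completions model while gpt-3.5-turbo is a chat model, so the two are called through different endpoints and prompts tuned for one often transfer poorly to the other. A minimal sketch of the distinction, using the legacy `openai<1.0` parameter shapes; `build_request` is a hypothetical helper, not part of this repo:

```python
def build_request(model: str, prompt: str) -> dict:
    """Return request parameters for the endpoint a given model expects.

    Chat models (gpt-3.5-turbo, gpt-4*) take a `messages` list; legacy
    completion models such as text-davinci-003 take a flat `prompt` string.
    """
    if model.startswith(("gpt-3.5", "gpt-4")):
        # Passed to openai.ChatCompletion.create(**params) in the legacy SDK
        return {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "temperature": 0.0,
        }
    # Passed to openai.Completion.create(**params) in the legacy SDK
    return {"model": model, "prompt": prompt, "temperature": 0.0}
```

Because the prompt is packaged differently for each endpoint, swapping the model name alone can change behavior even when the prompt text is identical.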

RutaTang commented 7 months ago

Thank you!

zhuang-li commented 4 months ago

I have the same issue. With GPT-3.5-turbo, the logic program only achieves 51.4 on ProntoQA, which is far from the result reported in the paper (61).

abhinandan12345678 commented 2 months ago

I tried to run the models as per the commands in the README, but it gives "Error in generating example". The changes I made in requirements.txt are:

- changed `certifi @ file:///croot/certifi_1671487769961/work/certifi` to `certifi==2022.12.7`, because the file URL was giving an error;
- removed `sklearn`, since that package name has been deprecated on PyPI.

Can anyone please help me with this issue?
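For reference, the two edits described above correspond to a requirements.txt change like this (note that if the code actually imports sklearn, the deprecated `sklearn` entry usually needs to be replaced with `scikit-learn` rather than removed outright):

```diff
-certifi @ file:///croot/certifi_1671487769961/work/certifi
+certifi==2022.12.7
-sklearn
```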

wernerolaf commented 4 weeks ago

I have problems with Pyke: it is not always deterministic, and there are also problems with its compiled-rule cache.