OFA-Sys / gsm8k-ScRel

Codes and Data for Scaling Relationship on Learning Mathematical Reasoning with Large Language Models
https://arxiv.org/abs/2308.01825
219 stars 16 forks

Release RFT datasets #4

Closed nuochenpku closed 1 year ago

nuochenpku commented 1 year ago

Could you please directly release the RFT datasets that contain the various reasoning paths?

GanjinZero commented 1 year ago

please see data/rft

nuochenpku commented 1 year ago

Got it, thanks for your quick response. I have another question: have you considered that an LLM could generate the correct answer while its intermediate steps are wrong? Do you filter out reasoning paths that fall into this situation?

GanjinZero commented 1 year ago

All paths with wrong calculations are removed using Python's eval function. If a path has incorrect reasoning but correct calculations and a correct final answer, I have no way to detect and remove it without a process-level reward model.
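
For readers who want to see what this kind of filter might look like, here is a minimal sketch. It is not the repository's actual code. It assumes each reasoning path uses GSM8K-style calculator annotations such as `<<48/2=24>>` and ends with a `#### <answer>` line; the names `calculations_are_correct`, `final_answer_matches`, and `filter_paths` are hypothetical.

```python
import re

# Assumed GSM8K-style calculator annotation, e.g. "<<48/2=24>>".
CALC_PATTERN = re.compile(r"<<([^<>=]+)=([^<>=]+)>>")
# Assumed final-answer marker, e.g. "#### 72".
ANSWER_PATTERN = re.compile(r"####\s*(-?\d[\d,]*\.?\d*)")
# Only plain arithmetic characters are passed to eval.
ALLOWED_CHARS = set("0123456789+-*/(). ")


def calculations_are_correct(path: str, tol: float = 1e-6) -> bool:
    """Return True if every annotated equation in the path evaluates correctly."""
    for expr, claimed in CALC_PATTERN.findall(path):
        if not set(expr) <= ALLOWED_CHARS or not set(claimed) <= ALLOWED_CHARS:
            return False
        try:
            # Evaluate with no builtins; a sketch, not a hardened sandbox.
            value = eval(expr, {"__builtins__": {}}, {})
            if abs(value - float(claimed)) > tol:
                return False
        except Exception:
            # Malformed expression, division by zero, non-numeric result, etc.
            return False
    return True


def final_answer_matches(path: str, gold: str, tol: float = 1e-6) -> bool:
    """Return True if the path's final answer equals the gold answer."""
    match = ANSWER_PATTERN.search(path)
    if match is None:
        return False
    predicted = float(match.group(1).replace(",", ""))
    return abs(predicted - float(gold.replace(",", ""))) <= tol


def filter_paths(paths: list[str], gold: str) -> list[str]:
    """Keep paths whose final answer and every annotated calculation are correct."""
    return [
        p for p in paths
        if final_answer_matches(p, gold) and calculations_are_correct(p)
    ]


good = "Half of 48 is <<48/2=24>>24. Total is <<48+24=72>>72.\n#### 72"
bad_calc = "Half of 48 is <<48/2=25>>25. Total is <<48+24=72>>72.\n#### 72"
wrong_answer = "Half of 48 is <<48/2=24>>24. Total is <<48-24=24>>24.\n#### 24"
# Wrong reasoning, but the arithmetic and the final answer both check out.
wrong_logic = "Double 36 to get <<36*2=72>>72.\n#### 72"

kept = filter_paths([good, bad_calc, wrong_answer, wrong_logic], "72")
```

In this sketch `kept` contains `good` and also `wrong_logic`: an arithmetic check cannot tell that the second path reasons about the wrong quantities, which is the limitation described above.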

nuochenpku commented 1 year ago

Thanks a lot!