Closed FanqingM closed 4 days ago
On the other hand, if I choose save_tree, I get the following from OmegaPRMv2: "question": "In rectangle $ABCD$, $AB=100$. Let $E$ be the midpoint of $\overline{AD}$. Given that line $AC$ and line $BE$ are perpendicular, find the greatest integer less than $AD$.\n", "final_answer": "141", "reasoning_steps": { "text": "In rectangle $ABCD$, $AB=100$. Let $E$ be the midpoint of $\overline{AD}$. Given that line $AC$ and line $BE$ are perpendicular, find the greatest integer less than $AD$.", "mc_value": 0.9375, "children": [ { "text": "xxx", "mc_value": 1.0, "children": [] },
How can I convert it into the input format that preprocess expects?
It seems that the input dataset for PRM training (Math-PSA) is openr/prm/code/test.json. How can I get this data? The data I generated with OmegaPRMv2 does not look like this.
For data generated by OmegaPRM_v2, two formats are available:
Flat Format (save_data_tree=False):
Each entry is structured as:
{ "solution_prefix": [Q, x_1:x_i], "mc_value": 0.5 }
where i is a variable representing the number of reasoning steps. This format provides a linear view of the reasoning process without hierarchical structure.
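Tree Format (save_data_tree=True):
In this format, data is organized as a tree structure, aligned with the figure presented in the paper. Each reasoning step (or node) includes:
text: the text of the reasoning step
mc_value: the Monte Carlo value estimated for that step
children: the list of child nodes that branch from it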
With these two formats, you should have the flexibility to preprocess the data in ways that best suit your custom training needs.
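As an illustration of working with the tree format, here is a minimal sketch that walks a tree and lists every node in depth-first order. It relies only on the field names visible in the sample output in this thread (text, mc_value, children). The helper name iter_nodes and the added depth field are hypothetical, and the toy tree is made up for the example. It does not assume whether text holds a single step or the cumulative reasoning.

```python
from typing import Any, Dict, Iterator


def iter_nodes(node: Dict[str, Any], depth: int = 0) -> Iterator[Dict[str, Any]]:
    """Yield every node of a save_data_tree=True tree in depth-first order.

    `depth` is a hypothetical extra field (0 for the root) added for convenience.
    """
    yield {"text": node["text"], "mc_value": node["mc_value"], "depth": depth}
    for child in node.get("children", []):
        yield from iter_nodes(child, depth + 1)


# Toy tree using the same keys as the sample output (values are made up).
tree = {
    "text": "Q",
    "mc_value": 0.9375,
    "children": [
        {"text": "step A", "mc_value": 1.0, "children": []},
        {
            "text": "step B",
            "mc_value": 0.5,
            "children": [{"text": "step C", "mc_value": 0.0, "children": []}],
        },
    ],
}

flat_nodes = list(iter_nodes(tree))
for item in flat_nodes:
    print(item["depth"], item["mc_value"], item["text"])
```

Once the nodes are in a flat list, they can be filtered or regrouped into whatever layout a custom training pipeline needs.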
Please wait for us to add support for preprocessing data generated by OmegaPRM-v2. Progress is tracked in #47.
Reopen because #47 was accidentally merged. Progress is tracked in the new PR #52.
System Info
Same as this repo.
Who can help?
@mengfn, @gzqaq, @Gebro13, @YanSong97, @iamlilAJ
Reproduction
First I generated data with data/omegaPRM_v2 as below: "question": "In three-dimensional space, find the number of lattice points that have a distance of 3 from the origin.\n\nNote: A point is a lattice point if all its coordinates are integers.", "final_answer": "30", "reasoning_steps": [ { "solution_prefix": "xx", "mc_value": 0.5 }, { "solution_prefix": "xx", "mc_value": 1.0 }...
Next, I want to train the PRM model using prm/code/finetune_qwen.py. It seems that I should preprocess the data above first. I looked at preprocess, where the authors give an example in tests/samples_math-aps.json, as below: [{"q": "Find the sum of all integers $k$ such that $\binom{23}{4} + \binom{23}{5} = \binom{24}{k}$.", "states": [{"rollout": " To solve the equation \(\binom{23}{4} + \binom{23}{5} = \binom{24}{k}\), we start by using the identity for binomial coefficients that states \(\binom{n}{r} + \binom{n}{r+1} = \binom{n+1}{r+1}\). Here, \(n = 23\) and \(r = 4\), so we have:\n\n\[\n\binom{23}{4} + \binom{23}{5} = \binom{24}{5}\n\]\n\nThis means that \(\binom{24}{k} = \binom{24}{5}\). The binomial coefficient \(\binom{24}{k}\) is symmetric, i.e., \(\binom{24}{k} = \binom{24}{24-k}\).", "state": "", "mcs": 0.6307692307692307},....
It seems that it is the input of preprocess/src/preprocessors/math_aps.py.
But the data generated by OmegaPRMv2 differs from the data that preprocess expects.
How can I convert the data generated by OmegaPRMv2 so that it can be preprocessed for PRM training?
Expected behavior
A way to convert the data generated by OmegaPRMv2 so that it can be preprocessed for PRM training.
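For illustration, a rough sketch of one possible conversion from the flat OmegaPRMv2 output to the q/states layout seen in tests/samples_math-aps.json. Everything about the mapping is an assumption, not a confirmed behavior of the repo: the function name to_math_aps_like is hypothetical, solution_prefix is placed in "rollout" with an empty "state", mc_value is copied to "mcs", and a leading copy of the question is stripped from the prefix if present. Whether preprocess/src/preprocessors/math_aps.py accepts this layout has not been verified here.

```python
import json
from typing import Any, Dict, List


def to_math_aps_like(record: Dict[str, Any]) -> Dict[str, Any]:
    """Map one flat OmegaPRMv2 record to a {"q", "states"} dict (assumed mapping)."""
    question = record["question"].strip()
    states: List[Dict[str, Any]] = []
    for step in record["reasoning_steps"]:
        prefix = step["solution_prefix"]
        # Assumption: the prefix may start with the question text; drop it if so.
        if prefix.startswith(question):
            prefix = prefix[len(question):]
        states.append(
            {
                "rollout": prefix.strip(),  # assumed target field
                "state": "",                # left empty, as in the first sample entry
                "mcs": step["mc_value"],    # assumed to correspond to mc_value
            }
        )
    return {"q": question, "states": states}


# Made-up record with the same keys as the flat output shown above.
record = {
    "question": "What is 1+1?\n",
    "final_answer": "2",
    "reasoning_steps": [
        {"solution_prefix": "What is 1+1?\nAdd the numbers.", "mc_value": 0.5},
        {"solution_prefix": "Add the numbers. The answer is 2.", "mc_value": 1.0},
    ],
}

converted = to_math_aps_like(record)
print(json.dumps([converted], indent=2))
```

The split between "state" and "rollout" is the part most likely to need adjusting once the preprocessor's expectations are known.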