openreasoner / openr

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

https://openreasoner.github.io/

MIT License

1.07k stars, 79 forks

issues

What's the difference between `openr/data` and `openr/data/omegaPRM_v2`?

#61 ekonwang opened 9 hours ago
0
How much GPU memory is needed for RL training of a 7B model?

#60 linyaoyang opened 1 day ago
1
some questions about vanila_mcts

#59 wphtrying opened 3 days ago
1
sh scripts/eval/cot_greedy.sh

#58 Tantao122200 closed 3 days ago
0
Will this project support PRM training with soft labels?

#57 Dada-Cloudzxy opened 1 week ago
1
Officially remove OmegaPRM-v1

#56 gzqaq opened 1 week ago
2
The discussion WeChat group QR code has expired

#55 Siegfried-qgf closed 1 week ago
1
Some possible errors in OmegaPRMv2 / omegaprm.py

#54 FanqingM closed 1 week ago
1
Some improvement on [Reasoning]

#53 YanSong97 opened 1 week ago
0
Add support for preprocessing omega-prm-v2 datasets

#52 gzqaq closed 4 days ago
2
Save omega-prm-v2 results to a single file

#51 gzqaq closed 1 week ago
0
JSON decoding failed. A bug with about a 20% chance of occurring.

#50 Dada-Cloudzxy opened 1 week ago
9
Is this normal? MCTS is worse than the COT method.

#49 Dada-Cloudzxy opened 1 week ago
13
Possible Out of Index bug when reasoning with Qwen

#48 Dada-Cloudzxy opened 1 week ago
4
Preprocess for omega-prm-v2

#47 gzqaq closed 1 week ago
0
How can I change the data format to preprocess the data generated by data/omegaPRM_v2

#46 FanqingM closed 4 days ago
5
Where is the fine-tuned PRM (e.g., Qwen or Llama) leveraged?

#45 mustardBloom opened 2 weeks ago
0
Support self-refining Critic-MCTS

#44 YanSong97 opened 2 weeks ago
0
Is multi-node, multi-GPU training supported for training 70B+ PRMs?

#43 banksy23 opened 2 weeks ago
1
Support LLM-guided Self-Refinement MCTS

#42 YanSong97 opened 2 weeks ago
1
Fail to load the provided Math-psa reward model

#41 YanSong97 closed 2 weeks ago
1
Update data generation to support API models

#40 DylanLi-Hang closed 2 weeks ago
3
mismatch dimension on reasoning when num_com>=2

#39 JingerAI closed 2 weeks ago
10
How much video memory is required for a single card to run two models?

#38 reedest7 opened 3 weeks ago
2
Can no longer join the WeChat group by scanning the QR code

#37 chinoll closed 3 weeks ago
2
What is the source of the dataset? Could you provide the fine-tuned mistral-7b-sft and math-shepherd-mistral-7b-prm models?

#36 Brainth opened 3 weeks ago
1
docs: add Japanese README file

#35 eltociear opened 3 weeks ago
0
Unstable loss when running finetune_qwen.py

#34 Quinn777 opened 3 weeks ago
0
Implement rStar MCTS inference PR #30

#33 YanSong97 closed 2 weeks ago
0
The WeChat group QR code has expired, thanks

#32 ShuoZheLi closed 3 weeks ago
0
preprocessing for prm800k is unavailable

#31 TURLEing closed 3 weeks ago
0
[Reason] Support rStar MCTS

#30 YanSong97 closed 2 weeks ago
0
About the test-time computation experiments on search methods

#29 ycjing closed 3 weeks ago
3
AttributeError: '_OpNamespace' '_C' object has no attribute 'rms_norm'

#28 jeffyeylw closed 4 weeks ago
6
Need Help? Click Here!

#27 YanSong97 opened 1 month ago
0
Wrong candidate tokens and wrong corresponding logits and scores

#26 ljb121002 opened 1 month ago
1
Reward token recognition

#25 ljb121002 opened 1 month ago
0
[WIP] Implement vanilla MCTS

#24 ziyuwan closed 4 weeks ago
2
Data generation issues

#23 ccp123456789 opened 1 month ago
5
decouple RM format_str and LM format_str

#22 ziyuwan closed 1 month ago
0
fix(typo): fix argparser typo

#21 00INDEX closed 1 month ago
0
Does the training support standalone multi-card, distributed and larger models like qwen2.5 72b?

#20 wphtrying opened 1 month ago
3
Unable to execute create_service_math_shepherd successfully

#19 rocky-lq closed 3 weeks ago
4
When I execute sh scripts/eval/cot_greedy.sh, I get an error `requests.exception.MissingSchema: Invalid URL '/worker_generate': scheme not provided. Perhaps you mean https:///worker_generate?`

#18 Brainth closed 1 week ago
17
Has there been any attempt to replace Qwen2.5-Math-RM-72B as a Reward Model with another relatively small model?

#17 jeffyeylw closed 1 month ago
1
decouple policy_format_str and prm_format_str

#16 ziyuwan closed 1 month ago
0
Small bugs about string post-processing in RMRemoteCaller

#15 ziyuwan closed 1 month ago
0
Error when running reinforcement learning (RL) training (train_math.py)

#14 ChenLong-UCAS closed 1 month ago
5
questions about Training using train_llm.sh

#13 wphtrying opened 1 month ago
3
Tutorial typo

#12 LIO-H-ZEN closed 2 weeks ago
1