MARIO-Math-Reasoning
/
Super_MARIO
MIT License
254
stars
16
forks
issues
Which model should be used for the sampling of the first round of MCTS training data?
#27
nomadlx
opened
1 day ago
1
About the data format in different rounds
#26
annisamansa
opened
1 week ago
2
A question about value update
#25
MlSAKA-MlKOTO
opened
2 weeks ago
1
How to train an instruct model?
#24
jt4n
opened
3 weeks ago
2
What's the training data?
#23
sjtuytc
closed
4 weeks ago
1
Potential issue
#22
sfc-gh-zhyao
closed
4 weeks ago
4
Potential Wrong Logic
#21
sfc-gh-zhyao
closed
1 month ago
2
Error occurred at SFT training
#20
jt4n
closed
1 month ago
10
Problem running evaluation
#19
vgaraujov
closed
4 weeks ago
1
Issue about requirements file
#18
LePanda026
closed
2 months ago
3
type of template for training
#17
vgaraujov
closed
3 months ago
8
About Training data generation.
#16
George-Chia
closed
3 months ago
2
Possible problems about the training dataset
#15
FlyingDutchman26
closed
5 months ago
2
Why are the batch size and number of epochs much larger than common SFT settings?
#14
tongyx361
closed
5 months ago
3
Is the model initialized from pre-trained model or model from the last iteration round for each round?
#13
tongyx361
closed
5 months ago
2
Why not directly generate the value, but instead add a value head? Could you explain the reasoning behind this decision?
#12
yanzhenqiang
closed
5 months ago
1
value estimation twice?
#11
platoonpluto
closed
5 months ago
5
AttributeError: 'RequestOutput' object has no attribute 'value_estimate'
#10
yanzhenqiang
closed
6 months ago
1
MCTS training data generation in round1
#9
platoonpluto
closed
6 months ago
1
training code
#8
jordane95
closed
6 months ago
2
How to set B1 in Step level Beam Search
#7
xiaolizh1
closed
6 months ago
3
How to initialize first generation child nodes?
#6
Jeff123z
closed
6 months ago
1
Update solver_demo.py
#5
eltociear
closed
6 months ago
0
Mathematical reasoning is itself an asymmetric two-player game problem
#4
hxypqr
closed
6 months ago
3
About the code
#3
liushz
closed
6 months ago
3
AlphaMath listed as AlaphaMath in Huggingface
#2
1of13
closed
6 months ago
1
Concern on (first few rounds) sampling efficacy
#1
billxbf
closed
6 months ago
3