issues
simplescaling/s1
s1: Simple test-time scaling
https://arxiv.org/abs/2501.19393
Apache License 2.0
6.19k stars, 725 forks
What is the content of "results/gemini/{subdir}/*.json"?
#114
Pyjacc
opened
3 hours ago
1
Reproduce the test-time scaling of s1.1
#113
Edric-Zhao
closed
1 week ago
5
Is the code data used in the training data?
#112
xuhu0115
opened
1 week ago
4
Training time shows 18 hr on 16 * H100
#111
kartikjain-sudo
opened
1 week ago
2
Config for smaller models
#110
DoubtedSteam
opened
2 weeks ago
1
Why generate more than 8k tokens at a time?
#109
JaydencoolCC
opened
2 weeks ago
1
Issue Reproducing s1.1-32B Training Loss (Observed vs. WandB)
#108
dzh19990407
opened
2 weeks ago
8
Can I specify the thinking context of the query when running s1 inference with vLLM?
#107
Siki-cloud
opened
3 weeks ago
1
Qwen2.5-32B cannot be fine-tuned on an 80GB A100 with max_seq_length=32768
#106
Gresham429
opened
3 weeks ago
4
Are incorrect responses used for training?
#105
Ignoramus0817
closed
3 weeks ago
1
Why is the model file size 129g?
#104
wikithink
opened
4 weeks ago
1
The minimum GPU resources needed to fine-tune the 32B model?
#103
JaydencoolCC
opened
4 weeks ago
1
Error in Evaluating on Other Dataset (AssertionError: min_tokens_thinking only supports until_thinking tokens that are 1 token long)
#102
jd730
opened
1 month ago
3
Exploring LoRA Adapters & Low Precision Training in S1 for Enhanced Test-Time Scaling
#101
goravaa
opened
1 month ago
0
Missing thinking tokens flag for no budget forcing eval
#100
kothasuhas
opened
1 month ago
2
Question about Idavidrein/gpqa data source in s1k-1.1 dataset
#99
czczup
closed
1 month ago
2
[Evaluating trained model] safetensors_rust.SafetensorError: Error while deserializing header: MetadataIncompleteBuffer
#98
jd730
closed
1 month ago
1
s1-32b keeps generating same trajectories and final answer regardless of changing of seeds
#97
nichenshun
opened
1 month ago
1
Experimental communication
#96
WYH1597650869
opened
1 month ago
1
ValueError: please provide at least one prompt
#95
TikaToka
opened
1 month ago
7
Special tag in "text" column of simplescaling/s1K-1.1_tokenized
#94
JH-ninjatech
closed
1 month ago
3
Token-conditional control code?
#93
kyZhao-1
opened
1 month ago
1
How is the AIME24 dataset of 30 rows created?
#92
Stefanie-Anna
opened
1 month ago
1
About Figure 4 (a) and Table 1
#91
Aegis1863
opened
1 month ago
3
Distilling from Claude 3.7
#90
jonathanyin12
closed
1 month ago
1
Question on rejection sampling and Fig 6 in paper.
#89
lihkinVerma
opened
1 month ago
1
Diff for lm-evaluation-harness changes?
#88
akhauriyash
closed
1 month ago
1
Is OPENAI_API_KEY necessary?
#87
kyZhao-1
opened
1 month ago
3
Any plan about s1.1 data release
#86
ruleGreen
closed
1 month ago
1
Gemini thinking flash API no longer returns thoughts and response separately
#85
SusMaria
opened
1 month ago
1
groundtruth answer of s1 dataset
#84
jiayuww
opened
1 month ago
1
What is the reason for setting pad_token to an unused token?
#83
blackcherry88
opened
1 month ago
4
Question about evaluation
#82
ian00000
closed
1 month ago
4
How do we verify the generated proof?
#81
agoyang
closed
1 month ago
0
How do we verify the generated proof?
#80
agoyang
closed
1 month ago
0
How do we verify the generated proof?
#79
agoyang
closed
1 month ago
0
How do we verify the generated proof?
#78
agoyang
closed
1 month ago
0
How do we verify the generated proof?
#77
agoyang
closed
1 month ago
0
How do we verify the generated proof?
#76
agoyang
closed
1 month ago
0
How do we verify the generated proof?
#75
agoyang
closed
1 month ago
0
How do we verify the generated proof?
#74
agoyang
opened
1 month ago
1
Script to generate traces via DeepSeek R1 seems to be missing
#73
nileshtrivedi
opened
1 month ago
1
Inference Token Inclusion and SFT Question Loss Calculation Queries
#72
ruio248
opened
1 month ago
1
Less than 1000 samples after "Selected benchmark related data"
#71
tonychenxyz
opened
1 month ago
5
release model response
#70
wanghanbinpanda
closed
1 month ago
2
[Question or Exploratory Analysis] Frequency Domain Insights for Thinking Trajectories Enhancement
#69
SamYuan1990
opened
1 month ago
0
output of data/featurization.py
#68
tonychenxyz
opened
1 month ago
4
filter.ipynb missing
#67
tonychenxyz
closed
1 month ago
0
Model context length vs output/thinking length
#66
jonathanyin12
closed
1 month ago
1
Chinese characters in the returned response
#65
JH-ninjatech
closed
1 month ago
3