EleutherAI/gpt-neox
An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries
https://www.eleuther.ai/
Apache License 2.0
6.95k stars
1.02k forks
issues
TE Import Hotfix
#1272
Quentin-Anthony
closed
2 months ago
1
Hotfix Activation Typo
#1271
Quentin-Anthony
closed
2 months ago
0
Formatting and Fix Mamba Config
#1270
Quentin-Anthony
closed
2 months ago
0
LayerNorm Refactor
#1269
aurelion-source
closed
2 months ago
3
Allow training without knowing num_iters
#1268
StellaAthena
closed
1 week ago
1
Add assert to check for missing tokenizer_type in config. [#1231]
#1267
AI-WAIFU
closed
2 months ago
1
Add initial ring flash attention support
#1266
dmahan93
opened
2 months ago
1
add Apex fused RMS norm
#1265
dmahan93
closed
2 months ago
1
Frontier
#1264
jahatef
closed
3 months ago
1
Improve performance of sequence parallel gather, scatter, and reduce
#1263
bclyang
closed
3 months ago
0
mamba fixes and cleaning
#1262
jahatef
closed
2 months ago
2
Comet integration
#1261
jverre
closed
2 months ago
2
Fix gather and reduce scatter ops on sequence dimension
#1260
bclyang
closed
3 months ago
0
Fix LayerNorm all reduce gradient hook
#1259
bclyang
closed
3 months ago
1
bugfix: chat turns instead of repeating the conversation in preprocess_data_with_chat_template.py
#1258
dmahan93
closed
3 months ago
0
Megatron-LM style Sequence Parallel
#1257
haileyschoelkopf
closed
3 months ago
3
GitHub actions fix
#1256
jahatef
closed
3 months ago
0
Add new cites
#1255
StellaAthena
closed
3 months ago
1
How to Load Model from pytorch_model.bin into Trained Model for Text Generation?
#1254
lieh1203
opened
4 months ago
0
what's the biggest dataset you've tried?
#1253
exnx
opened
4 months ago
0
too many .bin files for dataloader, crashed
#1252
exnx
closed
4 months ago
0
Assertion Error when Setting pipe_parallel_size or model_parallel_size in GPT-NeoX
#1251
lieh1203
opened
4 months ago
3
For nucleus sampling, top-p sampling appears to happen on the softmax-normalized top-k logits
#1250
j-frei
closed
2 months ago
3
batch_input and elapsed time per iteration suddenly slow down during model training
#1248
Yuhanleeee
opened
4 months ago
4
Add hf llama to neox conversion
#1247
dmahan93
closed
3 months ago
1
Add Reward Model training
#1246
dmahan93
closed
2 months ago
0
Conversion for CI from self-hosted hardware
#1245
jaimemcc-intel
closed
3 months ago
0
Add KTO training
#1244
dmahan93
closed
2 months ago
0
Replace unsafe `pyyaml` loader with `SafeLoader` (#2)
#1243
pixeeai
closed
2 months ago
1
Add DPO training
#1242
dmahan93
closed
2 months ago
1
Fix paper reference in init_functions.py
#1241
rasbt
closed
4 months ago
2
SFT improvements (labeling fixes, different packing implementations)
#1240
dmahan93
closed
2 months ago
0
Add a chat data preprocessing script
#1239
dmahan93
closed
5 months ago
0
Pr1212
#1238
jahatef
closed
5 months ago
0
Add tensor parallelism for RWKV
#1237
jahatef
opened
5 months ago
0
Ville dev
#1236
Vmjkom
closed
5 months ago
1
Add Transformer Engine's version of RMSNorm and LayerNorm
#1235
lintangsutawika
closed
2 months ago
2
fix python version and pytest install
#1234
jahatef
closed
5 months ago
5
add workflow_dispatch to gh actions pr so we can run on command
#1233
jahatef
closed
5 months ago
0
init changes to README
#1232
jaimemcc-intel
closed
5 months ago
0
Cannot convert neox model to HF
#1231
srivassid
opened
5 months ago
2
How to set the ffn hidden size parameter in gpt neox
#1230
IronMan-WangJinxi
closed
2 months ago
2
Cannot perform inference, be it unconditional. input-file or interactive
#1228
srivassid
closed
5 months ago
2
The results of running eval show only 1 digit after decimal point for acc on all tested tasks
#1227
lernerjenny
closed
4 months ago
2
Add Torch Profiler Support
#1226
DayOfThePenguin
closed
6 months ago
0
Add lora support
#1225
mkerin
opened
6 months ago
0
fixed fused_rope naming in JIT + Readme
#1224
R0n12
closed
6 months ago
0
Change python invocation syntax
#1223
jaimemcc-intel
closed
5 months ago
0
Small tidying
#1222
yang
closed
6 months ago
0
Rwkv pipeline parallelism
#1221
jahatef
closed
6 months ago
1