EleutherAI gpt-neox issues

An implementation of model parallel autoregressive transformers on GPUs, based on the Megatron and DeepSpeed libraries

https://www.eleuther.ai/

Apache License 2.0

6.95k stars 1.02k forks

issues

TE Import Hotfix

#1272 Quentin-Anthony closed 2 months ago
1
Hotfix Activation Typo

#1271 Quentin-Anthony closed 2 months ago
0
Formatting and Fix Mamba Config

#1270 Quentin-Anthony closed 2 months ago
0
LayerNorm Refactor

#1269 aurelion-source closed 2 months ago
3
Allow training without knowing num_iters

#1268 StellaAthena closed 1 week ago
1
Add assert to check for missing tokenizer_type in config. [#1231]

#1267 AI-WAIFU closed 2 months ago
1
Add initial ring flash attention support

#1266 dmahan93 opened 2 months ago
1
add Apex fused RMS norm

#1265 dmahan93 closed 2 months ago
1
Frontier

#1264 jahatef closed 3 months ago
1
Improve performance of sequence parallel gather, scatter, and reduce

#1263 bclyang closed 3 months ago
0
mamba fixes and cleaning

#1262 jahatef closed 2 months ago
2
Comet integration

#1261 jverre closed 2 months ago
2
Fix gather and reduce scatter ops on sequence dimension

#1260 bclyang closed 3 months ago
0
Fix LayerNorm all reduce gradient hook

#1259 bclyang closed 3 months ago
1
bugfix: chat turns instead of repeating the conversation in preprocess_data_with_chat_template.py

#1258 dmahan93 closed 3 months ago
0
Megatron-LM style Sequence Parallel

#1257 haileyschoelkopf closed 3 months ago
3
GitHub actions fix

#1256 jahatef closed 3 months ago
0
Add new cites

#1255 StellaAthena closed 3 months ago
1
How to Load Model from pytorch_model.bin into Trained Model for Text Generation?

#1254 lieh1203 opened 4 months ago
0
what's the biggest dataset you've tried?

#1253 exnx opened 4 months ago
0
too many .bin files for dataloader, crashed

#1252 exnx closed 4 months ago
0
Assertion Error when Setting pipe_parallel_size or model_parallel_size in GPT-NeoX

#1251 lieh1203 opened 4 months ago
3
For nucleus sampling, top-p sampling appears to happen on the softmax-normalized top-k logits

#1250 j-frei closed 2 months ago
3
batch_input and elapsed time per iteration suddenly slow down during model training

#1248 Yuhanleeee opened 4 months ago
4
Add hf llama to neox conversion

#1247 dmahan93 closed 3 months ago
1
Add Reward Model training

#1246 dmahan93 closed 2 months ago
0
Conversion for CI from self-hosted hardware

#1245 jaimemcc-intel closed 3 months ago
0
Add KTO training

#1244 dmahan93 closed 2 months ago
0
Replace unsafe `pyyaml` loader with `SafeLoader` (#2)

#1243 pixeeai closed 2 months ago
1
Add DPO training

#1242 dmahan93 closed 2 months ago
1
Fix paper reference in init_functions.py

#1241 rasbt closed 4 months ago
2
SFT improvements (labeling fixes, different packing implementations)

#1240 dmahan93 closed 2 months ago
0
Add a chat data preprocessing script

#1239 dmahan93 closed 5 months ago
0
Pr1212

#1238 jahatef closed 5 months ago
0
Add tensor parallelism for RWKV

#1237 jahatef opened 5 months ago
0
Ville dev

#1236 Vmjkom closed 5 months ago
1
Add Transformer Engine's version of RMSNorm and LayerNorm

#1235 lintangsutawika closed 2 months ago
2
fix python version and pytest install

#1234 jahatef closed 5 months ago
5
add workflow_dispatch to gh actions pr so we can run on command

#1233 jahatef closed 5 months ago
0
init changes to README

#1232 jaimemcc-intel closed 5 months ago
0
Cannot convert neox model to HF

#1231 srivassid opened 5 months ago
2
How to set the ffn hidden size parameter in gpt neox

#1230 IronMan-WangJinxi closed 2 months ago
2
Cannot perform inference, be it unconditional. input-file or interactive

#1228 srivassid closed 5 months ago
2
The results of running eval show only 1 digit after decimal point for acc on all tested tasks

#1227 lernerjenny closed 4 months ago
2
Add Torch Profiler Support

#1226 DayOfThePenguin closed 6 months ago
0
Add lora support

#1225 mkerin opened 6 months ago
0
fixed fused_rope naming in JIT + Readme

#1224 R0n12 closed 6 months ago
0
Change python invocation syntax

#1223 jaimemcc-intel closed 5 months ago
0
Small tidying

#1222 yang closed 6 months ago
0
Rwkv pipeline parallelism

#1221 jahatef closed 6 months ago
1
