kingoflolz/mesh-transformer-jax
Model parallel transformers in JAX and Haiku
Apache License 2.0 · 6.26k stars · 890 forks
Issues
| # | Title | Author | Status | Opened | Comments |
|---|-------|--------|--------|--------|----------|
| #213 | `Incompatible checkpoints` error when running `slim_model.py` | danyaljj | closed | 2 years ago | 1 |
| #212 | `TypeError: Cannot subclass <class 'typing._SpecialForm'>` in `slim_model.py` | danyaljj | closed | 2 years ago | 3 |
| #211 | GPT-J: perplexity for checkpoints | danyaljj | closed | 2 years ago | 1 |
| #210 | The latest update to PIP breaks installation | markriedl | open | 2 years ago | 7 |
| #209 | RESOURCE_EXHAUSTED: Failed to allocate request for 256.00MiB (268435456B) on device ordinal 0 | Hendler | closed | 2 years ago | 1 |
| #208 | Instruct GPT fine tuning | Jbollenbacher | closed | 2 years ago | 1 |
| #207 | Did you use the splits made by the Pile directly? | boyang9602 | closed | 2 years ago | 0 |
| #206 | To run multiple model on one model | whoislimshady | open | 2 years ago | 0 |
| #205 | Error! | DimIsaev | closed | 2 years ago | 1 |
| #204 | Using `no_repeat_ngram_size` like HF | nikhilanayak | closed | 2 years ago | 3 |
| #203 | Seen floating point types of different precisions in %opt-barrier | mrseeker | open | 2 years ago | 1 |
| #202 | TpuEmbeddingEngine_WriteParameters not available in this library. | nikhilanayak | closed | 2 years ago | 11 |
| #201 | No TPU found, falling back to CPU | 0x7o | closed | 2 years ago | 1 |
| #200 | invalid syntax in to_hf_weights.py and device_train.py | rahimikia | closed | 2 years ago | 1 |
| #199 | how to implement stop sequence in gpt j | whoislimshady | closed | 2 years ago | 3 |
| #198 | Error: AssertionError: Incompatible checkpoints (8,) vs (8, 4096) | ljj430 | closed | 2 years ago | 1 |
| #197 | how to restart training | whoislimshady | closed | 2 years ago | 0 |
| #196 | how to speed up the inference time | whoislimshady | closed | 2 years ago | 2 |
| #194 | looking at multiple versions of different packages (slow progress of requirements file) | rahimikia | open | 2 years ago | 4 |
| #193 | Fine Tuning Dataset Format | teamnetsol | open | 2 years ago | 1 |
| #192 | having issue in running the model | whoislimshady | closed | 2 years ago | 0 |
| #191 | Is "to_hf_weights.py" specific to "6B_roto_256.json" only? | leejason | open | 2 years ago | 0 |
| #190 | save_config_to_hf_format() | leejason | open | 2 years ago | 0 |
| #189 | gpt-neo models are not compatible with this codebase | leejason | open | 2 years ago | 0 |
| #188 | Fine-tuning | preste-naava | closed | 2 years ago | 1 |
| #187 | How does it work? | ayaka14732 | closed | 2 years ago | 3 |
| #186 | Error fine-tuning train | DimIsaev | closed | 2 years ago | 0 |
| #185 | Verifying logic for LR schedule | gupta-abhay | closed | 2 years ago | 0 |
| #184 | HF model does not work on Torch/XLA | TiesdeKok | closed | 2 years ago | 1 |
| #183 | Fix the download link for the weights | versae | closed | 2 years ago | 1 |
| #182 | update links | leogao2 | closed | 2 years ago | 0 |
| #181 | read_ckpt getting killed (OOM?) | gaycomputers | closed | 2 years ago | 4 |
| #180 | to_hf_weights.py cpu assertion error | kirchner-jan | closed | 2 years ago | 0 |
| #179 | fix tqdm version conflict | widiba03304 | closed | 2 years ago | 2 |
| #178 | Minor requirement conflict on tqdm | widiba03304 | closed | 2 years ago | 1 |
| #177 | smaller models | leejason | closed | 2 years ago | 2 |
| #176 | Weight download problem | paramjeet2021 | closed | 2 years ago | 5 |
| #175 | sequence_length=2049 or 2048? | leejason | closed | 2 years ago | 3 |
| #174 | jax/haiku versions incompatible? | cifkao | open | 2 years ago | 0 |
| #173 | Generating random numbers – None PRNGKey error | cifkao | closed | 2 years ago | 1 |
| #172 | to_hf_weights script returns "Failed to allocate" error | guidomeijer | closed | 2 years ago | 2 |
| #171 | Can I do Fine-Tune GPT-J in colab pro? | haositongxue | closed | 2 years ago | 1 |
| #170 | the-eye.eu down - alternative access to GPT-J-6B/step_383500_slim.tar.zstd ? | PhilWicke | closed | 2 years ago | 13 |
| #169 | end sequence possible? | sharaku17 | closed | 2 years ago | 1 |
| #168 | add Gopher 230B results | djoldman | closed | 2 years ago | 1 |
| #167 | Colab Demo Notebook Not Working | CircuitGuy | open | 2 years ago | 11 |
| #166 | How to do pre-train from scratch ? | kamalkraj | closed | 2 years ago | 1 |
| #165 | Pre trained weights for transfer learning | paramjeet2021 | closed | 2 years ago | 1 |
| #164 | top-k sampling off by 1 bug | mar-muel | closed | 2 years ago | 1 |
| #162 | Colab version now breaks on "import optax" | mesotron | closed | 2 years ago | 1 |