kingoflolz/mesh-transformer-jax
Model parallel transformers in JAX and Haiku
Apache License 2.0 · 6.26k stars · 890 forks
Issues
| # | Title | Author | Status | Opened | Comments |
|---|-------|--------|--------|--------|----------|
| #213 | `Incompatible checkpoints` error when running `slim_model.py` | danyaljj | closed | 2 years ago | 1 |
| #212 | `TypeError: Cannot subclass <class 'typing._SpecialForm'>` in `slim_model.py` | danyaljj | closed | 2 years ago | 3 |
| #211 | GPT-J: perplexity for checkpoints | danyaljj | closed | 2 years ago | 1 |
| #210 | The latest update to PIP breaks installation | markriedl | open | 2 years ago | 7 |
| #209 | RESOURCE_EXHAUSTED: Failed to allocate request for 256.00MiB (268435456B) on device ordinal 0 | Hendler | closed | 2 years ago | 1 |
| #208 | Instruct GPT fine tuning | Jbollenbacher | closed | 2 years ago | 1 |
| #207 | Did you use the splits made by the Pile directly? | boyang9602 | closed | 2 years ago | 0 |
| #206 | To run multiple model on one model | whoislimshady | open | 2 years ago | 0 |
| #205 | Error! | DimIsaev | closed | 2 years ago | 1 |
| #204 | Using `no_repeat_ngram_size` like HF | nikhilanayak | closed | 2 years ago | 3 |
| #203 | Seen floating point types of different precisions in %opt-barrier | mrseeker | open | 2 years ago | 1 |
| #202 | TpuEmbeddingEngine_WriteParameters not available in this library. | nikhilanayak | closed | 2 years ago | 11 |
| #201 | No TPU found, falling back to CPU | 0x7o | closed | 2 years ago | 1 |
| #200 | invalid syntax in to_hf_weights.py and device_train.py | rahimikia | closed | 2 years ago | 1 |
| #199 | how to implement stop sequence in gpt j | whoislimshady | closed | 2 years ago | 3 |
| #198 | Error: AssertionError: Incompatible checkpoints (8,) vs (8, 4096) | ljj430 | closed | 2 years ago | 1 |
| #197 | how to restart training | whoislimshady | closed | 2 years ago | 0 |
| #196 | how to speed up the inference time | whoislimshady | closed | 2 years ago | 2 |
| #194 | looking at multiple versions of different packages (slow progress of requirements file) | rahimikia | open | 2 years ago | 4 |
| #193 | Fine Tuning Dataset Format | teamnetsol | open | 2 years ago | 1 |
| #192 | having issue in running the model | whoislimshady | closed | 2 years ago | 0 |
| #191 | Is "to_hf_weights.py" specific to "6B_roto_256.json" only? | leejason | open | 2 years ago | 0 |
| #190 | save_config_to_hf_format() | leejason | open | 2 years ago | 0 |
| #189 | gpt-neo models are not compatible with this codebase | leejason | open | 2 years ago | 0 |
| #188 | Fine-tuning | preste-naava | closed | 2 years ago | 1 |
| #187 | How does it work? | ayaka14732 | closed | 2 years ago | 3 |
| #186 | Error fine-tuning train | DimIsaev | closed | 2 years ago | 0 |
| #185 | Verifying logic for LR schedule | gupta-abhay | closed | 2 years ago | 0 |
| #184 | HF model does not work on Torch/XLA | TiesdeKok | closed | 2 years ago | 1 |
| #183 | Fix the download link for the weights | versae | closed | 2 years ago | 1 |
| #182 | update links | leogao2 | closed | 2 years ago | 0 |
| #181 | read_ckpt getting killed (OOM?) | gaycomputers | closed | 2 years ago | 4 |
| #180 | to_hf_weights.py cpu assertion error | kirchner-jan | closed | 2 years ago | 0 |
| #179 | fix tqdm version conflict | widiba03304 | closed | 2 years ago | 2 |
| #178 | Minor requirement conflict on tqdm | widiba03304 | closed | 2 years ago | 1 |
| #177 | smaller models | leejason | closed | 2 years ago | 2 |
| #176 | Weight download problem | paramjeet2021 | closed | 2 years ago | 5 |
| #175 | sequence_length=2049 or 2048? | leejason | closed | 2 years ago | 3 |
| #174 | jax/haiku versions incompatible? | cifkao | open | 2 years ago | 0 |
| #173 | Generating random numbers – None PRNGKey error | cifkao | closed | 2 years ago | 1 |
| #172 | to_hf_weights script returns "Failed to allocate" error | guidomeijer | closed | 2 years ago | 2 |
| #171 | Can I do Fine-Tune GPT-J in colab pro? | haositongxue | closed | 2 years ago | 1 |
| #170 | the-eye.eu down - alternative access to GPT-J-6B/step_383500_slim.tar.zstd ? | PhilWicke | closed | 2 years ago | 13 |
| #169 | end sequence possible? | sharaku17 | closed | 2 years ago | 1 |
| #168 | add Gopher 230B results | djoldman | closed | 2 years ago | 1 |
| #167 | Colab Demo Notebook Not Working | CircuitGuy | open | 2 years ago | 11 |
| #166 | How to do pre-train from scratch ? | kamalkraj | closed | 2 years ago | 1 |
| #165 | Pre trained weights for transfer learning | paramjeet2021 | closed | 2 years ago | 1 |
| #164 | top-k sampling off by 1 bug | mar-muel | closed | 2 years ago | 1 |
| #162 | Colab version now breaks on "import optax" | mesotron | closed | 2 years ago | 1 |