kingoflolz
/
mesh-transformer-jax
Model parallel transformers in JAX and Haiku
Apache License 2.0
6.29k
stars
892
forks
issues
Colab version now breaks on "import optax"
#162
mesotron
closed
2 years ago
1
sample data configuration for finetuning
#160
AnassKartit
closed
2 years ago
1
Execute the model in a local machine (or WSL)
#159
Dr-NULL
closed
2 years ago
1
How to run v3-128?
#158
soneo1127
closed
2 years ago
2
Freeze Transformer Weight
#157
ivokun
closed
2 years ago
3
Finetuning and training minimum requirements
#156
AnassKartit
closed
2 years ago
1
limitation min_length=1024
#155
ghost
closed
2 years ago
1
Save failed during checkpoint saving function call
#154
VishalSharmavj
opened
2 years ago
4
Please help with fine-tuning small dataset
#153
ilyakar
closed
2 years ago
2
Access denied for gs://neo-datasets/openwebtext2_new_inputs/eval/openwebtext_9_7_100000.tfrecords
#152
elsakra
closed
2 years ago
0
Update requirements.txt
#151
reouno
closed
2 years ago
1
Colab error with latest requirements.txt
#150
lightyrs
closed
2 years ago
2
OK, so here we have a Writer model and the co-pilot model, if I'm right? These two could surely be used to build a tool that guides people learning to code?
#149
Swoop376
closed
2 years ago
0
Regarding Fine-tuning.
#148
BakingBrains
closed
2 years ago
2
Formatting and Problems while creating tfrecords file for fine tuning gpt. 0 byte tfrecords getting created.
#147
VishalSharmavj
closed
2 years ago
3
Unclear how to best build and run the infrastructure
#146
jleacox
closed
2 years ago
2
Colab demo does not work
#145
JialuZhang
closed
2 years ago
3
lm-evaluation-harness dep bump breaks build
#144
kremlin-
closed
2 years ago
4
unable to extract step_383500.tar.zstd.
#143
davzeng
closed
2 years ago
3
need guide with docker, not an issue
#142
noman00910
closed
2 years ago
4
checkpoint saving args broken/unused
#141
ablacklama
closed
1 year ago
1
added explicit support for tfrecord creation from single file.
#140
ablacklama
closed
3 years ago
1
problem in downloading the slim weights
#139
whoislimshady
closed
3 years ago
5
Add Megatron-Turing NLG 530B numbers
#138
djoldman
closed
2 years ago
0
Google api Exception while finetuning model.
#137
Aryagm
closed
3 years ago
1
create_finetune_tfrecords.py getting killed prematurely.
#136
Aryagm
closed
3 years ago
2
how to print-debugging inside model
#135
jiasenlu
closed
3 years ago
1
How to launch the train.py
#134
jiasenlu
closed
3 years ago
1
Improvement in accuracy
#133
paramjeet2021
closed
3 years ago
1
more than 1024 tokens
#132
Wajih88
closed
3 years ago
1
Fixing config saving bug in to_hf_weights.py + adding pathy to requirements
#131
ablacklama
closed
3 years ago
1
Can't write config of converted hf weights to gs bucket
#130
kevinpl07
closed
3 years ago
2
cannot exit recursive infinite loop in tfrecord_loader.py
#129
bvelker
closed
3 years ago
3
Getting training and validation accuracy while training
#128
albertqjiang
closed
3 years ago
1
Difference between the inputs to GPT-J6B and GPT-2?
#127
BakingBrains
closed
3 years ago
1
Lower memory consumption in Colab demo
#126
vfbd
closed
3 years ago
1
About text generation from keywords
#125
Wajih88
closed
3 years ago
1
Incompatible checkpoints (1,) vs (1, 4096)
#124
niyoushanajmaei
closed
3 years ago
1
Regarding the finetuning of GPT-J6B.
#123
BakingBrains
closed
3 years ago
2
Clarify val_batches
#122
nostalgebraist
closed
3 years ago
1
Add TOC to readme
#121
rozanecm
closed
3 years ago
1
[Colab] Your session crashed after using all available RAM
#120
1234igor
closed
3 years ago
2
Different results while using the web model vs gpt-j-slim weights model on GPU
#119
msakthiganesh
closed
3 years ago
1
Prevent generating excess tokens
#118
msakthiganesh
closed
3 years ago
1
Can I use this script to convert my data to tfrecords?
#117
Aryagm
closed
3 years ago
2
Update howto_finetune.md
#116
StellaAthena
closed
3 years ago
1
Inference speed on TPU
#115
gamcoh
closed
3 years ago
1
Style transfer
#114
jb33k
closed
3 years ago
1
Non-Deterministic Output
#113
jerrygreen
closed
3 years ago
1
Fine tune sequence length
#112
Alexmhack
closed
3 years ago
1