issues
kingoflolz/mesh-transformer-jax
Model parallel transformers in JAX and Haiku
Apache License 2.0
6.29k stars
892 forks
lm_eval missing
#264
falv706
opened
1 month ago
0
6b.eleuther.ai mystic model is down for GPT-J-6B.
#263
Gertie01
opened
8 months ago
0
What about a Hugging Face Spaces demo so we can test this?
#262
Gertie01
opened
11 months ago
0
Web demo must be fixed.
#261
Gertie01
opened
11 months ago
0
About rope embedding
#260
eyuansu62
opened
1 year ago
0
Framework
#259
T3fo0ls7766
opened
1 year ago
0
Finetuning Hardware Recommendations
#258
greyweb
opened
1 year ago
0
Discrepancy between results reported in this repo and in the NeoX paper
#257
ghost
closed
1 year ago
2
How to infer with GPT-J on TPU_driver0.2 or nightly?
#256
mosmos6
closed
1 year ago
1
TPU-V4
#255
wimjan123
opened
1 year ago
11
Quantization for training / finetuning
#254
torphix
opened
1 year ago
0
Which version of Python does this work with?
#253
chrisbward
opened
1 year ago
2
tpu_driver0.1 is not initialized on colab (cannot infer with GPT-J on Colab) [Again]
#252
mosmos6
closed
1 year ago
7
Download Link for Model Weights in howto_finetune.md is broken
#251
torakoneko
opened
1 year ago
2
GPT-J used in "Domain-Specific Text Generation for Machine Translation"
#250
ymoslem
closed
6 months ago
0
First commit
#249
Kasties
closed
1 year ago
0
Fine-tuning on conversations (format of conversations)
#248
Eichhof
opened
1 year ago
2
Minor readme updates to fix linkrot and make it clear which links download files.
#247
StellaAthena
closed
1 year ago
0
Resolving dependency issues
#246
rinapch
opened
1 year ago
6
Could not find a version that satisfies the requirement ray[default]==1.4.1
#245
Maxim-Mazurok
opened
1 year ago
5
Do you have any plans to create the open source version of chatGPT ?
#244
stc2001
opened
1 year ago
2
Can we please get a quickstart guide?
#243
tswallen
opened
1 year ago
2
TPU not found on VM (jax version 0.2.16)
#242
Eichhof
opened
1 year ago
0
Web demo is not launching any results. Might be disconnected from the model.
#241
Gertie01
opened
1 year ago
43
The PILE dataset is full of racist content and thus GPT-J produces racist thinking.
#240
azeemh
opened
1 year ago
2
Project dependencies may have API risk issues
#239
PyDeps
opened
2 years ago
0
Dead link to weights?
#238
samacqua
closed
2 years ago
1
TPU Instance Creation
#237
zzj0402
opened
2 years ago
2
Update the readme with required and recommended hardware list
#236
sxiii
closed
2 years ago
3
on number of training tokens of gpt-j-6b and gpt-neox-20b
#235
xiaoda99
opened
2 years ago
0
Is the treatment of embedding bias in to_hf_weights.py correct?
#234
xiaoda99
closed
2 years ago
2
AttributeError: module 'jaxlib.pocketfft' has no attribute 'pocketfft'
#233
umm-maybe
opened
2 years ago
4
GPT-J-6B Inference Demo notebook giving errors when cores_per_replica=1
#232
batrasakshi
closed
2 years ago
1
Typo in 'to_hf_weights.py '
#231
AmoArt
opened
2 years ago
1
Google Colab Error: optax is throwing an attribute error.
#230
prajjwalgeek
opened
2 years ago
2
[Feature Request] Multilingual assistance.
#229
phly95
opened
2 years ago
0
How to stop model generating
#228
jingrongchen
opened
2 years ago
1
Finetuning GPT Neo 20B Using TPU V3-8s
#227
nikhilanayak
opened
2 years ago
0
6b.eleuther.ai mystic model is down for GPT-J-6B
#226
orionnelson
opened
2 years ago
4
TypeError: __init__() takes 2 positional arguments but 4 were given
#225
ghost
opened
2 years ago
1
Training data format for generating Scenario based MCQ's
#224
shrey10926
closed
2 years ago
2
1
#223
PeezoSlug
closed
2 years ago
0
TypeError: Cannot subclass <class 'typing._SpecialForm'> while fine tuning
#222
samyakai
opened
2 years ago
9
AttributeError: module 'jax.random' has no attribute 'KeyArray' while fine tuning.
#221
samyakai
closed
2 years ago
15
CausalTransformerV2 or CausalTransformer?
#220
leejason
opened
2 years ago
0
GPT-J inference on TPU
#219
airesearch38
opened
2 years ago
3
training stuck at validation step 1
#218
Selimonder
opened
2 years ago
4
Can "slim_model.py" work with "d_model" as 768?
#217
leejason
opened
2 years ago
0
Running on Colab TPU only gives random words and nonsensical outputs
#216
JohnnyRacer
closed
2 years ago
2
`OSError: libmkl_intel_lp64.so.1: cannot open shared object file` when using `to_hf_weights.py`
#215
danyaljj
closed
2 years ago
1