salesforce/jaxformer
Minimal library to train LLMs on TPU in JAX with pjit().
BSD 3-Clause "New" or "Revised" License
267 stars
35 forks
issues
Bump transformers from 4.9.2 to 4.36.0 in /preprocess
#32
dependabot[bot]
opened
8 months ago
0
facing issue while running jaxformer locally.
#31
hemanthgattu
opened
1 year ago
0
Bump transformers from 4.9.2 to 4.30.0 in /preprocess
#30
dependabot[bot]
closed
8 months ago
1
added deepspeed inference
#29
mlap1n
closed
1 year ago
1
Weight decay is applied to the layernorm in transformer block
#28
sh0416
opened
1 year ago
0
Log Z term in loss
#26
sh0416
opened
1 year ago
2
Pipeline Parallelism
#25
sh0416
closed
1 year ago
1
Request for training configuration of CodeGen 16B
#24
sh0416
opened
1 year ago
1
Memory to fine-tune 16B model?
#23
glicerico
opened
1 year ago
0
Bump tensorflow from 2.7.2 to 2.11.1 in /preprocess
#22
dependabot[bot]
opened
1 year ago
0
Bump tensorflow-cpu from 2.7.2 to 2.11.1
#21
dependabot[bot]
opened
1 year ago
0
[Mismatched_sizes] Got mismatched_size exception when loading the finetuned model
#20
Jacob-yen
opened
1 year ago
0
TPU finetuned model corrupts
#19
Happylkx
opened
1 year ago
1
Is the BigQuery dataset public available?
#18
HongtaoYang
opened
1 year ago
4
how to make tfrecord with a released tokenizer
#17
HaebinShin
opened
1 year ago
0
can't find paper
#16
glicerico
opened
1 year ago
1
Fine-tuning on conversations (format of conversations)
#15
Eichhof
opened
1 year ago
1
[Suggestion]: Code Notes
#14
Librechain
opened
1 year ago
2
How to load codegen models in jaxformer locally??
#13
harry-stark
opened
1 year ago
1
Bump tensorflow from 2.7.2 to 2.9.3 in /preprocess
#12
dependabot[bot]
closed
1 year ago
1
Bump tensorflow-cpu from 2.7.2 to 2.9.3
#11
dependabot[bot]
closed
1 year ago
1
Out-of-memory running with Deepspeed
#10
calix
opened
1 year ago
2
Can we use Jaxformer for Nvidia A100 GPU currently?
#9
leemgs
opened
1 year ago
3
Add <endoftext> for every lines
#8
PoodleWang
opened
1 year ago
5
Cannot find 16B jax checkpoint
#7
PoodleWang
opened
1 year ago
2
350M Mono not found
#6
bycn
opened
1 year ago
3
ISSUE: thread source out
#5
PoodleWang
closed
1 year ago
1
issue: why you comment this line? Do you still need the bos?
#4
PoodleWang
closed
1 year ago
1
6.1B Config Produces 7B Parameters
#3
xanderdunn
opened
1 year ago
0
Bump tensorflow-cpu from 2.7.0 to 2.7.2
#2
dependabot[bot]
closed
1 year ago
0
Bump tensorflow from 2.5.0 to 2.7.2 in /preprocess
#1
dependabot[bot]
closed
1 year ago
0