HomebrewNLP
/
Olmax
HomebrewNLP in JAX flavour for maintainable TPU training
BSD 2-Clause "Simplified" License
45
stars
5
forks
issues
feat(model): add alibi conv
#56
ClashLuke
closed
2 years ago
1
Complex Momentum
#55
ClashLuke
closed
1 year ago
2
Alternative Losses
#54
ClashLuke
opened
2 years ago
2
Balance update weights of depthwise vs. pointwise convolution
#53
ClashLuke
closed
2 years ago
2
Hierarchical network
#52
ClashLuke
closed
2 years ago
3
Transfer weights across size
#51
ClashLuke
closed
2 years ago
2
Hierarchical Network
#50
ClashLuke
closed
1 year ago
5
Long-Range-Arena Evaluation
#49
ClashLuke
opened
2 years ago
0
ALiBi Convolution
#48
ClashLuke
closed
2 years ago
1
Rmsprop grafting
#47
ClashLuke
closed
2 years ago
1
Gradient Noise
#46
ClashLuke
opened
2 years ago
0
Retrieval Augmented Causal Generation
#45
ClashLuke
opened
2 years ago
0
Encoder-Decoder Architecture
#44
ClashLuke
opened
2 years ago
0
Initialize deep model from shallow model
#43
ClashLuke
closed
2 years ago
1
Alternative Sampling Methods
#42
ClashLuke
closed
2 years ago
2
Allow broken TPUs + Fix inference
#41
ClashLuke
closed
2 years ago
1
Multi-Host Scaling
#40
ClashLuke
closed
2 years ago
1
Typical Sampling
#39
ClashLuke
closed
2 years ago
1
Reuse ("donate") Buffers
#38
ClashLuke
closed
2 years ago
1
Reuse Parameter-Buffers
#37
ClashLuke
closed
2 years ago
1
Shampoo Refactor
#36
ClashLuke
closed
2 years ago
1
Optimizer Grafting
#35
ClashLuke
closed
2 years ago
2
Shampoo Optimizer
#34
ClashLuke
closed
2 years ago
4
Long-Context Model
#33
ClashLuke
closed
1 year ago
2
Automated Eval-Demo Update
#32
ClashLuke
opened
2 years ago
0
Automated Long-Running Experiments
#31
ClashLuke
opened
2 years ago
0
Automated Integration Tests
#30
ClashLuke
opened
2 years ago
0
Long-Context Experiments
#29
ClashLuke
opened
2 years ago
0
Pretrained Embeddings, Stop at EOS, Untied Embeddings
#28
ClashLuke
closed
2 years ago
1
Checkpoint, Restore and Inference
#27
ClashLuke
closed
2 years ago
1
Reduce Compile-Time
#26
ClashLuke
closed
1 year ago
3
Chunked Cross-Entropy
#25
ClashLuke
closed
2 years ago
0
"Resume" option for tokenizers
#23
ClashLuke
opened
2 years ago
0
Release pretrained weights
#22
ClashLuke
opened
2 years ago
0
Language-Model Evaluation
#21
ClashLuke
opened
2 years ago
0
Frontend
#20
ClashLuke
opened
2 years ago
0
Web API
#19
ClashLuke
closed
2 years ago
1
Inference CLI
#18
ClashLuke
closed
2 years ago
1
Finalize checkpoint/restore
#17
ClashLuke
closed
2 years ago
1
Stabilize MoE
#16
ClashLuke
opened
2 years ago
0
Shampoo Optimizer
#15
ClashLuke
closed
2 years ago
1
Momentum Quantization
#14
ClashLuke
closed
2 years ago
1
Scaling
#13
ClashLuke
closed
2 years ago
0
Non-Autoregressive Generation
#12
ClashLuke
opened
2 years ago
2
Image Classification
#11
ClashLuke
opened
2 years ago
0
Tokenizing Phonetics
#10
ClashLuke
opened
2 years ago
7
Audio Modelling
#9
ClashLuke
opened
2 years ago
0
Explicit Memory
#8
ClashLuke
opened
2 years ago
0
Faster QRNN
#7
ClashLuke
closed
1 year ago
11
MoE + Weight Sharing
#6
ClashLuke
opened
2 years ago
0