mosaicml/llm-foundry
LLM training code for Databricks foundation models
https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm
Apache License 2.0
3.84k stars, 503 forks
Make the exceptions serializable
#1239
dakinggg
closed
1 month ago
0
Add retries to downloads in convert_text_to_mds.py
#1238
irenedea
closed
1 month ago
1
[WIP] use_remote_uploader_v2
#1237
bigning
opened
1 month ago
0
Configurable submesh
#1236
dakinggg
closed
1 month ago
0
Fix tuple typing
#1235
dakinggg
closed
1 month ago
0
Move MLFlow dataset outside of log_config
#1234
KuuCi
closed
1 month ago
1
Fix Mosaic Logger custom exception serialization
#1233
milocress
closed
1 month ago
0
Fixing the state.timestamp.batch.value issue in loss v len callback
#1232
ShashankMosaicML
closed
1 month ago
0
LLaMA PRO training resume problem
#1231
germanjke
opened
1 month ago
6
Fix attr error for attention_classes when using act ckpt
#1230
cli99
closed
1 month ago
0
Modularize backbone class and block creation
#1229
dakinggg
closed
1 month ago
0
Quick patch to check that Dataset Keys contain non-None Values
#1228
KuuCi
closed
1 month ago
1
Make config/class properties on ComposerMPTForCausalLM
#1227
dakinggg
closed
1 month ago
0
Loss v len callback
#1226
ShashankMosaicML
closed
1 month ago
0
Add user error superclass
#1225
milocress
closed
1 month ago
2
Modularize components of megablocks layer builder
#1224
dakinggg
closed
1 month ago
3
Decompression tokens
#1223
milocress
closed
1 month ago
0
add error when chat template fails
#1222
milocress
closed
1 month ago
0
Finetuning does not work on nightly
#1221
eldarkurtic
closed
1 month ago
2
Conversion Sharded -> Monolithic checkpoint
#1220
pretidav
opened
1 month ago
1
Update readme to clarify flash-attn and TE installs
#1219
snarayan21
closed
1 month ago
0
Add example eval scripts for dbrx PT sizes
#1218
aspfohl
closed
1 month ago
2
Add te for torch 2.4.0
#1217
j316chuck
closed
1 month ago
0
Fix dmoe tests GPU OOM
#1216
snarayan21
closed
1 month ago
0
Update Dockerfile
#1215
j316chuck
closed
1 month ago
0
Dbfs HF
#1214
KuuCi
closed
3 weeks ago
3
Removed debugging code in tests
#1213
dakinggg
closed
1 month ago
0
Adding HF source Path in for DBFS
#1212
KuuCi
closed
1 month ago
0
Using self.shift_labels instead of self.model.transformer.shift_label in the loss function.
#1211
ShashankMosaicML
closed
1 month ago
0
Added torch_dmoe defaults, bug fixes for 2D inputs
#1210
snarayan21
closed
1 month ago
1
Add fc to HF export
#1209
dakinggg
closed
1 month ago
0
Update setup.py
#1208
j316chuck
closed
1 month ago
0
Chuck/gpu build te win
#1207
j316chuck
closed
1 month ago
1
Build Te
#1206
j316chuck
closed
1 month ago
0
Mvpatel2000/te image stable
#1205
mvpatel2000
closed
1 month ago
0
TransformerEngine Image Build
#1204
mvpatel2000
closed
1 month ago
0
[don't merge lol] test pyhook
#1203
milocress
closed
1 month ago
0
Clearer error message for unknown example type
#1202
milocress
closed
1 month ago
0
Make `fc_type` a dict to pass fc kwargs through
#1201
snarayan21
closed
1 month ago
1
commit change
#1200
j316chuck
closed
1 month ago
0
Allow EOS token for finetuning
#1199
jimwu6
closed
1 month ago
3
Removing rich install
#1198
jjanezhang
closed
1 month ago
0
MoE with FSDP
#1197
Muennighoff
closed
1 month ago
1
Pass FC type along for all FFN types
#1196
dakinggg
closed
1 month ago
0
Streaming version bump to 0.7.6
#1195
snarayan21
closed
1 month ago
0
Log exception on inactivity callback
#1194
jjanezhang
closed
1 month ago
0
fix eval
#1193
milocress
closed
1 month ago
0
Add te
#1192
j316chuck
closed
1 month ago
0
test te once more
#1191
j316chuck
closed
1 month ago
0
Remove to_container
#1190
dakinggg
closed
1 month ago
0