mlcommons/training
Reference implementations of MLPerf™ training benchmarks
https://mlcommons.org/en/groups/training
Apache License 2.0
1.62k stars
560 forks
issues
Llama2 - LoRA Reference Implementation
#727
rgandikota
closed
4 months ago
3
llama2: fixing DS yaml by adding gradient clipping: 0.3, and small update to …
#726
itayhubara
closed
7 months ago
1
switch to samples_count in logging of llama2_70b_lora
#725
itayhubara
closed
7 months ago
1
MLPerf library version for 4.0 Submission
#724
rgandikota
closed
6 months ago
0
Gradient clipping not working for llama2_70b_lora benchmark
#723
michal2409
closed
4 months ago
1
Alternative method for downloading Llama2 70b
#722
tianmu-li
closed
3 months ago
3
[Stable Diffusion] VAE Moments to image outputs whited out image.
#721
entrpn
closed
3 months ago
2
where is the definition of mlperf_logging
#720
liuxiaoxiao1121
closed
8 months ago
1
OCI runtime create failed
#719
gorleramyasri
closed
3 months ago
1
unable to find image 'mlperf/object_detection'
#718
gorleramyasri
closed
8 months ago
0
Bump gradio from 3.11 to 4.19.2 in /stable_diffusion
#717
dependabot[bot]
opened
9 months ago
1
Add v4.0 suite on Readme
#716
nv-rborkar
closed
7 months ago
1
Bump urllib3 from 1.22 to 1.26.18 in /retired_benchmarks/transformer/tensorflow
#715
dependabot[bot]
opened
9 months ago
1
Bump ip from 1.1.5 to 1.1.9 in /retired_benchmarks/minigo/tensorflow/minigo/oneoffs/joseki
#714
dependabot[bot]
opened
9 months ago
1
Bump axios from 0.19.0 to 0.28.0 in /retired_benchmarks/minigo/tensorflow/minigo/oneoffs/joseki
#713
dependabot[bot]
opened
9 months ago
1
Change dataset download scripts to use Cloudflare buckets directly
#712
morphine00
closed
8 months ago
5
Vulnerability patch: remove joseki from minigo legacy benchmark
#711
pgmpablo157321
closed
9 months ago
1
Potential private information leak in retired benchmark
#710
pgmpablo157321
closed
9 months ago
0
Bump follow-redirects and axios in /retired_benchmarks/minigo/tensorflow/minigo/oneoffs/joseki
#709
dependabot[bot]
opened
9 months ago
1
Bump axios from 0.19.0 to 1.6.0 in /retired_benchmarks/minigo/tensorflow/minigo/oneoffs/joseki
#708
dependabot[bot]
closed
9 months ago
2
Bump gradio from 3.11 to 4.11.0 in /stable_diffusion
#707
dependabot[bot]
closed
9 months ago
2
Bump pillow from 5.2.0 to 10.2.0 in /retired_benchmarks/ssd-v1
#706
dependabot[bot]
opened
9 months ago
1
Bump werkzeug from 0.14.1 to 2.3.8 in /retired_benchmarks/transformer/tensorflow
#705
dependabot[bot]
opened
9 months ago
1
Bump fsevents from 1.2.9 to 1.2.13 in /retired_benchmarks/minigo/tensorflow/minigo/oneoffs/joseki
#704
dependabot[bot]
opened
9 months ago
1
Bump transformers from 4.19.2 to 4.36.0 in /stable_diffusion
#703
dependabot[bot]
opened
9 months ago
1
[SD] v4.0 cleanup and bug fixes
#702
ahmadki
closed
8 months ago
1
Update S3 download instructions
#701
nathanw-mlc
closed
8 months ago
1
[GNN] Reference implementation for GNN node classification
#700
LiSu
closed
8 months ago
8
[SD] log number of samples instead of number of iterations
#699
ahmadki
closed
8 months ago
3
adding initial code drop for llm finetune
#698
itayhubara
closed
8 months ago
2
[Unet3d] - Add infinite data loader to align epochs->samples transition
#697
mmarcinkiewicz
opened
10 months ago
1
MLCube implementation for Stable Diffusion
#696
davidjurado
opened
10 months ago
1
Add MLCube implementation for 3D Unet
#695
davidjurado
opened
11 months ago
1
run stable diffusion see no space left on device error
#694
gaowayne
closed
3 months ago
3
Unable to download tar file in the mlcommons-training-wg-s3 S3 Bucket
#693
ajscalers
closed
3 months ago
2
updated with code to use our instrumentation (some more README update…
#692
rajveerb
closed
11 months ago
1
error run the rnn speech workload, failed to process data after enter docker
#691
gaowayne
closed
3 months ago
5
failed to build object_detection container with below error on FedoraOS37
#690
gaowayne
closed
3 months ago
4
docker run error for image_segmentation/pytorch test following the guide
#689
gaowayne
closed
3 months ago
3
Command line options in bert training
#688
mahmoodn
closed
3 months ago
1
Stable diffusion training test failed at module 'cv2.dnn' has no attribute 'DictValue'
#687
billcsm
closed
8 months ago
2
MLCube implementation for ResNet
#686
davidjurado
opened
1 year ago
1
[SD] unified val file names
#685
ahmadki
closed
8 months ago
1
[UNET3D] Replace epochs with samples
#684
mmarcinkiewicz
opened
1 year ago
1
Add quick SSD demo
#683
davidjurado
closed
8 months ago
2
[SD] fixed number of training samples
#682
ahmadki
closed
8 months ago
1
[SD] a small indentation fix
#681
ahmadki
closed
1 year ago
1
Switch dataset locations from Google Drive to MLCommons Cloud
#680
nathanw-mlc
closed
9 months ago
9
How to run dlrm module with criteo_kaggle dataset?
#679
esharkwang
closed
3 months ago
2
Bump certifi from 2018.4.16 to 2023.7.22 in /retired_benchmarks/transformer/tensorflow
#678
dependabot[bot]
opened
1 year ago
1