High Performance Computing: CUDA and GCP

obriensystems commented 1 year ago

see https://github.com/ObrienlabsDev/machine-learning/issues/10
Use Cases

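In the gpu-burn runs below, tensor cores deliver roughly 3.5x the throughput of CUDA cores on the RTX A4000 (about 35,500 vs. 10,100 Gflop/s) and roughly 2.6x on the RTX 4090 (about 153,000 vs. 58,500 Gflop/s)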

LLM and Generative AI

Implementation

https://github.com/ObrienlabsDev/blog/wiki/CUDA-based-%E2%80%90-High-Performance-Computing-%E2%80%90-LLM-Training-%E2%80%90-Ground-to-GCP-Cloud-Hybrid
Turn off the organization policies compute.vmExternalIpAccess and compute.requireShieldedVm beforehand; see https://github.com/GoogleCloudPlatform/pubsec-declarative-toolkit/issues/426 and https://github.com/GoogleCloudPlatform/pbmm-on-gcp-onboarding/issues/252 for details

See https://github.com/obrienlabs/CUDA-Programs/tree/main/Chapter01/gpusum, an example from the book Programming in Parallel with CUDA by Richard Ansorge (University of Cambridge): https://www.cambridge.org/core/books/programming-in-parallel-with-cuda/C43652A69033C25AD6933368CDBE084C

obriensystems commented 10 months ago

gpu-burn with the -tc option (tensor cores) on dual RTX 4090

michael@13900b MINGW64 /c/wse_github/gpu-burn
$ docker images list
REPOSITORY   TAG       IMAGE ID   CREATED   SIZE
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA GeForce RTX 4090 (UUID: GPU-8c7dc11e-6825-08c1-f05d-5cff6d4ad6db)
GPU 1: NVIDIA GeForce RTX 4090 (UUID: GPU-511a0768-717f-2b3b-0133-b49b7d315929)
Using compare file: compare.ptx
Burning for 120 seconds.
10.8%  proc'd: 1463 (153538 Gflop/s) - 1540 (153827 Gflop/s)   errors: 0 - 0   temps: 58 C - 55 C
        Summary at:   Thu Aug 24 03:49:31 UTC 2023

21.7%  proc'd: 3311 (153176 Gflop/s) - 3388 (153534 Gflop/s)   errors: 0 - 0   temps: 58 C - 53 C
        Summary at:   Thu Aug 24 03:49:44 UTC 2023

32.5%  proc'd: 5159 (141268 Gflop/s) - 5159 (153899 Gflop/s)   errors: 0 - 0   temps: 55 C - 56 C
        Summary at:   Thu Aug 24 03:49:57 UTC 2023

43.3%  proc'd: 6930 (153622 Gflop/s) - 7007 (153653 Gflop/s)   errors: 0 - 0   temps: 59 C - 57 C
        Summary at:   Thu Aug 24 03:50:10 UTC 2023

53.3%  proc'd: 8624 (153068 Gflop/s) - 8701 (153686 Gflop/s)   errors: 0 - 0   temps: 60 C - 58 C
        Summary at:   Thu Aug 24 03:50:22 UTC 2023

64.2%  proc'd: 10472 (153347 Gflop/s) - 10472 (153499 Gflop/s)   errors: 0 - 0   temps: 61 C - 58 C
        Summary at:   Thu Aug 24 03:50:35 UTC 2023

75.0%  proc'd: 12243 (153239 Gflop/s) - 12320 (153441 Gflop/s)   errors: 0 - 0   temps: 61 C - 59 C
        Summary at:   Thu Aug 24 03:50:48 UTC 2023

85.8%  proc'd: 14091 (152801 Gflop/s) - 14091 (153219 Gflop/s)   errors: 0 - 0   temps: 61 C - 59 C
        Summary at:   Thu Aug 24 03:51:01 UTC 2023

96.7%  proc'd: 15862 (153500 Gflop/s) - 15939 (153400 Gflop/s)   errors: 0 - 0   temps: 56 C - 59 C
        Summary at:   Thu Aug 24 03:51:14 UTC 2023

100.0%  proc'd: 16555 (153201 Gflop/s) - 16632 (153399 Gflop/s)   errors: 0 - 0   temps: 61 C - 59 C
Killing processes with SIGTERM (soft kill)
Using compare file: compare.ptx
Burning for 120 seconds.
Initialized device 0 with 24563 MB of memory (22646 MB available, using 20381 MB of it), using FLOATS, using Tensor Cores
Results are 268435456 bytes each, thus performing 77 iterations
Freed memory for dev 0
Uninitted cublas
Using compare file: compare.ptx
Burning for 120 seconds.
Initialized device 1 with 24563 MB of memory (22646 MB available, using 20381 MB of it), using FLOATS, using Tensor Cores
Results are 268435456 bytes each, thus performing 77 iterations
Freed memory for dev 1
Uninitted cublas
done

Tested 2 GPUs:
        GPU 0: OK
        GPU 1: OK
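
The memory and iteration figures in the log above (22646 MB available, 20381 MB used, 268435456-byte results, 77 iterations) are consistent with simple arithmetic. The sketch below reproduces them under assumptions inferred from the log rather than taken from the gpu-burn source: float32 square matrices of side 8192, 90% of available memory used, and two input matrices set aside before result buffers are counted. The function names are hypothetical.

```python
# Assumptions inferred from the logged numbers, not from the gpu-burn source:
# float32 matrices of side 8192, 90% of available memory used,
# two input matrices reserved before counting result buffers.
MATRIX_SIDE = 8192
BYTES_PER_FLOAT = 4
MIB = 1024 * 1024


def result_bytes() -> int:
    """Size in bytes of one result matrix."""
    return MATRIX_SIDE * MATRIX_SIDE * BYTES_PER_FLOAT


def used_mb(available_mb: int, percent: int = 90) -> int:
    """Memory the test claims, as a whole number of MB."""
    return available_mb * percent // 100


def iterations(used_mb_value: int) -> int:
    """Result buffers that fit after reserving two input matrices."""
    used = used_mb_value * MIB
    return (used - 2 * result_bytes()) // result_bytes()


print(result_bytes())      # 268435456
print(used_mb(22646))      # 20381
print(iterations(20381))   # 77
```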

Host: 13900K, 160 GB RAM, two MSI RTX 4090 Suprim Liquid X

michael@13900b MINGW64 /c/wse_github/gpu-burn
$ docker run --rm --gpus all
"docker run" requires at least 1 argument.
See 'docker run --help'.

Usage:  docker run [OPTIONS] IMAGE [COMMAND] [ARG...]

Create and run a new container from an image

michael@13900b MINGW64 /c/wse_github/gpu-burn
$ docker run --rm --gpus all gpu-burn

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA GeForce RTX 4090 (UUID: GPU-8c7dc11e-6825-08c1-f05d-5cff6d4ad6db)
GPU 1: NVIDIA GeForce RTX 4090 (UUID: GPU-511a0768-717f-2b3b-0133-b49b7d315929)
Using compare file: compare.ptx
Burning for 120 seconds.
10.8%  proc'd: 616 (59111 Gflop/s) - 539 (58940 Gflop/s)   errors: 0 - 0   temps: 52 C - 50 C
        Summary at:   Sat Aug 26 01:03:14 UTC 2023

21.7%  proc'd: 1309 (58964 Gflop/s) - 1232 (59059 Gflop/s)   errors: 0 - 0   temps: 54 C - 52 C
        Summary at:   Sat Aug 26 01:03:27 UTC 2023

32.5%  proc'd: 2002 (58968 Gflop/s) - 1925 (58968 Gflop/s)   errors: 0 - 0   temps: 56 C - 54 C
        Summary at:   Sat Aug 26 01:03:40 UTC 2023

43.3%  proc'd: 2618 (58804 Gflop/s) - 2695 (58906 Gflop/s)   errors: 0 - 0   temps: 59 C - 57 C
        Summary at:   Sat Aug 26 01:03:53 UTC 2023

53.3%  proc'd: 3234 (58611 Gflop/s) - 3311 (58749 Gflop/s)   errors: 0 - 0   temps: 60 C - 58 C
        Summary at:   Sat Aug 26 01:04:05 UTC 2023

64.2%  proc'd: 3927 (58764 Gflop/s) - 4004 (58689 Gflop/s)   errors: 0 - 0   temps: 60 C - 59 C
        Summary at:   Sat Aug 26 01:04:18 UTC 2023

75.0%  proc'd: 4620 (58551 Gflop/s) - 4697 (58671 Gflop/s)   errors: 0 - 0   temps: 61 C - 60 C
        Summary at:   Sat Aug 26 01:04:31 UTC 2023

85.8%  proc'd: 5313 (58612 Gflop/s) - 5390 (58632 Gflop/s)   errors: 0 - 0   temps: 62 C - 60 C
        Summary at:   Sat Aug 26 01:04:44 UTC 2023

96.7%  proc'd: 6006 (58576 Gflop/s) - 6083 (58671 Gflop/s)   errors: 0 - 0   temps: 62 C - 60 C
        Summary at:   Sat Aug 26 01:04:57 UTC 2023

100.0%  proc'd: 6314 (58557 Gflop/s) - 6314 (58565 Gflop/s)   errors: 0 - 0   temps: 62 C - 59 C
Killing processes with SIGTERM (soft kill)
Using compare file: compare.ptx
Burning for 120 seconds.
Initialized device 1 with 24563 MB of memory (22646 MB available, using 20381 MB of it), using FLOATS
Results are 268435456 bytes each, thus performing 77 iterations
Freed memory for dev 1
Uninitted cublas
Using compare file: compare.ptx
Burning for 120 seconds.
Initialized device 0 with 24563 MB of memory (22646 MB available, using 20381 MB of it), using FLOATS
Results are 268435456 bytes each, thus performing 77 iterations
Freed memory for dev 0
Uninitted cublas
done

Tested 2 GPUs:
        GPU 0: OK
        GPU 1: OK
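
To compare this FLOATS-only run with the tensor-core run further up, the final progress lines can be parsed and averaged across both GPUs. This is a minimal sketch, assuming the progress-line format shown in the logs; the helper name `gflops_per_gpu` is hypothetical and the two input strings are copied from the transcripts above.

```python
import re
from statistics import mean

# Matches each "(<number> Gflop/s)" group in a gpu-burn progress line.
GFLOPS = re.compile(r"\((\d+) Gflop/s\)")


def gflops_per_gpu(line: str) -> list[int]:
    """Return the reported Gflop/s for each GPU on one progress line."""
    return [int(value) for value in GFLOPS.findall(line)]


# Final progress lines from the two dual RTX 4090 runs.
tensor_line = ("100.0%  proc'd: 16555 (153201 Gflop/s) - 16632 (153399 Gflop/s)"
               "   errors: 0 - 0   temps: 61 C - 59 C")
float_line = ("100.0%  proc'd: 6314 (58557 Gflop/s) - 6314 (58565 Gflop/s)"
              "   errors: 0 - 0   temps: 62 C - 59 C")

speedup = mean(gflops_per_gpu(tensor_line)) / mean(gflops_per_gpu(float_line))
print(f"tensor cores vs. FLOATS only: {speedup:.2f}x")  # 2.62x
```

On this pair of runs the ratio comes out near 2.6x per GPU, lower than the figure observed on the RTX A4000.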

michael@13900b MINGW64 /c/wse_github/gpu-burn
$ vi Dockerfile

michael@13900b MINGW64 /c/wse_github/gpu-burn
$ docker build -t gpu-burn .
[+] Building 1.6s (14/14) FINISHED                                                                                                                       docker:default
 => [internal] load .dockerignore                                                                                                                                  0.0s
 => => transferring context: 2B                                                                                                                                    0.0s
 => [internal] load build definition from Dockerfile                                                                                                               0.0s
 => => transferring dockerfile: 406B                                                                                                                               0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-runtime-ubi8                                                                                         0.3s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-devel-ubi8                                                                                           0.3s
 => [builder 1/4] FROM docker.io/nvidia/cuda:11.8.0-devel-ubi8@sha256:c135690bac108cafde387faf5bd4f3007bd7c2f21db9099b573b472941d8148c                             0.0s
 => [stage-1 1/4] FROM docker.io/nvidia/cuda:11.8.0-runtime-ubi8@sha256:3f05c79f4d11badbb70a26a08bd95b6019be501d670da05977e382b04023b30e                           0.0s
 => [internal] load build context                                                                                                                                  0.0s
 => => transferring context: 2.58kB                                                                                                                                0.0s
 => CACHED [builder 2/4] WORKDIR /build                                                                                                                            0.0s
 => [builder 3/4] COPY . /build/                                                                                                                                   0.0s
 => [builder 4/4] RUN make                                                                                                                                         1.2s
 => CACHED [stage-1 2/4] COPY --from=builder /build/gpu_burn /app/                                                                                                 0.0s
 => CACHED [stage-1 3/4] COPY --from=builder /build/compare.ptx /app/                                                                                              0.0s
 => CACHED [stage-1 4/4] WORKDIR /app                                                                                                                              0.0s
 => exporting to image                                                                                                                                             0.0s
 => => exporting layers                                                                                                                                            0.0s
 => => writing image sha256:9ec0880bd04ada3e541b95a8f71dfc447982a701b1336d53a594c68955263fc8                                                                       0.0s
 => => naming to docker.io/library/gpu-burn                                                                                                                        0.0s
WARNING: buildx: failed to read current commit information with git rev-parse --is-inside-work-tree

What's Next?
  View summary of image vulnerabilities and recommendations → docker scout quickview

michael@13900b MINGW64 /c/wse_github/gpu-burn
$ docker run --rm --gpus all gpu-burn

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA GeForce RTX 4090 (UUID: GPU-8c7dc11e-6825-08c1-f05d-5cff6d4ad6db)
GPU 1: NVIDIA GeForce RTX 4090 (UUID: GPU-511a0768-717f-2b3b-0133-b49b7d315929)
Using compare file: compare.ptx
Burning for 120 seconds.
28.3%  proc'd: 0 (0 Gflop/s) - 37 (1218 Gflop/s)   errors: 0 - 0   temps: 36 C - 45 C
        Summary at:   Sat Aug 26 01:06:32 UTC 2023

41.7%  proc'd: 37 (1202 Gflop/s) - 37 (1218 Gflop/s)   errors: 0 - 0   temps: 35 C - 45 C
        Summary at:   Sat Aug 26 01:06:48 UTC 2023

54.2%  proc'd: 37 (1202 Gflop/s) - 37 (1218 Gflop/s)   errors: 0 - 0   temps: 35 C - 46 C
        Summary at:   Sat Aug 26 01:07:03 UTC 2023

66.7%  proc'd: 74 (1257 Gflop/s) - 74 (1272 Gflop/s)   errors: 0 - 0   temps: 35 C - 47 C
        Summary at:   Sat Aug 26 01:07:18 UTC 2023

79.2%  proc'd: 74 (1257 Gflop/s) - 74 (1272 Gflop/s)   errors: 0 - 0   temps: 35 C - 48 C
        Summary at:   Sat Aug 26 01:07:33 UTC 2023

92.5%  proc'd: 111 (1257 Gflop/s) - 111 (1267 Gflop/s)   errors: 0 - 0   temps: 35 C - 49 C
        Summary at:   Sat Aug 26 01:07:49 UTC 2023

100.0%  proc'd: 111 (1257 Gflop/s) - 111 (1267 Gflop/s)   errors: 0 - 0   temps: 35 C - 49 C
Killing processes with SIGTERM (soft kill)

Killing processes with SIGKILL (force kill)
done

Tested 2 GPUs:
        GPU 0: OK
        GPU 1: OK

michael@13900b MINGW64 /c/wse_github/gpu-burn
$ vi Dockerfile

michael@13900b MINGW64 /c/wse_github/gpu-burn
$ docker build -t gpu-burn .
[+] Building 0.7s (14/14) FINISHED                                                                                                                       docker:default
 => [internal] load .dockerignore                                                                                                                                  0.0s
 => => transferring context: 2B                                                                                                                                    0.0s
 => [internal] load build definition from Dockerfile                                                                                                               0.0s
 => => transferring dockerfile: 407B                                                                                                                               0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-runtime-ubi8                                                                                         0.5s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-devel-ubi8                                                                                           0.5s
 => [builder 1/4] FROM docker.io/nvidia/cuda:11.8.0-devel-ubi8@sha256:c135690bac108cafde387faf5bd4f3007bd7c2f21db9099b573b472941d8148c                             0.0s
 => [stage-1 1/4] FROM docker.io/nvidia/cuda:11.8.0-runtime-ubi8@sha256:3f05c79f4d11badbb70a26a08bd95b6019be501d670da05977e382b04023b30e                           0.0s
 => [internal] load build context                                                                                                                                  0.0s
 => => transferring context: 2.58kB                                                                                                                                0.0s
 => CACHED [builder 2/4] WORKDIR /build                                                                                                                            0.0s
 => CACHED [builder 3/4] COPY . /build/                                                                                                                            0.0s
 => CACHED [builder 4/4] RUN make                                                                                                                                  0.0s
 => CACHED [stage-1 2/4] COPY --from=builder /build/gpu_burn /app/                                                                                                 0.0s
 => CACHED [stage-1 3/4] COPY --from=builder /build/compare.ptx /app/                                                                                              0.0s
 => CACHED [stage-1 4/4] WORKDIR /app                                                                                                                              0.0s
 => exporting to image                                                                                                                                             0.0s
 => => exporting layers                                                                                                                                            0.0s
 => => writing image sha256:97d6add3f01168da86cdbdf116a2c925cd1d5fb360cef154bd2544b47834401f                                                                       0.0s
 => => naming to docker.io/library/gpu-burn                                                                                                                        0.0s
WARNING: buildx: failed to read current commit information with git rev-parse --is-inside-work-tree

What's Next?
  View summary of image vulnerabilities and recommendations → docker scout quickview

michael@13900b MINGW64 /c/wse_github/gpu-burn
$ docker run --rm --gpus all gpu-burn

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA GeForce RTX 4090 (UUID: GPU-8c7dc11e-6825-08c1-f05d-5cff6d4ad6db)
GPU 1: NVIDIA GeForce RTX 4090 (UUID: GPU-511a0768-717f-2b3b-0133-b49b7d315929)
Using compare file: compare.ptx
Burning for 120 seconds.
10.8%  proc'd: 1617 (153802 Gflop/s) - 1617 (153724 Gflop/s)   errors: 0 - 0   temps: 50 C - 61 C
        Summary at:   Sat Aug 26 01:09:39 UTC 2023

21.7%  proc'd: 3465 (153557 Gflop/s) - 3388 (153475 Gflop/s)   errors: 0 - 0   temps: 53 C - 60 C
        Summary at:   Sat Aug 26 01:09:52 UTC 2023

32.5%  proc'd: 5313 (152977 Gflop/s) - 5236 (153847 Gflop/s)   errors: 0 - 0   temps: 55 C - 59 C
        Summary at:   Sat Aug 26 01:10:05 UTC 2023

43.3%  proc'd: 7084 (153625 Gflop/s) - 7007 (153476 Gflop/s)   errors: 0 - 0   temps: 56 C - 58 C
        Summary at:   Sat Aug 26 01:10:18 UTC 2023

53.3%  proc'd: 8701 (153242 Gflop/s) - 8778 (154204 Gflop/s)   errors: 0 - 0   temps: 58 C - 58 C
        Summary at:   Sat Aug 26 01:10:30 UTC 2023

64.2%  proc'd: 10549 (153293 Gflop/s) - 10549 (154332 Gflop/s)   errors: 0 - 0   temps: 55 C - 58 C
        Summary at:   Sat Aug 26 01:10:43 UTC 2023

75.0%  proc'd: 12320 (152903 Gflop/s) - 12397 (154304 Gflop/s)   errors: 0 - 0   temps: 60 C - 58 C
        Summary at:   Sat Aug 26 01:10:56 UTC 2023

85.8%  proc'd: 14168 (153315 Gflop/s) - 14168 (154330 Gflop/s)   errors: 0 - 0   temps: 60 C - 57 C
        Summary at:   Sat Aug 26 01:11:09 UTC 2023

96.7%  proc'd: 16016 (152606 Gflop/s) - 16016 (154242 Gflop/s)   errors: 0 - 0   temps: 60 C - 57 C
        Summary at:   Sat Aug 26 01:11:22 UTC 2023

100.0%  proc'd: 16709 (152998 Gflop/s) - 16709 (153982 Gflop/s)   errors: 0 - 0   temps: 60 C - 57 C
Killing processes with SIGTERM (soft kill)
Using compare file: compare.ptx
Burning for 120 seconds.
Initialized device 1 with 24563 MB of memory (22646 MB available, using 20381 MB of it), using FLOATS, using Tensor Cores
Results are 268435456 bytes each, thus performing 77 iterations
Freed memory for dev 1
Uninitted cublas
Using compare file: compare.ptx
Burning for 120 seconds.
Initialized device 0 with 24563 MB of memory (22646 MB available, using 20381 MB of it), using FLOATS, using Tensor Cores
Results are 268435456 bytes each, thus performing 77 iterations
Freed memory for dev 0
Uninitted cublas
done

Tested 2 GPUs:
        GPU 0: OK
        GPU 1: OK

obriensystems commented 10 months ago

RTX A4000

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker run --rm --gpus all gpu-burn

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA RTX A4000 (UUID: GPU-c585959e-3209-a20e-2522-e9420f268bc8)
Using compare file: compare.ptx
Burning for 120 seconds.
12.5%  proc'd: 100 (11643 Gflop/s)   errors: 0   temps: 47 C
        Summary at:   Wed Aug 23 00:34:50 UTC 2023

25.0%  proc'd: 300 (11103 Gflop/s)   errors: 0   temps: 60 C
        Summary at:   Wed Aug 23 00:35:05 UTC 2023

37.5%  proc'd: 450 (10962 Gflop/s)   errors: 0   temps: 69 C
        Summary at:   Wed Aug 23 00:35:20 UTC 2023

50.0%  proc'd: 550 (10682 Gflop/s)   errors: 0   temps: 75 C
        Summary at:   Wed Aug 23 00:35:35 UTC 2023

62.5%  proc'd: 700 (10404 Gflop/s)   errors: 0   temps: 78 C
        Summary at:   Wed Aug 23 00:35:50 UTC 2023

75.0%  proc'd: 850 (10254 Gflop/s)   errors: 0   temps: 79 C
        Summary at:   Wed Aug 23 00:36:05 UTC 2023

85.8%  proc'd: 1000 (10183 Gflop/s)   errors: 0   temps: 80 C
        Summary at:   Wed Aug 23 00:36:18 UTC 2023

100.0%  proc'd: 1150 (10105 Gflop/s)   errors: 0   temps: 81 C
        Summary at:   Wed Aug 23 00:36:35 UTC 2023

100.0%  proc'd: 1200 (10082 Gflop/s)   errors: 0   temps: 81 C
Killing processes with SIGTERM (soft kill)
Using compare file: compare.ptx
Burning for 120 seconds.
Initialized device 0 with 16375 MB of memory (14897 MB available, using 13407 MB of it), using FLOATS
Results are 268435456 bytes each, thus performing 50 iterations
Freed memory for dev 0
Uninitted cublas
done

Tested 1 GPUs:
        GPU 0: OK

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ vi Dockerfile

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker build -t gpu-burn .
[+] Building 2.0s (14/14) FINISHED                                                                                        docker:default
 => [internal] load build definition from Dockerfile                                                                                0.0s
 => => transferring dockerfile: 413B                                                                                                0.0s
 => [internal] load .dockerignore                                                                                                   0.0s
 => => transferring context: 2B                                                                                                     0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-runtime-ubi8                                                          0.5s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-devel-ubi8                                                            0.6s
 => [builder 1/4] FROM docker.io/nvidia/cuda:11.8.0-devel-ubi8@sha256:c135690bac108cafde387faf5bd4f3007bd7c2f21db9099b573b472941d8  0.0s
 => [stage-1 1/4] FROM docker.io/nvidia/cuda:11.8.0-runtime-ubi8@sha256:3f05c79f4d11badbb70a26a08bd95b6019be501d670da05977e382b040  0.0s
 => [internal] load build context                                                                                                   0.0s
 => => transferring context: 2.59kB                                                                                                 0.0s
 => CACHED [builder 2/4] WORKDIR /build                                                                                             0.0s
 => [builder 3/4] COPY . /build/                                                                                                    0.1s
 => [builder 4/4] RUN make                                                                                                          1.1s
 => CACHED [stage-1 2/4] COPY --from=builder /build/gpu_burn /app/                                                                  0.0s
 => CACHED [stage-1 3/4] COPY --from=builder /build/compare.ptx /app/                                                               0.0s
 => CACHED [stage-1 4/4] WORKDIR /app                                                                                               0.0s
 => exporting to image                                                                                                              0.0s
 => => exporting layers                                                                                                             0.0s
 => => writing image sha256:727a7010d092c01d806b4975d2159754510066eb6da744ad512c9dd92df7fed8                                        0.0s
 => => naming to docker.io/library/gpu-burn                                                                                         0.0s

What's Next?
  View summary of image vulnerabilities and recommendations → docker scout quickview

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker run --rm --gpus all gpu-burn

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA RTX A4000 (UUID: GPU-c585959e-3209-a20e-2522-e9420f268bc8)
Using compare file: compare.ptx
Burning for 120 seconds.
71.7%  proc'd: 24 (306 Gflop/s)   errors: 0   temps: 65 C
        Summary at:   Wed Aug 23 00:48:26 UTC 2023

83.3%  proc'd: 24 (306 Gflop/s)   errors: 0   temps: 66 C
        Summary at:   Wed Aug 23 00:48:40 UTC 2023

95.8%  proc'd: 24 (306 Gflop/s)   errors: 0   temps: 68 C
        Summary at:   Wed Aug 23 00:48:55 UTC 2023

100.0%  proc'd: 24 (306 Gflop/s)   errors: 0   temps: 69 C
Killing processes with SIGTERM (soft kill)

Killing processes with SIGKILL (force kill)
done

Tested 1 GPUs:
        GPU 0: OK

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ vi Dockerfile

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker build -t gpu-burn .
[+] Building 1.5s (14/14) FINISHED                                                                                        docker:default
 => [internal] load .dockerignore                                                                                                   0.0s
 => => transferring context: 2B                                                                                                     0.0s
 => [internal] load build definition from Dockerfile                                                                                0.0s
 => => transferring dockerfile: 448B                                                                                                0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-runtime-ubi8                                                          0.3s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-devel-ubi8                                                            0.3s
 => [builder 1/4] FROM docker.io/nvidia/cuda:11.8.0-devel-ubi8@sha256:c135690bac108cafde387faf5bd4f3007bd7c2f21db9099b573b472941d8  0.0s
 => [stage-1 1/4] FROM docker.io/nvidia/cuda:11.8.0-runtime-ubi8@sha256:3f05c79f4d11badbb70a26a08bd95b6019be501d670da05977e382b040  0.0s
 => [internal] load build context                                                                                                   0.0s
 => => transferring context: 2.62kB                                                                                                 0.0s
 => CACHED [builder 2/4] WORKDIR /build                                                                                             0.0s
 => [builder 3/4] COPY . /build/                                                                                                    0.0s
 => [builder 4/4] RUN make                                                                                                          1.1s
 => CACHED [stage-1 2/4] COPY --from=builder /build/gpu_burn /app/                                                                  0.0s
 => CACHED [stage-1 3/4] COPY --from=builder /build/compare.ptx /app/                                                               0.0s
 => CACHED [stage-1 4/4] WORKDIR /app                                                                                               0.0s
 => exporting to image                                                                                                              0.0s
 => => exporting layers                                                                                                             0.0s
 => => writing image sha256:6b5d365fd9cd38a493e55b2226fbb4b5f9960fa47b1717d95c6bf553d8fc6f74                                        0.0s
 => => naming to docker.io/library/gpu-burn                                                                                         0.0s

What's Next?
  View summary of image vulnerabilities and recommendations → docker scout quickview

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker run --rm --gpus all gpu-burn

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA RTX A4000 (UUID: GPU-c585959e-3209-a20e-2522-e9420f268bc8)
Using compare file: compare.ptx
Burning for 120 seconds.
10.8%  proc'd: 400 (40001 Gflop/s)   errors: 0   temps: 55 C
        Summary at:   Wed Aug 23 00:51:28 UTC 2023

21.7%  proc'd: 850 (38789 Gflop/s)   errors: 0   temps: 66 C
        Summary at:   Wed Aug 23 00:51:41 UTC 2023

32.5%  proc'd: 1300 (38000 Gflop/s)   errors: 0   temps: 71 C
        Summary at:   Wed Aug 23 00:51:54 UTC 2023

43.3%  proc'd: 1750 (36969 Gflop/s)   errors: 0   temps: 77 C
        Summary at:   Wed Aug 23 00:52:07 UTC 2023

53.3%  proc'd: 2150 (36430 Gflop/s)   errors: 0   temps: 78 C
        Summary at:   Wed Aug 23 00:52:19 UTC 2023

64.2%  proc'd: 2600 (36109 Gflop/s)   errors: 0   temps: 80 C
        Summary at:   Wed Aug 23 00:52:32 UTC 2023

75.0%  proc'd: 3000 (35894 Gflop/s)   errors: 0   temps: 80 C
        Summary at:   Wed Aug 23 00:52:45 UTC 2023

86.7%  proc'd: 3450 (35780 Gflop/s)   errors: 0   temps: 81 C
        Summary at:   Wed Aug 23 00:52:59 UTC 2023

97.5%  proc'd: 3900 (35670 Gflop/s)   errors: 0   temps: 81 C
        Summary at:   Wed Aug 23 00:53:12 UTC 2023

100.0%  proc'd: 4050 (35502 Gflop/s)   errors: 0   temps: 81 C
Killing processes with SIGTERM (soft kill)
Using compare file: compare.ptx
Burning for 120 seconds.
Initialized device 0 with 16375 MB of memory (14897 MB available, using 13407 MB of it), using FLOATS, using Tensor Cores
Results are 268435456 bytes each, thus performing 50 iterations
Freed memory for dev 0
Uninitted cublas
done

Tested 1 GPUs:
        GPU 0: OK
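
Two things stand out in the RTX A4000 logs: the tensor-core run ends at roughly 3.5x the FLOATS-only run, and the reported throughput falls as the temperature climbs from 55 C to 81 C. The sketch below quantifies both from lines copied out of the transcripts above. The `parse` helper is hypothetical, and the log alone does not establish that temperature is the cause of the decline.

```python
import re

# Captures the reported Gflop/s and temperature from a single-GPU progress line.
LINE = re.compile(r"proc'd: \d+ \((\d+) Gflop/s\).*temps: (\d+) C")


def parse(line: str) -> tuple[int, int]:
    """Return (gflops, temperature_c) for one gpu-burn progress line."""
    match = LINE.search(line)
    if match is None:
        raise ValueError(f"not a progress line: {line!r}")
    return int(match.group(1)), int(match.group(2))


# First and last progress lines of the tensor-core run on the RTX A4000.
first = "10.8%  proc'd: 400 (40001 Gflop/s)   errors: 0   temps: 55 C"
last = "100.0%  proc'd: 4050 (35502 Gflop/s)   errors: 0   temps: 81 C"
# Last progress line of the earlier FLOATS-only run on the same card.
float_last = "100.0%  proc'd: 1200 (10082 Gflop/s)   errors: 0   temps: 81 C"

start_gflops, start_temp = parse(first)
end_gflops, end_temp = parse(last)
float_gflops, _ = parse(float_last)

drop_pct = 100 * (start_gflops - end_gflops) / start_gflops
speedup = end_gflops / float_gflops

print(f"{start_temp} C -> {end_temp} C: throughput down {drop_pct:.1f}%")  # 11.2%
print(f"tensor cores vs. FLOATS only: {speedup:.2f}x")                      # 3.52x
```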

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ vi Dockerfile

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker build -t gpu-burn .
[+] Building 1.8s (14/14) FINISHED                                                                                        docker:default
 => [internal] load .dockerignore                                                                                                   0.0s
 => => transferring context: 2B                                                                                                     0.0s
 => [internal] load build definition from Dockerfile                                                                                0.0s
 => => transferring dockerfile: 447B                                                                                                0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-runtime-ubi8                                                          0.5s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-devel-ubi8                                                            0.5s
 => [builder 1/4] FROM docker.io/nvidia/cuda:11.8.0-devel-ubi8@sha256:c135690bac108cafde387faf5bd4f3007bd7c2f21db9099b573b472941d8  0.0s
 => [stage-1 1/4] FROM docker.io/nvidia/cuda:11.8.0-runtime-ubi8@sha256:3f05c79f4d11badbb70a26a08bd95b6019be501d670da05977e382b040  0.0s
 => [internal] load build context                                                                                                   0.0s
 => => transferring context: 2.62kB                                                                                                 0.0s
 => CACHED [builder 2/4] WORKDIR /build                                                                                             0.0s
 => [builder 3/4] COPY . /build/                                                                                                    0.0s
 => [builder 4/4] RUN make                                                                                                          1.1s
 => CACHED [stage-1 2/4] COPY --from=builder /build/gpu_burn /app/                                                                  0.0s
 => CACHED [stage-1 3/4] COPY --from=builder /build/compare.ptx /app/                                                               0.0s
 => CACHED [stage-1 4/4] WORKDIR /app                                                                                               0.0s
 => exporting to image                                                                                                              0.0s
 => => exporting layers                                                                                                             0.0s
 => => writing image sha256:c2251b27fe14d1aa8b4da51bcb4051c3c1ffabacf1e626e90579b13adb5833c4                                        0.0s
 => => naming to docker.io/library/gpu-burn                                                                                         0.0s

What's Next?
  View summary of image vulnerabilities and recommendations → docker scout quickview

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker run --rm --gpus all gpu-burn

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA RTX A4000 (UUID: GPU-c585959e-3209-a20e-2522-e9420f268bc8)
Using compare file: compare.ptx
Burning for 120 seconds.
71.7%  proc'd: 24 (305 Gflop/s)   errors: 0   temps: 71 C
        Summary at:   Wed Aug 23 00:56:10 UTC 2023

83.3%  proc'd: 24 (305 Gflop/s)   errors: 0   temps: 72 C
        Summary at:   Wed Aug 23 00:56:24 UTC 2023

95.8%  proc'd: 24 (305 Gflop/s)   errors: 0   temps: 72 C
        Summary at:   Wed Aug 23 00:56:39 UTC 2023

100.0%  proc'd: 24 (305 Gflop/s)   errors: 0   temps: 72 C
Killing processes with SIGTERM (soft kill)

Killing processes with SIGKILL (force kill)
done

Tested 1 GPUs:
        GPU 0: OK

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker run --rm --gpus all gpu-burn

obriensystems commented 10 months ago

Single NVIDIA RTX A4500, PCIe x16, 13900K, 1600 W PSU, 128 GB DDR5


micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker run --rm --gpus all gpu-burn

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA RTX A4500 (UUID: GPU-898f04be-4440-1d50-e748-afe4f4f8abd0)
Using compare file: compare.ptx
Burning for 120 seconds.
81.7%  proc'd: 31 (349 Gflop/s)   errors: 0   temps: 51 C
        Summary at:   Fri Aug 25 01:56:03 UTC 2023

91.7%  proc'd: 31 (349 Gflop/s)   errors: 0   temps: 53 C
        Summary at:   Fri Aug 25 01:56:15 UTC 2023

100.0%  proc'd: 31 (349 Gflop/s)   errors: 0   temps: 54 C
Killing processes with SIGTERM (soft kill)

Killing processes with SIGKILL (force kill)
done

Tested 1 GPUs:
        GPU 0: OK

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ vi Dockerfile

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker build -t gpu-burn .
[+] Building 2.0s (14/14) FINISHED                                                                                                            docker:default
 => [internal] load build definition from Dockerfile                                                                                                    0.0s
 => => transferring dockerfile: 441B                                                                                                                    0.0s
 => [internal] load .dockerignore                                                                                                                       0.0s
 => => transferring context: 2B                                                                                                                         0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-runtime-ubi8                                                                              0.6s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-devel-ubi8                                                                                0.5s
 => [builder 1/4] FROM docker.io/nvidia/cuda:11.8.0-devel-ubi8@sha256:c135690bac108cafde387faf5bd4f3007bd7c2f21db9099b573b472941d8148c                  0.0s
 => [stage-1 1/4] FROM docker.io/nvidia/cuda:11.8.0-runtime-ubi8@sha256:3f05c79f4d11badbb70a26a08bd95b6019be501d670da05977e382b04023b30e                0.0s
 => [internal] load build context                                                                                                                       0.0s
 => => transferring context: 2.62kB                                                                                                                     0.0s
 => CACHED [builder 2/4] WORKDIR /build                                                                                                                 0.0s
 => [builder 3/4] COPY . /build/                                                                                                                        0.0s
 => [builder 4/4] RUN make                                                                                                                              1.2s
 => CACHED [stage-1 2/4] COPY --from=builder /build/gpu_burn /app/                                                                                      0.0s
 => CACHED [stage-1 3/4] COPY --from=builder /build/compare.ptx /app/                                                                                   0.0s
 => CACHED [stage-1 4/4] WORKDIR /app                                                                                                                   0.0s
 => exporting to image                                                                                                                                  0.0s
 => => exporting layers                                                                                                                                 0.0s
 => => writing image sha256:d79438766981c3c94306c18a225e4c873f36f05d7614a2d147016eb9947132f1                                                            0.0s
 => => naming to docker.io/library/gpu-burn                                                                                                             0.0s

What's Next?
  View summary of image vulnerabilities and recommendations → docker scout quickview

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ cat Dockerfile
ARG CUDA_VERSION=11.8.0
ARG IMAGE_DISTRO=ubi8

FROM nvidia/cuda:${CUDA_VERSION}-devel-${IMAGE_DISTRO} AS builder

WORKDIR /build

COPY . /build/

RUN make

FROM nvidia/cuda:${CUDA_VERSION}-runtime-${IMAGE_DISTRO}

COPY --from=builder /build/gpu_burn /app/
COPY --from=builder /build/compare.ptx /app/

WORKDIR /app

CMD ["./gpu_burn", "120"]
#CMD ["./gpu_burn", "-d", "-tc", "120"]

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker run --rm --gpus all gpu-burn

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA RTX A4500 (UUID: GPU-898f04be-4440-1d50-e748-afe4f4f8abd0)
Using compare file: compare.ptx
Burning for 120 seconds.
12.5%  proc'd: 128 (14575 Gflop/s)   errors: 0   temps: 58 C
        Summary at:   Fri Aug 25 01:57:46 UTC 2023

25.0%  proc'd: 320 (14430 Gflop/s)   errors: 0   temps: 63 C
        Summary at:   Fri Aug 25 01:58:01 UTC 2023

37.5%  proc'd: 512 (14276 Gflop/s)   errors: 0   temps: 68 C
        Summary at:   Fri Aug 25 01:58:16 UTC 2023

50.0%  proc'd: 704 (14178 Gflop/s)   errors: 0   temps: 70 C
        Summary at:   Fri Aug 25 01:58:31 UTC 2023

62.5%  proc'd: 896 (14103 Gflop/s)   errors: 0   temps: 72 C
        Summary at:   Fri Aug 25 01:58:46 UTC 2023

75.0%  proc'd: 1088 (14026 Gflop/s)   errors: 0   temps: 73 C
        Summary at:   Fri Aug 25 01:59:01 UTC 2023

87.5%  proc'd: 1280 (14007 Gflop/s)   errors: 0   temps: 73 C
        Summary at:   Fri Aug 25 01:59:16 UTC 2023

100.0%  proc'd: 1472 (13981 Gflop/s)   errors: 0   temps: 74 C
        Summary at:   Fri Aug 25 01:59:31 UTC 2023

100.0%  proc'd: 1536 (13988 Gflop/s)   errors: 0   temps: 74 C
Killing processes with SIGTERM (soft kill)
Using compare file: compare.ptx
Burning for 120 seconds.
Initialized device 0 with 20469 MB of memory (18956 MB available, using 17060 MB of it), using FLOATS
Results are 268435456 bytes each, thus performing 64 iterations
Freed memory for dev 0
Uninitted cublas
done

Tested 1 GPUs:
        GPU 0: OK
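
As a rough sanity check on the reported Gflop/s, the iteration count can be converted to a throughput estimate. This sketch assumes each processed iteration is one square single-precision matrix multiply of dimension 8192 (inferred from the result size, 8192 * 8192 * 4 bytes = 268435456) costing 2 * N^3 floating point operations, and that the run lasted the nominal 120 seconds. `estimated_gflops` is a hypothetical helper, not part of gpu-burn.

```python
# Sketch: estimate Gflop/s from the "proc'd" iteration count.
# Assumptions: one N x N SGEMM per iteration, 2 * N**3 flops each,
# N = 8192 inferred from the 268435456-byte result size.

N = 8192
FLOP_PER_ITERATION = 2 * N**3


def estimated_gflops(iterations: int, seconds: float) -> float:
    """Return the estimated throughput in Gflop/s."""
    return iterations * FLOP_PER_ITERATION / seconds / 1e9


if __name__ == "__main__":
    # RTX A4500 FP32 run above: 1536 iterations in a 120 second burn
    print(f"{estimated_gflops(1536, 120):.0f} Gflop/s estimated, 13988 Gflop/s reported")
```

The estimate lands within about 1% of the reported 13988 Gflop/s, which suggests the assumptions are close.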

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ vi Dockerfile

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ cat Dockerfile
ARG CUDA_VERSION=11.8.0
ARG IMAGE_DISTRO=ubi8

FROM nvidia/cuda:${CUDA_VERSION}-devel-${IMAGE_DISTRO} AS builder

WORKDIR /build

COPY . /build/

RUN make

FROM nvidia/cuda:${CUDA_VERSION}-runtime-${IMAGE_DISTRO}

COPY --from=builder /build/gpu_burn /app/
COPY --from=builder /build/compare.ptx /app/

WORKDIR /app

CMD ["./gpu_burn", "-tc", "120"]
#CMD ["./gpu_burn", "-d", "-tc", "120"]

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker build -t gpu-burn .
[+] Building 0.4s (14/14) FINISHED                                                                                                            docker:default
 => [internal] load build definition from Dockerfile                                                                                                    0.0s
 => => transferring dockerfile: 448B                                                                                                                    0.0s
 => [internal] load .dockerignore                                                                                                                       0.0s
 => => transferring context: 2B                                                                                                                         0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-runtime-ubi8                                                                              0.3s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-devel-ubi8                                                                                0.3s
 => [builder 1/4] FROM docker.io/nvidia/cuda:11.8.0-devel-ubi8@sha256:c135690bac108cafde387faf5bd4f3007bd7c2f21db9099b573b472941d8148c                  0.0s
 => [internal] load build context                                                                                                                       0.0s
 => => transferring context: 2.62kB                                                                                                                     0.0s
 => [stage-1 1/4] FROM docker.io/nvidia/cuda:11.8.0-runtime-ubi8@sha256:3f05c79f4d11badbb70a26a08bd95b6019be501d670da05977e382b04023b30e                0.0s
 => CACHED [builder 2/4] WORKDIR /build                                                                                                                 0.0s
 => CACHED [builder 3/4] COPY . /build/                                                                                                                 0.0s
 => CACHED [builder 4/4] RUN make                                                                                                                       0.0s
 => CACHED [stage-1 2/4] COPY --from=builder /build/gpu_burn /app/                                                                                      0.0s
 => CACHED [stage-1 3/4] COPY --from=builder /build/compare.ptx /app/                                                                                   0.0s
 => CACHED [stage-1 4/4] WORKDIR /app                                                                                                                   0.0s
 => exporting to image                                                                                                                                  0.0s
 => => exporting layers                                                                                                                                 0.0s
 => => writing image sha256:6b5d365fd9cd38a493e55b2226fbb4b5f9960fa47b1717d95c6bf553d8fc6f74                                                            0.0s
 => => naming to docker.io/library/gpu-burn                                                                                                             0.0s

What's Next?
  View summary of image vulnerabilities and recommendations → docker scout quickview

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker run --rm --gpus all gpu-burn

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA RTX A4500 (UUID: GPU-898f04be-4440-1d50-e748-afe4f4f8abd0)
Using compare file: compare.ptx
Burning for 120 seconds.
11.7%  proc'd: 576 (52795 Gflop/s)   errors: 0   temps: 53 C
        Summary at:   Fri Aug 25 02:01:49 UTC 2023

22.5%  proc'd: 1216 (52087 Gflop/s)   errors: 0   temps: 60 C
        Summary at:   Fri Aug 25 02:02:02 UTC 2023

33.3%  proc'd: 1792 (51904 Gflop/s)   errors: 0   temps: 64 C
        Summary at:   Fri Aug 25 02:02:15 UTC 2023

44.2%  proc'd: 2432 (51441 Gflop/s)   errors: 0   temps: 68 C
        Summary at:   Fri Aug 25 02:02:28 UTC 2023

55.8%  proc'd: 3072 (51009 Gflop/s)   errors: 0   temps: 71 C
        Summary at:   Fri Aug 25 02:02:42 UTC 2023

65.8%  proc'd: 3648 (51027 Gflop/s)   errors: 0   temps: 72 C
        Summary at:   Fri Aug 25 02:02:54 UTC 2023

77.5%  proc'd: 4288 (50850 Gflop/s)   errors: 0   temps: 73 C
        Summary at:   Fri Aug 25 02:03:08 UTC 2023

89.2%  proc'd: 4928 (50397 Gflop/s)   errors: 0   temps: 74 C
        Summary at:   Fri Aug 25 02:03:22 UTC 2023

100.0%  proc'd: 5504 (50467 Gflop/s)   errors: 0   temps: 74 C
        Summary at:   Fri Aug 25 02:03:35 UTC 2023

100.0%  proc'd: 5568 (50563 Gflop/s)   errors: 0   temps: 74 C
Killing processes with SIGTERM (soft kill)
Using compare file: compare.ptx
Burning for 120 seconds.
Initialized device 0 with 20469 MB of memory (18956 MB available, using 17060 MB of it), using FLOATS, using Tensor Cores
Results are 268435456 bytes each, thus performing 64 iterations
Freed memory for dev 0
Uninitted cublas
done

Tested 1 GPUs:
        GPU 0: OK

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
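
The two RTX A4500 runs above (FP32 without `-tc`, then with `-tc`) give the tensor core speedup directly. A small sketch using the final Gflop/s values from each log; `speedup` is just an illustrative helper name.

```python
# Sketch: tensor core vs FP32 speedup on the single RTX A4500,
# using the final Gflop/s figures from the two runs above.

FP32_GFLOPS = 13988     # CMD ["./gpu_burn", "120"]
TENSOR_GFLOPS = 50563   # CMD ["./gpu_burn", "-tc", "120"]


def speedup(tensor_gflops: float, fp32_gflops: float) -> float:
    """Return how many times faster the tensor core run was."""
    return tensor_gflops / fp32_gflops


if __name__ == "__main__":
    print(f"RTX A4500 tensor core speedup: {speedup(TENSOR_GFLOPS, FP32_GFLOPS):.2f}x")
```

This comes out at about 3.6x, in line with the 3.5x figure quoted at the top of the issue.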

obriensystems commented 10 months ago

Dual GPU: RTX A4500 (GPU 0) and RTX A4000 (GPU 1)

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker run --rm --gpus all gpu-burn

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA RTX A4500 (UUID: GPU-898f04be-4440-1d50-e748-afe4f4f8abd0)
GPU 1: NVIDIA RTX A4000 (UUID: GPU-c585959e-3209-a20e-2522-e9420f268bc8)
Using compare file: compare.ptx
Burning for 120 seconds.
71.7%  proc'd: 0 (0 Gflop/s) - 24 (306 Gflop/s)   errors: 0 - 0   temps: 56 C - 67 C
        Summary at:   Fri Aug 25 02:24:51 UTC 2023

83.3%  proc'd: 31 (350 Gflop/s) - 24 (306 Gflop/s)   errors: 0 - 0   temps: 57 C - 69 C
        Summary at:   Fri Aug 25 02:25:05 UTC 2023

95.8%  proc'd: 31 (350 Gflop/s) - 24 (306 Gflop/s)   errors: 0 - 0   temps: 59 C - 70 C
        Summary at:   Fri Aug 25 02:25:20 UTC 2023

100.0%  proc'd: 31 (350 Gflop/s) - 24 (306 Gflop/s)   errors: 0 - 0   temps: 60 C - 71 C
Killing processes with SIGTERM (soft kill)

Killing processes with SIGKILL (force kill)
done

Tested 2 GPUs:
        GPU 0: OK
        GPU 1: OK

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ vi Dockerfile

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker build -t gpu-burn .
[+] Building 0.4s (14/14) FINISHED                                                                                            docker:default
 => [internal] load build definition from Dockerfile                                                                                    0.0s
 => => transferring dockerfile: 448B                                                                                                    0.0s
 => [internal] load .dockerignore                                                                                                       0.0s
 => => transferring context: 2B                                                                                                         0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-runtime-ubi8                                                              0.3s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-devel-ubi8                                                                0.3s
 => [builder 1/4] FROM docker.io/nvidia/cuda:11.8.0-devel-ubi8@sha256:c135690bac108cafde387faf5bd4f3007bd7c2f21db9099b573b472941d8148c  0.0s
 => [stage-1 1/4] FROM docker.io/nvidia/cuda:11.8.0-runtime-ubi8@sha256:3f05c79f4d11badbb70a26a08bd95b6019be501d670da05977e382b04023b3  0.0s
 => [internal] load build context                                                                                                       0.0s
 => => transferring context: 2.62kB                                                                                                     0.0s
 => CACHED [builder 2/4] WORKDIR /build                                                                                                 0.0s
 => CACHED [builder 3/4] COPY . /build/                                                                                                 0.0s
 => CACHED [builder 4/4] RUN make                                                                                                       0.0s
 => CACHED [stage-1 2/4] COPY --from=builder /build/gpu_burn /app/                                                                      0.0s
 => CACHED [stage-1 3/4] COPY --from=builder /build/compare.ptx /app/                                                                   0.0s
 => CACHED [stage-1 4/4] WORKDIR /app                                                                                                   0.0s
 => exporting to image                                                                                                                  0.0s
 => => exporting layers                                                                                                                 0.0s
 => => writing image sha256:6b5d365fd9cd38a493e55b2226fbb4b5f9960fa47b1717d95c6bf553d8fc6f74                                            0.0s
 => => naming to docker.io/library/gpu-burn                                                                                             0.0s

What's Next?
  View summary of image vulnerabilities and recommendations → docker scout quickview

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker run --rm --gpus all gpu-burn

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA RTX A4500 (UUID: GPU-898f04be-4440-1d50-e748-afe4f4f8abd0)
GPU 1: NVIDIA RTX A4000 (UUID: GPU-c585959e-3209-a20e-2522-e9420f268bc8)
Using compare file: compare.ptx
Burning for 120 seconds.
10.8%  proc'd: 576 (51886 Gflop/s) - 400 (39447 Gflop/s)   errors: 0 - 0   temps: 64 C - 68 C
        Summary at:   Fri Aug 25 02:26:35 UTC 2023

21.7%  proc'd: 1152 (51415 Gflop/s) - 850 (38560 Gflop/s)   errors: 0 - 0   temps: 69 C - 76 C
        Summary at:   Fri Aug 25 02:26:48 UTC 2023

32.5%  proc'd: 1728 (50876 Gflop/s) - 1350 (37509 Gflop/s)   errors: 0 - 0   temps: 72 C - 79 C
        Summary at:   Fri Aug 25 02:27:01 UTC 2023

43.3%  proc'd: 2368 (50651 Gflop/s) - 1750 (37019 Gflop/s)   errors: 0 - 0   temps: 74 C - 81 C
        Summary at:   Fri Aug 25 02:27:14 UTC 2023

53.3%  proc'd: 2944 (50475 Gflop/s) - 2150 (36643 Gflop/s)   errors: 0 - 0   temps: 75 C - 82 C
        Summary at:   Fri Aug 25 02:27:26 UTC 2023

64.2%  proc'd: 3456 (50342 Gflop/s) - 2600 (36328 Gflop/s)   errors: 0 - 0   temps: 76 C - 83 C
        Summary at:   Fri Aug 25 02:27:39 UTC 2023

75.0%  proc'd: 4096 (50140 Gflop/s) - 3000 (36168 Gflop/s)   errors: 0 - 0   temps: 76 C - 83 C
        Summary at:   Fri Aug 25 02:27:52 UTC 2023

86.7%  proc'd: 4736 (49901 Gflop/s) - 3450 (36048 Gflop/s)   errors: 0 - 0   temps: 77 C - 84 C
        Summary at:   Fri Aug 25 02:28:06 UTC 2023

98.3%  proc'd: 5312 (49996 Gflop/s) - 3950 (35941 Gflop/s)   errors: 0 - 0   temps: 77 C - 84 C
        Summary at:   Fri Aug 25 02:28:20 UTC 2023

100.0%  proc'd: 5504 (49930 Gflop/s) - 4000 (35931 Gflop/s)   errors: 0 - 0   temps: 77 C - 84 C
Killing processes with SIGTERM (soft kill)
Using compare file: compare.ptx
Burning for 120 seconds.
Initialized device 1 with 16375 MB of memory (14897 MB available, using 13407 MB of it), using FLOATS, using Tensor Cores
Results are 268435456 bytes each, thus performing 50 iterations
Freed memory for dev 1
Uninitted cublas
Using compare file: compare.ptx
Burning for 120 seconds.
Initialized device 0 with 20469 MB of memory (18956 MB available, using 17060 MB of it), using FLOATS, using Tensor Cores
Results are 268435456 bytes each, thus performing 64 iterations
Freed memory for dev 0
Uninitted cublas
done

Tested 2 GPUs:
        GPU 0: OK
        GPU 1: OK

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ vi Dockerfile

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker build -t gpu-burn .
[+] Building 0.7s (14/14) FINISHED                                                                                            docker:default
 => [internal] load .dockerignore                                                                                                       0.0s
 => => transferring context: 2B                                                                                                         0.0s
 => [internal] load build definition from Dockerfile                                                                                    0.0s
 => => transferring dockerfile: 441B                                                                                                    0.0s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-runtime-ubi8                                                              0.5s
 => [internal] load metadata for docker.io/nvidia/cuda:11.8.0-devel-ubi8                                                                0.6s
 => [builder 1/4] FROM docker.io/nvidia/cuda:11.8.0-devel-ubi8@sha256:c135690bac108cafde387faf5bd4f3007bd7c2f21db9099b573b472941d8148c  0.0s
 => [stage-1 1/4] FROM docker.io/nvidia/cuda:11.8.0-runtime-ubi8@sha256:3f05c79f4d11badbb70a26a08bd95b6019be501d670da05977e382b04023b3  0.0s
 => [internal] load build context                                                                                                       0.0s
 => => transferring context: 2.62kB                                                                                                     0.0s
 => CACHED [builder 2/4] WORKDIR /build                                                                                                 0.0s
 => CACHED [builder 3/4] COPY . /build/                                                                                                 0.0s
 => CACHED [builder 4/4] RUN make                                                                                                       0.0s
 => CACHED [stage-1 2/4] COPY --from=builder /build/gpu_burn /app/                                                                      0.0s
 => CACHED [stage-1 3/4] COPY --from=builder /build/compare.ptx /app/                                                                   0.0s
 => CACHED [stage-1 4/4] WORKDIR /app                                                                                                   0.0s
 => exporting to image                                                                                                                  0.0s
 => => exporting layers                                                                                                                 0.0s
 => => writing image sha256:d79438766981c3c94306c18a225e4c873f36f05d7614a2d147016eb9947132f1                                            0.0s
 => => naming to docker.io/library/gpu-burn                                                                                             0.0s

What's Next?
  View summary of image vulnerabilities and recommendations → docker scout quickview

micha@13900a MINGW64 /c/wse_github/gpu-burn (master)
$ docker run --rm --gpus all gpu-burn

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA RTX A4500 (UUID: GPU-898f04be-4440-1d50-e748-afe4f4f8abd0)
GPU 1: NVIDIA RTX A4000 (UUID: GPU-c585959e-3209-a20e-2522-e9420f268bc8)
Using compare file: compare.ptx
Burning for 120 seconds.
12.5%  proc'd: 128 (14803 Gflop/s) - 100 (11846 Gflop/s)   errors: 0 - 0   temps: 44 C - 46 C
        Summary at:   Fri Aug 25 03:09:09 UTC 2023

24.2%  proc'd: 320 (14745 Gflop/s) - 300 (11604 Gflop/s)   errors: 0 - 0   temps: 53 C - 60 C
        Summary at:   Fri Aug 25 03:09:23 UTC 2023

36.7%  proc'd: 512 (14578 Gflop/s) - 450 (11177 Gflop/s)   errors: 0 - 0   temps: 60 C - 69 C
        Summary at:   Fri Aug 25 03:09:38 UTC 2023

49.2%  proc'd: 704 (14425 Gflop/s) - 600 (10972 Gflop/s)   errors: 0 - 0   temps: 65 C - 75 C
        Summary at:   Fri Aug 25 03:09:53 UTC 2023

61.7%  proc'd: 960 (14229 Gflop/s) - 700 (10754 Gflop/s)   errors: 0 - 0   temps: 69 C - 78 C
        Summary at:   Fri Aug 25 03:10:08 UTC 2023

74.2%  proc'd: 1152 (14075 Gflop/s) - 850 (10549 Gflop/s)   errors: 0 - 0   temps: 72 C - 80 C
        Summary at:   Fri Aug 25 03:10:23 UTC 2023

86.7%  proc'd: 1344 (14003 Gflop/s) - 1000 (10429 Gflop/s)   errors: 0 - 0   temps: 74 C - 81 C
        Summary at:   Fri Aug 25 03:10:38 UTC 2023

99.2%  proc'd: 1536 (13938 Gflop/s) - 1150 (10339 Gflop/s)   errors: 0 - 0   temps: 75 C - 82 C
        Summary at:   Fri Aug 25 03:10:53 UTC 2023

100.0%  proc'd: 1536 (13938 Gflop/s) - 1200 (10314 Gflop/s)   errors: 0 - 0   temps: 75 C - 83 C
Killing processes with SIGTERM (soft kill)
Using compare file: compare.ptx
Burning for 120 seconds.
Initialized device 0 with 20469 MB of memory (18956 MB available, using 17060 MB of it), using FLOATS
Results are 268435456 bytes each, thus performing 64 iterations
Freed memory for dev 0
Uninitted cublas
Using compare file: compare.ptx
Burning for 120 seconds.
Initialized device 1 with 16375 MB of memory (14897 MB available, using 13407 MB of it), using FLOATS
Results are 268435456 bytes each, thus performing 50 iterations
Freed memory for dev 1
Uninitted cublas
done

Tested 2 GPUs:
        GPU 0: OK
        GPU 1: OK
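
To compare runs without reading the logs by eye, the progress lines can be parsed. This is a minimal sketch that assumes the line layout shown in the output above (one "iterations (N Gflop/s)" group per GPU, separated by " - "); `parse_progress` is a hypothetical helper, not part of gpu-burn.

```python
import re

# Matches a gpu-burn progress line as it appears in the logs above.
LINE_RE = re.compile(
    r"^\s*(?P<pct>\d+(?:\.\d+)?)%\s+proc'd:\s*(?P<procd>.*?)"
    r"\s+errors:\s*(?P<errors>.*?)\s+temps:\s*(?P<temps>.*)$"
)
# One "iterations (N Gflop/s)" group per GPU.
GPU_RE = re.compile(r"(\d+) \((\d+) Gflop/s\)")


def parse_progress(line: str):
    """Parse one progress line; return None for any other line."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    groups = GPU_RE.findall(m["procd"])
    return {
        "percent": float(m["pct"]),
        "iterations": [int(i) for i, _ in groups],
        "gflops": [int(g) for _, g in groups],
        "errors": [int(e) for e in re.findall(r"\d+", m["errors"])],
        "temps_c": [int(t) for t in re.findall(r"(\d+) C", m["temps"])],
    }


if __name__ == "__main__":
    # Final lines of the dual-GPU tensor core and FP32 runs above.
    tensor = parse_progress(
        "100.0%  proc'd: 5504 (49930 Gflop/s) - 4000 (35931 Gflop/s)"
        "   errors: 0 - 0   temps: 77 C - 84 C"
    )
    fp32 = parse_progress(
        "100.0%  proc'd: 1536 (13938 Gflop/s) - 1200 (10314 Gflop/s)"
        "   errors: 0 - 0   temps: 75 C - 83 C"
    )
    for gpu, (t, f) in enumerate(zip(tensor["gflops"], fp32["gflops"])):
        print(f"GPU {gpu}: tensor/FP32 = {t / f:.2f}x")
```

Lines such as "Summary at: ..." return `None`, so a whole log can be fed through line by line and filtered.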

obriensystems commented 8 months ago

Add dual RTX-4500

Add single RTX 3500 Ada on Lenovo P1 Gen 6


micha@p1gen6 MINGW64 /c/wse_github/gpu-burn (master)
$ nvidia-smi
Mon Oct 30 19:48:43 2023
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.84                 Driver Version: 545.84       CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA RTX 3500 Ada Gene...  WDDM  | 00000000:01:00.0 Off |                  Off |
| N/A   51C    P8               7W / 102W |    122MiB / 12282MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+

micha@p1gen6 MINGW64 /c/wse_github/gpu-burn (master)
$ docker run --rm --gpus all gpu_burn

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA RTX 3500 Ada Generation Laptop GPU (UUID: GPU-25326b0f-ad93-c319-7027-b0029d4aee8e)
Using compare file: compare.ptx
Burning for 60 seconds.
13.3%  proc'd: 74 (11539 Gflop/s)   errors: 0   temps: 77 C
        Summary at:   Mon Oct 30 23:44:52 UTC 2023

25.0%  proc'd: 111 (11484 Gflop/s)   errors: 0   temps: 80 C
        Summary at:   Mon Oct 30 23:44:59 UTC 2023

36.7%  proc'd: 222 (11651 Gflop/s)   errors: 0   temps: 81 C
        Summary at:   Mon Oct 30 23:45:06 UTC 2023

48.3%  proc'd: 296 (11519 Gflop/s)   errors: 0   temps: 81 C
        Summary at:   Mon Oct 30 23:45:13 UTC 2023

61.7%  proc'd: 370 (11379 Gflop/s)   errors: 0   temps: 83 C
        Summary at:   Mon Oct 30 23:45:21 UTC 2023

75.0%  proc'd: 444 (12958 Gflop/s)   errors: 0   temps: 85 C
        Summary at:   Mon Oct 30 23:45:29 UTC 2023

86.7%  proc'd: 555 (12996 Gflop/s)   errors: 0   temps: 85 C
        Summary at:   Mon Oct 30 23:45:36 UTC 2023

100.0%  proc'd: 629 (13163 Gflop/s)   errors: 0   temps: 85 C
        Summary at:   Mon Oct 30 23:45:44 UTC 2023

100.0%  proc'd: 666 (13124 Gflop/s)   errors: 0   temps: 85 C
Killing processes with SIGTERM (soft kill)
Using compare file: compare.ptx
Burning for 60 seconds.
Initialized device 0 with 12281 MB of memory (11119 MB available, using 10007 MB of it), using FLOATS
Results are 268435456 bytes each, thus performing 37 iterations
Freed memory for dev 0
Uninitted cublas
done

Tested 1 GPUs:
        GPU 0: OK
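
The "Initialized device" lines in these logs follow a consistent sizing pattern: the tool uses about 90% of the available memory, and each result buffer is 268435456 bytes, which equals 8192 x 8192 four-byte floats. The following is a minimal sketch of that arithmetic. It assumes that "MB" in the log means MiB and that two input matrices of the same size are reserved before result buffers are counted. Those assumptions, and the names `usable_mb` and `iterations`, are illustrative and not taken from the gpu-burn source, although the formula does reproduce the 37, 64 and 50 iterations reported in the logs above.

```python
# Sketch of the memory sizing implied by the gpu-burn logs above.
# Assumptions (not confirmed by the gpu-burn source): 90% of available
# memory is used, "MB" means MiB, and two input matrices are reserved
# before the result buffers are counted.

MATRIX_DIM = 8192        # assumed: 8192 * 8192 * 4 bytes = 268435456 bytes
BYTES_PER_FLOAT = 4
RESULT_BYTES = MATRIX_DIM * MATRIX_DIM * BYTES_PER_FLOAT
MIB = 1024 * 1024


def usable_mb(available_mb: int) -> int:
    """Return 90% of the available memory, truncated to whole MiB."""
    return available_mb * 9 // 10


def iterations(used_mb: int) -> int:
    """Return how many result buffers fit after reserving two input matrices."""
    return (used_mb * MIB - 2 * RESULT_BYTES) // RESULT_BYTES


# (available MB, label) pairs copied from the logs above.
for available, label in [(11119, "RTX 3500 Ada"), (18956, "dev 0"), (14897, "dev 1")]:
    used = usable_mb(available)
    print(f"{label}: using {used} MB -> {iterations(used)} iterations")
```

Integer arithmetic is used throughout so the truncation matches the whole-number values printed in the logs.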

Running with -d (double precision): no progress lines were printed, and the processes had to be force-killed.

micha@p1gen6 MINGW64 /c/wse_github/gpu-burn (master)
$ docker run --rm --gpus all gpu_burn

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

GPU 0: NVIDIA RTX 3500 Ada Generation Laptop GPU (UUID: GPU-25326b0f-ad93-c319-7027-b0029d4aee8e)
Using compare file: compare.ptx
Burning for 60 seconds.

Killing processes with SIGTERM (soft kill)

Killing processes with SIGKILL (force kill)
done

Running with -tc (Tensor Cores) for 120 seconds:

GPU 0: NVIDIA RTX 3500 Ada Generation Laptop GPU (UUID: GPU-25326b0f-ad93-c319-7027-b0029d4aee8e)
Using compare file: compare.ptx
Burning for 120 seconds.
10.8%  proc'd: 444 (39682 Gflop/s)   errors: 0   temps: 74 C
        Summary at:   Tue Oct 31 00:03:17 UTC 2023

21.7%  proc'd: 888 (39301 Gflop/s)   errors: 0   temps: 78 C
        Summary at:   Tue Oct 31 00:03:30 UTC 2023

32.5%  proc'd: 1369 (38544 Gflop/s)   errors: 0   temps: 81 C
        Summary at:   Tue Oct 31 00:03:43 UTC 2023

43.3%  proc'd: 1813 (38330 Gflop/s)   errors: 0   temps: 83 C
        Summary at:   Tue Oct 31 00:03:56 UTC 2023

53.3%  proc'd: 2257 (40833 Gflop/s)   errors: 0   temps: 82 C
        Summary at:   Tue Oct 31 00:04:08 UTC 2023

64.2%  proc'd: 2738 (45627 Gflop/s)   errors: 0   temps: 83 C
        Summary at:   Tue Oct 31 00:04:21 UTC 2023

75.0%  proc'd: 3256 (44635 Gflop/s)   errors: 0   temps: 82 C
        Summary at:   Tue Oct 31 00:04:34 UTC 2023

85.8%  proc'd: 3811 (45740 Gflop/s)   errors: 0   temps: 82 C
        Summary at:   Tue Oct 31 00:04:47 UTC 2023

96.7%  proc'd: 4329 (46204 Gflop/s)   errors: 0   temps: 83 C
        Summary at:   Tue Oct 31 00:05:00 UTC 2023

100.0%  proc'd: 4551 (45895 Gflop/s)   errors: 0   temps: 82 C
Killing processes with SIGTERM (soft kill)
Using compare file: compare.ptx
Burning for 120 seconds.
Initialized device 0 with 12281 MB of memory (11119 MB available, using 10007 MB of it), using FLOATS, using Tensor Cores
Results are 268435456 bytes each, thus performing 37 iterations
Freed memory for dev 0
Uninitted cublas
done

Tested 1 GPUs:
        GPU 0: OK
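
The two RTX 3500 Ada runs above can be compared directly from their final progress lines. The sketch below pulls the Gflop/s figure out of each line and computes the ratio. The helper name `gflops_per_gpu` is hypothetical, and the regular expression only assumes the `(NNNNN Gflop/s)` format visible in these logs. Note that the two runs differ in duration (60 versus 120 seconds) and in temperature, so this is a single-sample comparison rather than a benchmark.

```python
import re
from typing import List

# Matches one per-GPU throughput figure, e.g. "(13124 Gflop/s)".
GFLOPS_RE = re.compile(r"\((\d+) Gflop/s\)")


def gflops_per_gpu(progress_line: str) -> List[int]:
    """Return the Gflop/s figure for each GPU on a gpu-burn progress line."""
    return [int(value) for value in GFLOPS_RE.findall(progress_line)]


# Final progress lines copied from the two single-GPU runs above.
fp32_line = "100.0%  proc'd: 666 (13124 Gflop/s)   errors: 0   temps: 85 C"
tensor_line = "100.0%  proc'd: 4551 (45895 Gflop/s)   errors: 0   temps: 82 C"

fp32 = gflops_per_gpu(fp32_line)[0]
tensor = gflops_per_gpu(tensor_line)[0]
speedup = tensor / fp32

print(f"{tensor} / {fp32} = {speedup:.2f}x")
```

On these two samples the ratio comes out at about 3.5, in line with the 3.5x figure quoted at the top of this issue. The same helper returns one value per GPU for multi-GPU progress lines, which separate the per-GPU figures with ` - `.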

Hardware issues to fix on the P1 Gen 6: see https://forums.lenovo.com/t5/ThinkPad-P-and-W-Series-Mobile-Workstations/P1-Gen6-Bricked-after-BSOD-second-laptop-with-the-same-problem/m-p/5254145

High Performance Computing: CUDA and GCP #1

Use Cases

LLM and Generative AI

Collatz

Implementation