Status: Closed (anilmartha closed 3 years ago)
Hi @anilmartha - I have updated the Dockerfile and it appears to be working. The commands I used are below. You can also use the Dockerfiles to build your own copy of the image locally - instructions here.
Please let me know if this works so we can close this issue. Thanks!
$ sudo docker pull kiritigowda/ubuntu-18.04:mivisionx-level-5
mivisionx-level-5: Pulling from kiritigowda/ubuntu-18.04
Digest: sha256:50c221e04f61b6e73281effa12a7d18beebfde741b60770bbe58734907d366e3
Status: Image is up to date for kiritigowda/ubuntu-18.04:mivisionx-level-5
docker.io/kiritigowda/ubuntu-18.04:mivisionx-level-5
$ sudo docker run -it -v /home/:/root/hostDrive/ --device=/dev/kfd --device=/dev/dri --cap-add=SYS_RAWIO --device=/dev/mem --group-add video --network host kiritigowda/ubuntu-18.04:mivisionx-level-5
root@ThreadRipperUbuntu:/# cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS"
root@ThreadRipperUbuntu:/#
PR #581 - pushes all the updates required to build and run docker successfully
@anilmartha - the image on Docker Hub is updated and hopefully this issue is fixed. If the issue persists, please reopen it.
Hi @kiritigowda, I tried your commands without sudo:
anilm@xsjfislx32:~$ docker pull kiritigowda/ubuntu-18.04:mivisionx-level-5
mivisionx-level-5: Pulling from kiritigowda/ubuntu-18.04
feac53061382: Pull complete
399b54750ca5: Pull complete
327c626ba280: Pull complete
7cae3695c5fe: Pull complete
d0dbcba153c0: Pull complete
b064a93ff4df: Pull complete
86cf5ef501f2: Pull complete
6542de449694: Pull complete
8f8dce634ab4: Pull complete
a11c7ac1ced9: Pull complete
64f38a08891a: Pull complete
Digest: sha256:cbf8d7ca0a0ed198fb44b6b8a6273fd09a20348246db6e9ceb6a03b28e9e316f
Status: Downloaded newer image for kiritigowda/ubuntu-18.04:mivisionx-level-5
docker.io/kiritigowda/ubuntu-18.04:mivisionx-level-5
anilm@xsjfislx32:~$ docker run -it -v /home/:/root/hostDrive/ --device=/dev/kfd --device=/dev/dri --cap-add=SYS_RAWIO --device=/dev/mem --group-add video --network host kiritigowda/ubuntu-18.04:mivisionx-level-5
docker: Error response from daemon: failed to create shim: OCI runtime create failed: invalid mount {Destination::/root/.Xauthority Type:bind Source:/scratch/docker/volumes/38a1799ee360a589df87ac083f003654c6ec103bf1915719bb59d2fe60ffebf3/_data Options:[rbind]}: mount destination :/root/.Xauthority not absolute: unknown.
ERRO[0000] error waiting for container: context canceled
I am still seeing the same error. Is sudo required?
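A note on the error text: Docker reports the mount destination as `:/root/.Xauthority`, which is not an absolute path because of the leading colon. One plausible cause, which is an assumption and not confirmed in this thread, is a `source:destination` volume spec that was built from an empty variable, leaving nothing before the colon. A minimal sketch of a pre-flight check follows; `check_mount_spec` is a hypothetical helper, not part of MIVisionX or Docker.

```shell
# Hypothetical helper: validate a "source:destination[:mode]" bind-mount spec
# before passing it to `docker run --volume`.
check_mount_spec() {
  local spec="$1" src rest dst
  case "$spec" in
    *:*) ;;
    *) echo "invalid: '$spec' has no source:destination separator" >&2; return 1 ;;
  esac
  src="${spec%%:*}"    # text before the first colon
  rest="${spec#*:}"    # text after the first colon
  dst="${rest%%:*}"    # drop an optional trailing mode such as ":ro"
  if [ -z "$src" ]; then
    echo "invalid: empty source in '$spec' (is the variable set?)" >&2
    return 1
  fi
  case "$dst" in
    /*) return 0 ;;
    *) echo "invalid: destination '$dst' is not an absolute path" >&2; return 1 ;;
  esac
}
```

For example, `check_mount_spec "$XAUTH:/root/.Xauthority"` fails with the "empty source" message when `XAUTH` is unset on the host, before Docker is ever invoked.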
@anilmartha - A couple of things. Can you start the container with
docker run -it --network host kiritigowda/ubuntu-18.04:mivisionx-level-5
then run rocminfo inside it
/opt/rocm/bin/rocminfo
and send me the log?
@kiritigowda
We have an MI100 GPU. I was able to run the rocm/rocm-terminal:4.2 image, but not the kiritigowda/ubuntu-18.04:mivisionx-level-5 image. Btw, I was able to launch the container after building the image from this dockerfile. For some reason, the digest I get for kiritigowda/ubuntu-18.04:mivisionx-level-5 is different from yours. My docker pull output is shown below.
anilm@xsjfislx31:~$ docker pull kiritigowda/ubuntu-18.04:mivisionx-level-5
mivisionx-level-5: Pulling from kiritigowda/ubuntu-18.04
feac53061382: Pull complete
399b54750ca5: Pull complete
327c626ba280: Pull complete
7cae3695c5fe: Pull complete
d0dbcba153c0: Pull complete
b064a93ff4df: Pull complete
86cf5ef501f2: Pull complete
6542de449694: Pull complete
8f8dce634ab4: Pull complete
a11c7ac1ced9: Pull complete
64f38a08891a: Pull complete
Digest: sha256:cbf8d7ca0a0ed198fb44b6b8a6273fd09a20348246db6e9ceb6a03b28e9e316f
Status: Downloaded newer image for kiritigowda/ubuntu-18.04:mivisionx-level-5
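Since a tag can be repointed at new content while a digest identifies one specific image, comparing digests is a quick way to tell whether two machines hold the same image. A minimal sketch, assuming Docker is installed; `digest_matches` and `local_digest` are hypothetical helper names.

```shell
# Hypothetical helper: compare two digests, ignoring an optional "repo@" prefix.
digest_matches() {
  local expected="${1##*@}" actual="${2##*@}"
  [ "$expected" = "$actual" ]
}

# Print the repository digest recorded for a local image (requires Docker).
local_digest() {
  docker image inspect --format '{{index .RepoDigests 0}}' "$1"
}

# Usage (not run here because it needs Docker and the pulled image):
#   digest_matches "sha256:<expected>" \
#     "$(local_digest kiritigowda/ubuntu-18.04:mivisionx-level-5)" \
#     && echo "same image" || echo "different image"
```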
The rocminfo output from the rocm/rocm-terminal:4.2 container is as follows:
anilm@xsjfislx31:~$ docker run -it --rm --device=/dev/kfd --device=/dev/dri --group-add video rocm/rocm-terminal:4.2
To run a command as administrator (user "root"), use "sudo <command>".
See "man sudo_root" for details.
rocm-user@26100bd9de51:~$ rocminfo
ROCk module is loaded
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
==========
HSA Agents
==========
*******
Agent 1
*******
Name: AMD EPYC 7F52 16-Core Processor
Uuid: CPU-XX
Marketing Name: AMD EPYC 7F52 16-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 0
BDFID: 0
Internal Node ID: 0
Compute Unit: 32
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 263801844(0xfb94bf4) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 263801844(0xfb94bf4) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
N/A
*******
Agent 2
*******
Name: AMD EPYC 7F52 16-Core Processor
Uuid: CPU-XX
Marketing Name: AMD EPYC 7F52 16-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 1
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 0
BDFID: 0
Internal Node ID: 1
Compute Unit: 32
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 264230432(0xfbfd620) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 264230432(0xfbfd620) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
N/A
*******
Agent 3
*******
Name: gfx908
Uuid: GPU-XX
Marketing Name: Device 738c
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 4096(0x1000)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 2
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
Chip ID: 29580(0x738c)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 1502
BDFID: 9984
Internal Node ID: 2
Compute Unit: 120
SIMDs per CU: 4
Shader Engines: 8
Shader Arrs. per Eng.: 1
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: FALSE
Wavefront Size: 64(0x40)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 40(0x28)
Max Work-item Per CU: 2560(0xa00)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 33538048(0x1ffc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx908:sramecc+:xnack-
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
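For a quick check that the GPU is visible inside a container, the agent names can be pulled out of a log like the one above instead of reading it in full. A minimal sketch, assuming the `Name:` field layout shown in this log; `list_gpu_agents` is a hypothetical helper name.

```shell
# Hypothetical helper: read rocminfo output on stdin and print the names of
# agents whose Name field starts with "gfx" (the GPU agents in the log above).
list_gpu_agents() {
  awk '$1 == "Name:" && $2 ~ /^gfx/ { print $2 }'
}

# Usage inside the container (not run here because it needs ROCm):
#   /opt/rocm/bin/rocminfo | list_gpu_agents
```

The match on the first field skips `Marketing Name:` lines, and the `^gfx` anchor skips the ISA entry whose name starts with `amdgcn-`. Empty output would suggest that no GPU agent was reported.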
@anilmartha I have deleted the old tags and created new ones. I built all the new level images from the dockerfiles here.
Let me know if this works. Otherwise, you might have to delete all old Docker images and pull again.
kiritigowda/ubuntu-18.04:mivisionx-level-5
DIGEST:sha256:85890a3b0e90351c20c7a9d33cb601eb0c00f56b5d75afa7ffae25c9bbfc6e3a
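If the local copy still carries an older digest, removing just that one image and pulling the tag again is a narrower alternative to deleting every old image. A minimal sketch; `refresh_image` and the `DOCKER` override variable are hypothetical names introduced for illustration.

```shell
# DOCKER may be overridden, for example DOCKER="sudo docker",
# or DOCKER=echo to print the commands without running them.
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: drop the local copy of an image and pull the tag again.
refresh_image() {
  local image="$1"
  # Remove the local copy; carry on if the image is not present.
  $DOCKER rmi "$image" || true
  # Pull the tag again so the local copy reflects what the registry serves now.
  $DOCKER pull "$image"
}
```

Calling `refresh_image kiritigowda/ubuntu-18.04:mivisionx-level-5` with `DOCKER=echo` prints the two commands as a dry run. Note that `docker rmi` refuses to remove an image that a container still uses, so such containers have to be removed first.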
With the latest updates
sudo docker pull kiritigowda/ubuntu-18.04:mivisionx-level-5
Pulling from kiritigowda/ubuntu-18.04
Digest: sha256:85890a3b0e90351c20c7a9d33cb601eb0c00f56b5d75afa7ffae25c9bbfc6e3a
Status: Image is up to date for kiritigowda/ubuntu-18.04:mivisionx-level-5
docker.io/kiritigowda/ubuntu-18.04:mivisionx-level-5
sudo docker run -it --device=/dev/kfd --device=/dev/dri --cap-add=SYS_RAWIO --device=/dev/mem --group-add video --network host --env DISPLAY=unix$DISPLAY --privileged --volume $XAUTH:/root/.Xauthority --volume /tmp/.X11-unix/:/tmp/.X11-unix kiritigowda/ubuntu-18.04:mivisionx-level-5
/opt/rocm/bin/rocminfo
ROCk module is loaded
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
==========
HSA Agents
==========
*******
Agent 1
*******
Name: AMD Ryzen Threadripper 1950X 16-Core Processor
Uuid: CPU-XX
Marketing Name: AMD Ryzen Threadripper 1950X 16-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 3400
BDFID: 0
Internal Node ID: 0
Compute Unit: 32
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 131858024(0x7dbfe68) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 131858024(0x7dbfe68) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
N/A
*******
Agent 2
*******
Name: gfx906
Uuid: GPU-179c714172dc76b5
Marketing Name: Device 66af
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 4096(0x1000)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
Chip ID: 26287(0x66af)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 1801
BDFID: 17152
Internal Node ID: 1
Compute Unit: 60
SIMDs per CU: 4
Shader Engines: 4
Shader Arrs. per Eng.: 1
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: FALSE
Wavefront Size: 64(0x40)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 40(0x28)
Max Work-item Per CU: 2560(0xa00)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 16760832(0xffc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx906:sramecc-:xnack-
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
To remove unused Docker images created more than 1500 hours ago:
sudo docker image prune --all --filter "until=1500h"
@anilmartha - if the issue persists or if you have problems with the Docker image, please reopen this issue.
When launching the kiritigowda/ubuntu-18.04:mivisionx-level-5 Docker container, I see the error below:
docker: Error response from daemon: failed to create shim: OCI runtime create failed: invalid mount {Destination::/root/.Xauthority Type:bind Source:/scratch/docker/volumes/0a199b68f742c6acbedc34c186e5179c1d617704d65e3f2e41f3f8a362d4efb2/_data Options:[rbind]}: mount destination :/root/.Xauthority not absolute: unknown.
ERRO[0000] error waiting for container: context canceled
Command
Docker version
Docker version 20.10.2, build 20.10.2-0ubuntu1~18.04.3