NVIDIA / nvidia-docker

Build and run Docker containers leveraging NVIDIA GPUs
Apache License 2.0

ARM64 Support #214

Closed. scosaje closed this issue 4 years ago.

scosaje commented 8 years ago

Is there a version of nvidia-docker for the ARM platform, specifically for the NVIDIA Jetson TK1 and TX1?

3XX0 commented 8 years ago

Not right now, but we will probably look into it for nvidia-docker 2.0.

Gotrek77 commented 8 years ago

I made the same request to some NVIDIA engineers during the last GTC 2016 in Amsterdam. I hope this feature comes as soon as possible; it would be very useful to run containers on the TX1 SoM.

shiquanwang commented 8 years ago

Have you succeeded in running Docker on the TK1/TX1?


scosaje commented 8 years ago

Thanks, guys. Would it be possible to use Linux "cgroups on Mesos" on ARM as an alternative to the nvidia-docker container while we await an ARM implementation of nvidia-docker? Where cgroups will not suffice, I am willing to commit some funds for a focused job on nvidia-docker for ARM (TK1/TX1), which can then be donated to the community.

If anyone can help and is interested, please send mail directly to sunny.osaje@gmail.com.

Thanks.


emanuelstanciu commented 8 years ago

@3XX0 Hi Jonathan, when do you think we will see ARM support? Also, would this then be able to run the TensorFlow Docker images, or do we need to rebuild them? Thanks

3XX0 commented 8 years ago

I can't really say; we have other priorities right now, and ARM support is somewhat tricky. I started looking into it, but our driver is dramatically different on ARM and we would need extensive changes in nvidia-docker. We also need to tweak the L4T kernel to make it container friendly. I will update this issue if we make any progress.

Gotrek77 commented 7 years ago

Hello @3XX0, is there any news on nvidia-docker support for the Jetson TX1?

Thanks

vmayoral commented 7 years ago

Also interested in this topic.

@3XX0, can you describe in more detail what changes would be needed to get 64-bit ARM support (targeting the TX1)? This seems to be a rather popular topic. My group has some bandwidth and wouldn't mind looking into this.

Thanks,

3XX0 commented 7 years ago

It's pretty complicated and it's going to take a lot of time.

First, we won't support ARM until the new 2.0 runtime becomes stable and works flawlessly on x86 (hopefully we will publish it soon). Secondly, we need to work with the L4T kernel team to support containers there. And lastly, we need to reimplement all the driver logic to work with the Tegra iGPU/dGPU stack, bearing in mind that it needs to support all of our architectures (Drive PXs will probably be first) and that our Tegra drivers are drastically changing.

Gotrek77 commented 7 years ago

Hi @3XX0, what can we do to raise the priority of this feature? It's very important for us to have nvidia-docker on the TX1.

gofrane commented 7 years ago

Hello guys, is there any news about an nvidia-docker version that can support the Jetson TK1? I am very interested in it. Thank you.

Gotrek77 commented 7 years ago

Hello guys, now that 1.0.0 has been released, when do you think work on 2.0.0 (with Jetson TX1 support) will start? When will it be possible to try it (in alpha or beta)? Thanks in advance, Giuseppe

Gotrek77 commented 7 years ago

Hello guys, any update? Thanks Giuseppe

GONZALORUIZ commented 7 years ago

I just got my TX2 and I am also interested in running everything as Docker containers, so an update on this would be very helpful.

flx42 commented 7 years ago

@Gotrek77 @GONZALORUIZ Sorry, but you're on your own for now; it's not on our near-term roadmap.

flesicek commented 7 years ago

We are developing a vision application that runs on top of a Docker environment, and our customers are looking for a local solution where the TX2 would fit very nicely. Could you please increase the priority of this request?

Gotrek77 commented 7 years ago

@flx42 We need this for our product. Several people here have asked you to increase the priority, or asked how we can push your management in this direction. It is very important for us to know a roadmap for this feature. Thanks

flx42 commented 7 years ago

@flesicek @Gotrek77 I understand that this is causing an issue for you. However, there is nothing I can do right now; it's in the backlog but not on the roadmap for now. You are on your own if you want to try to make it work.

SPPhotonic commented 7 years ago

Bump, because Docker support would be great. Also looking for a cloud GPU platform access API.

kgdad commented 7 years ago

As an alternative to nvidia-docker until official support is available, we were able to get Docker running on the TX2. You need to make some kernel modifications and pass some parameters to your Docker containers so they have access to the GPU, but it is working for those who want to try it.

You can check out this GitHub repo for more information: Tegra-Docker.
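
In essence, the Tegra-Docker wrapper boils down to an ordinary docker run that exposes the Tegra GPU device nodes and the host driver libraries to the container (the same flags appear in later comments in this thread). A minimal sketch, assuming a typical L4T layout; the image name is a placeholder and the exact device nodes and library paths may differ between L4T releases:

```
# Sketch of the Tegra-Docker approach: hand the Tegra GPU device nodes and the
# host driver libraries to a plain container. The image name is a placeholder;
# device nodes and library paths may vary between L4T releases.
docker run \
  --device=/dev/nvhost-ctrl --device=/dev/nvhost-ctrl-gpu --device=/dev/nvhost-prof-gpu \
  --device=/dev/nvmap --device=/dev/nvhost-gpu --device=/dev/nvhost-as-gpu \
  -v /usr/lib/aarch64-linux-gnu/tegra:/usr/lib/aarch64-linux-gnu/tegra \
  <your-arm64-cuda-image> <command>
```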

epifanio commented 7 years ago

@kgdad Thanks for that repo; I own a TX1 and I'll work on adapting your approach to my device. @flx42 @3XX0 Once we have a Docker version running on our L4T, do you have any pointers on how to adapt nvidia-docker to run on our devices? My aim is to run minpy on the TX1 from Docker.

kaisark commented 7 years ago

I installed Docker on my TX1 today. I ran into a make issue with CONFIG_CGROUP_HUGETLB while building the custom kernel, so I omitted the optional CONFIG_CGROUP_HUGETLB change and the kernel Image built successfully. Docker images (e.g. FROM arm64v8/ubuntu:16.04) can now be used on the TX1.

The Tegra-Docker solution works (I have verified it), but it still feels a little like a workaround. Also, I believe the Docker version they recommend (currently 1.12.6) is a little dated; it might be better to build from source or use a Debian package. BTW, the Tegra-Docker solution sources/references the JetsonHacks blog (links below) for custom kernel build instructions...

Tegra-Docker: https://github.com/Technica-Corporation/Tegra-Docker

building custom kernels: TX1/TX2 http://www.jetsonhacks.com/2017/08/07/build-kernel-ttyacm-module-nvidia-jetson-tx1/ https://github.com/jetsonhacks/buildJetsonTX1Kernel

docker-ce arm64: stable https://download.docker.com/linux/ubuntu/dists/xenial/pool/stable/arm64/

docker-ce install instructions: ubuntu https://docs.docker.com/engine/installation/linux/docker-ce/ubuntu/

To check kernel compatibility with Docker, run the check-config.sh compatibility script from the Docker site: https://docs.docker.com/engine/installation/linux/linux-postinstall/#chkconfig
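
For reference, one way to fetch and run that compatibility check (assuming the script is still available under contrib/ in the moby/moby repository):

```
# Sketch: download Docker's kernel compatibility checker and run it against the running kernel.
# Assumption: the script still lives at this path in the moby/moby repository.
curl -fsSL https://raw.githubusercontent.com/moby/moby/master/contrib/check-config.sh -o check-config.sh
bash check-config.sh
```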

-K


nvidia@tegra-ubuntu:~/bin$ bash check-config.sh
info: reading kernel config from /proc/config.gz ...

Generally Necessary:

Optional Features:

Limits:


nvidia@tegra-ubuntu:~/bin$ docker info
Containers: 4
 Running: 1
 Paused: 0
 Stopped: 3
Images: 6
Server Version: 1.12.6
Storage Driver: devicemapper
 Pool Name: docker-179:33-2899445-pool
 Pool Blocksize: 65.54 kB
 Base Device Size: 10.74 GB
 Backing Filesystem: ext4
 Data file: /dev/loop0
 Metadata file: /dev/loop1
 Data Space Used: 450.8 MB
 Data Space Total: 107.4 GB
 Data Space Available: 27.04 GB
 Metadata Space Used: 1.192 MB
 Metadata Space Total: 2.147 GB
 Metadata Space Available: 2.146 GB
 Thin Pool Minimum Free Space: 10.74 GB
 Udev Sync Supported: true
 Deferred Removal Enabled: false
 Deferred Deletion Enabled: false
 Deferred Deleted Device Count: 0
 Data loop file: /var/lib/docker/devicemapper/devicemapper/data
 WARNING: Usage of loopback devices is strongly discouraged for production use. Use --storage-opt dm.thinpooldev to specify a custom block storage device.
 Metadata loop file: /var/lib/docker/devicemapper/devicemapper/metadata
 Library Version: 1.02.110 (2015-10-30)
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local
 Network: null host bridge overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 4.4.38-jetsonbot-doc-v0.3
Operating System: Ubuntu 16.04 LTS
OSType: linux
Architecture: aarch64
CPUs: 4
Total Memory: 3.888 GiB
Name: tegra-ubuntu
ID: GGQY:ZUKT:AEXF:NPL2:JDFZ:UHGJ:7XVI:PXVI:FQK2:F6MZ:PP4A:PEVD
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
 127.0.0.0/8


nvidia@tegra-ubuntu:~/docker/dq$ docker run --device=/dev/nvhost-ctrl --device=/dev/nvhost-ctrl-gpu --device=/dev/nvhost-prof-gpu --device=/dev/nvmap --device=/dev/nvhost-gpu --device=/dev/nvhost-as-gpu -v /usr/lib/aarch64-linux-gnu/tegra:/usr/lib/aarch64-linux-gnu/tegra device_query
/cudaSamples/deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA Tegra X1"
  CUDA Driver Version / Runtime Version          8.0 / 8.0
  CUDA Capability Major/Minor version number:    5.3
  Total amount of global memory:                 3981 MBytes (4174815232 bytes)
  ( 2) Multiprocessors, (128) CUDA Cores/MP:     256 CUDA Cores
  GPU Max Clock rate:                            998 MHz (1.00 GHz)
  Memory Clock rate:                             13 Mhz
  Memory Bus Width:                              64-bit
  L2 Cache Size:                                 262144 bytes
  Maximum Texture Dimension Size (x,y,z)         1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096)
  Maximum Layered 1D Texture Size, (num) layers  1D=(16384), 2048 layers
  Maximum Layered 2D Texture Size, (num) layers  2D=(16384, 16384), 2048 layers
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Max dimension size of a thread block (x,y,z):  (1024, 1024, 64)
  Max dimension size of a grid size (x,y,z):     (2147483647, 65535, 65535)
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            Yes
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Domain ID / Bus ID / location ID:   0 / 0 / 0
  Compute Mode: < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 8.0, NumDevs = 1, Device0 = NVIDIA Tegra X1
Result = PASS

check-config.sh.txt

Gotrek77 commented 6 years ago

Hi all, I want to try nvidia-docker 2 on the TX1. Is it possible to build libnvidia-docker for arm64, or is there some issue at the moment?

Thanks Giuseppe

stanleynguyen commented 6 years ago

Any updates?

alejandroandreu commented 6 years ago

+1 for this feature.

waldemar-szostak commented 6 years ago

What's the point of having the Docker-friendly kernel in JetPack 3.2 if there's no Docker support for the Jetson TX2 running that JetPack? :(

stanleynguyen commented 6 years ago

^

kgdad commented 6 years ago

Regular Docker will work fine as long as you don't need GPU access. If you do need access to the GPU in your containers, you need to make sure you give them access to some specific libraries and devices. You can read the details here: GitHub-Tegra_Docker

alejandroandreu commented 6 years ago

Is this a feature on the current roadmap for nvidia-docker?

Although the above comment points to a valid workaround, it doesn't have all the nice features that nvidia-docker provides... :disappointed:

laudai commented 6 years ago

+1. nvidia-docker 2.0 does not support the TX1: https://github.com/NVIDIA/nvidia-docker/wiki/Frequently-Asked-Questions#is-macos-supported

flx42 commented 6 years ago

Sorry, it's still not on the roadmap, given that the driver stacks are very different today.

sundl123 commented 6 years ago

We use Tegra-Docker on the TX2 as a workaround, but we really hope that nvidia-docker can officially add support for the TX2 platform.

haudren commented 6 years ago

Is there any chance that this feature will appear on the roadmap given the recent release of the Jetson Xavier?

sanjaybharadwaj commented 6 years ago

Hi, is such a feature now planned for the newer devices? Any roadmap?

ngadhvi commented 5 years ago

I am waiting too..!

leontius commented 5 years ago

I am waiting too

FluxReality commented 5 years ago

... also waiting!

waseemkhan1989 commented 5 years ago

Hi,

I have created a Docker image on the Jetson TX2 that contains the NVIDIA drivers and the CUDA and cuDNN libraries. I am trying to give this image access to the GPU and CUDA drivers through the tx2-docker script (https://github.com/Technica-Corporation/Tegra-Docker), but without success. I think tx2-docker itself is running successfully, as you can see below:

wkh@tegra-ubuntu:~/Tegra-Docker/bin$ ./tx2-docker run openhorizon/aarch64-tx2-cudabase
Running an nvidia docker image
docker run -e LD_LIBRARY_PATH=:/usr/lib/aarch64-linux-gnu:/usr/lib/aarch64-linux-gnu/tegra:/usr/local/cuda/lib64 --net=host -v /usr/lib/aarch64-linux-gnu:/usr/lib/aarch64-linux-gnu -v /usr/local/cuda/lib64:/usr/local/cuda/lib64 --device=/dev/nvhost-ctrl --device=/dev/nvhost-ctrl-gpu --device=/dev/nvhost-prof-gpu --device=/dev/nvmap --device=/dev/nvhost-gpu --device=/dev/nvhost-as-gpu openhorizon/aarch64-tx2-cudabase

But when I try to run deviceQuery inside my container, it gives me this result:

root@bc1130fc6be4:/usr/local/cuda-8.0/samples/1_Utilities/deviceQuery# ./deviceQuery
./deviceQuery Starting...

CUDA Device Query (Runtime API) version (CUDART static linking)

cudaGetDeviceCount returned 38
-> no CUDA-capable device is detected
Result = FAIL

Any comments? Why is this script not giving access?

Seanmatthews commented 5 years ago

How's this feature coming along?

vielmetti commented 5 years ago

Looking for this for the Jetson Nano.

OlivierGuilloux commented 5 years ago

Coming soon (July): https://devtalk.nvidia.com/default/topic/1052748/jetson-nano/egx-nvidia-docker-runtime-on-nano/
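
For what it's worth, once that runtime ships with JetPack, usage should look roughly like the sketch below. This is an assumption based on the announcement above, not a confirmed interface; the l4t-base image tag is illustrative and may differ per L4T release:

```
# Sketch, assuming JetPack preinstalls the NVIDIA container runtime on the Nano/TX2/Xavier.
# The image tag is an example only and may differ per L4T release.
sudo docker run -it --runtime nvidia nvcr.io/nvidia/l4t-base:r32.2 /bin/bash
```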

atline commented 5 years ago

Could this run on a non-NVIDIA platform? I mean, if I use an ARM device with a GPU, could I install a Docker version with GPU support so that my application could make use of the GPU on the ARM device?

klueska commented 4 years ago

This is the support matrix for the 1.1.0 packages released on Tuesday, 19 May:

+----------------------+-------------+----------------+---------+-----------------+
|  OS Name / Version   |  Identifier | amd64 / x86_64 | ppc64le | arm64 / aarch64 |
+======================+=============+================+=========+=================+
| Amazon Linux 1       | amzn1       |       X        |         |                 |
| Amazon Linux 2       | amzn2       |       X        |         |                 |
| Amazon Linux 2017.09 | amzn2017.09 |       X        |         |                 |
| Amazon Linux 2018.03 | amzn2018.03 |       X        |         |                 |
| Open Suse Leap 15.0  | sles15.0    |       X        |         |                 |
| Open Suse Leap 15.1  | sles15.1    |       X        |         |                 |
| Debian Linux 9       | debian9     |       X        |         |                 |
| Debian Linux 10      | debian10    |       X        |         |                 |
| Centos 7             | centos7     |       X        |    X    |                 |
| Centos 8             | centos8     |       X        |    X    |        X        |
| RHEL 7.4             | rhel7.4     |       X        |    X    |                 |
| RHEL 7.5             | rhel7.5     |       X        |    X    |                 |
| RHEL 7.6             | rhel7.6     |       X        |    X    |                 |
| RHEL 7.7             | rhel7.7     |       X        |    X    |                 |
| RHEL 8.0             | rhel8.0     |       X        |    X    |        X        |
| RHEL 8.1             | rhel8.1     |       X        |    X    |        X        |
| RHEL 8.2             | rhel8.2     |       X        |    X    |        X        |
| Ubuntu 16.04         | ubuntu16.04 |       X        |    X    |                 |
| Ubuntu 18.04         | ubuntu18.04 |       X        |    X    |        X        |
| Ubuntu 19.04         | ubuntu19.04 |       X        |    X    |        X        |
| Ubuntu 19.10         | ubuntu19.10 |       X        |    X    |        X        |
| Ubuntu 20.04         | ubuntu20.04 |       X        |    X    |        X        |
+----------------------+-------------+----------------+---------+-----------------+

Please let us know if this resolves your issue.
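
For anyone on a supported arm64 distribution, a minimal sketch of picking up the packages via apt, following the repository setup documented in this project's README (the Ubuntu 18.04 identifier below matches the table above; adjust for your distribution):

```
# Sketch: install nvidia-docker2 on an arm64 Ubuntu 18.04 host using the documented apt repositories.
distribution=$(. /etc/os-release; echo $ID$VERSION_ID)   # e.g. "ubuntu18.04"
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | \
  sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
```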

nvjmayo commented 4 years ago

Listed as supported. Closed as resolved.