apache / mxnet

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more
https://mxnet.apache.org
Apache License 2.0

mxnet/tensorRT docker image coredumps #16499

Open wx3000 opened 5 years ago

wx3000 commented 5 years ago

Description

The Docker image mxnet/tensorrt, which integrates MXNet with TensorRT, core dumps. The image has TensorRT 4.0, Python 3.5, CUDA 9 and MXNet 1.3.

Environment info (Required)

AWS Base DLAMI on a G4 instance. The DLAMI version is 19.2 and the OS is Ubuntu. I installed Docker 19.2 and nvidia-docker on top of it.

First, install Docker: https://docs.docker.com/install/linux/docker-ce/ubuntu/

There is one issue, with a workaround described at https://github.com/docker/for-linux/issues/813:

$ sudo apt-get install runc=1.0.0~rc7+git20190403.029124da-0ubuntu1~16.04.4
$ sudo apt-get install docker-ce

Then install the latest nvidia-docker (which requires Docker 19.03): https://github.com/NVIDIA/nvidia-docker

Then pull and run the image:

$ docker run --gpus all -it mxnet/tensorrt bash

$ cat /proc/cpuinfo | grep flags
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single kaiser fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx avx512f rdseed adx smap clflushopt clwb avx512cd xsaveopt xsavec xgetbv1 ida arat pku
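For anyone triaging, the flags line above can be turned into a set and checked for specific instruction-set extensions. The snippet below is only a sketch: `parse_cpu_flags` is a hypothetical helper, the sample string is an abbreviated excerpt of the output above, and nothing in this report establishes that CPU features are related to the crash.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no "flags" line found


# Abbreviated sample taken from the output above; on a real host you would
# read the text from /proc/cpuinfo instead.
sample = "flags : fpu sse4_1 sse4_2 avx f16c avx2 avx512f avx512cd"
flags = parse_cpu_flags(sample)

# The extensions listed here are illustrative choices, not a known requirement.
for ext in ("avx", "avx2", "avx512f", "avx512bw"):
    print(ext, "present" if ext in flags else "absent")
```

On the instance itself, `parse_cpu_flags(open("/proc/cpuinfo").read())` would give the full set.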

I can provide more details if needed.

Error and steps to reproduce

I followed the Python scripts on this page for the MXNet/TensorRT integration: https://github.com/apache/incubator-mxnet/blob/8004a027ad6a73f8f6eae102de8d249fbdfb9a2d/docs/python_docs/python/tutorials/performance/backend/tensorrt/tensorrt.md

[01:51:10] src/c_api/c_api_executor.cc:464: TensorRT not enabled by default. Please set the MXNET_USE_TENSORRT environment variable to 1 or call mx.contrib.tensorrt.set_use_tensorrt(True) to enable.

Warming up MXNet
Segmentation fault (core dumped)
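To narrow down where the crash happens, one option is to put a small preamble at the top of the tutorial script, before `import mxnet`. It is a sketch, not a fix: it sets the `MXNET_USE_TENSORRT` variable named in the log message and turns on Python's standard `faulthandler` module so that a Python-level traceback is printed when the process receives SIGSEGV. `enable_crash_traceback` is a hypothetical helper name, and whether setting the variable changes the crash is an open question.

```python
import faulthandler
import os

# The log message above asks for this variable; set it before importing mxnet.
os.environ["MXNET_USE_TENSORRT"] = "1"


def enable_crash_traceback():
    """Ask Python to dump tracebacks of all threads on fatal signals (e.g. SIGSEGV).

    Returns True if the handler was installed, False if stderr is not a real
    file descriptor (for example when output is being captured).
    """
    try:
        faulthandler.enable(all_threads=True)
    except (AttributeError, ValueError, OSError, RuntimeError):
        return False
    return True


installed = enable_crash_traceback()
print("faulthandler installed:", installed)

# After this preamble, import mxnet and run the tutorial's warm-up code as before.
```

The same traceback behaviour is available without editing the script by running it with `python3 -X faulthandler`.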

mxnet-label-bot commented 5 years ago

Hey, this is the MXNet Label Bot. Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it. Here are my recommended label(s): Feature, Performance

leezu commented 5 years ago

Who owns the Docker images at https://hub.docker.com/u/mxnet ? The TensorRT image hasn't been updated in a year.

samskalicky commented 4 years ago

@lanking520 assign @larroy

Pedro, please add this to the backlog for CI enhancements to build the Docker containers.

larroy commented 4 years ago

ok

leezu commented 4 years ago

It should be possible to fix this by extending https://github.com/apache/incubator-mxnet/pull/16547 ?