deepjavalibrary / djl

An Engine-Agnostic Deep Learning Framework in Java
https://djl.ai
Apache License 2.0

memory free error when closing model #2526

Open larochef opened 1 year ago

larochef commented 1 year ago

Description

When I close a model, I get the following error:

free(): invalid pointer

It also happens when the app exits and memory is released.

It happens on Linux with the PyTorch engine, on both CPU and CUDA. The program also uses JavaFX.

Expected Behavior

No error, and the program exits normally.

Error Message

(Paste the complete error message, including stack trace.)

How to Reproduce?

Steps to reproduce

  1. Clone https://github.com/larochef/javafx-djl
  2. Run ./gradlew run

What have you tried to solve it?

  1. setting PYTORCH_PRECXX11 to true / false

Environment Info

Please run the command ./gradlew debugEnv from the root directory of DJL (if necessary, clone DJL first). It will output information about your system, environment, and installation that can help us debug your issue. Paste the output of the command below:

> Task :integration:debugEnv
[DEBUG] - Registering EngineProvider: XGBoost
[DEBUG] - Registering EngineProvider: LightGBM
[DEBUG] - Registering EngineProvider: OnnxRuntime
[DEBUG] - Registering EngineProvider: MXNet
[DEBUG] - Registering EngineProvider: PyTorch
[DEBUG] - Registering EngineProvider: TensorFlow
[DEBUG] - Found default engine: MXNet
----------- System Properties -----------
java.specification.version: 17
sun.jnu.encoding: UTF-8
java.class.path: /home/francois/dev/opensource/djl/integration/build/classes/java/main:/home/francois/dev/opensource/djl/integration/build/resources/main:/home/francois/.gradle/caches/modules-2/files-2.1/commons-cli/commons-cli/1.5.0/dc98be5d5390230684a092589d70ea76a147925c/commons-cli-1.5.0.jar:/home/francois/.gradle/caches/modules-2/files-2.1/org.apache.logging.log4j/log4j-slf4j-impl/2.19.0/1a0c9615ba9fd5b96db8c1136afbef4394286e93/log4j-slf4j-impl-2.19.0.jar:/home/francois/dev/opensource/djl/basicdataset/build/libs/basicdataset-0.22.0-SNAPSHOT.jar:/home/francois/dev/opensource/djl/model-zoo/build/libs/model-zoo-0.22.0-SNAPSHOT.jar:/home/francois/dev/opensource/djl/testing/build/libs/testing-0.22.0-SNAPSHOT.jar:/home/francois/dev/opensource/djl/engines/mxnet/mxnet-model-zoo/build/libs/mxnet-model-zoo-0.22.0-SNAPSHOT.jar:/home/francois/dev/opensource/djl/engines/pytorch/pytorch-model-zoo/build/libs/pytorch-model-zoo-0.22.0-SNAPSHOT.jar:/home/francois/dev/opensource/djl/engines/pytorch/pytorch-jni/build/libs/pytorch-jni-1.13.1-0.22.0-SNAPSHOT.jar:/home/francois/dev/opensource/djl/engines/tensorflow/tensorflow-model-zoo/build/libs/tensorflow-model-zoo-0.22.0-SNAPSHOT.jar:/home/francois/dev/opensource/djl/engines/ml/xgboost/build/libs/xgboost-0.22.0-SNAPSHOT.jar:/home/francois/dev/opensource/djl/engines/ml/lightgbm/build/libs/lightgbm-0.22.0-SNAPSHOT.jar:/home/francois/dev/opensource/djl/engines/onnxruntime/onnxruntime-engine/build/libs/onnxruntime-engine-0.22.0-SNAPSHOT.jar:/home/francois/dev/opensource/djl/engines/mxnet/mxnet-engine/build/libs/mxnet-engine-0.22.0-SNAPSHOT.jar:/home/francois/dev/opensource/djl/engines/pytorch/pytorch-engine/build/libs/pytorch-engine-0.22.0-SNAPSHOT.jar:/home/francois/dev/opensource/djl/engines/tensorflow/tensorflow-engine/build/libs/tensorflow-engine-0.22.0-SNAPSHOT.jar:/home/francois/dev/opensource/djl/api/build/libs/api-0.22.0-SNAPSHOT.jar:/home/francois/.gradle/caches/modules-2/files-2.1/org.testng/testng/7.7.1/8e96c60d4967a8df6dc0
6c3c7cf22392e3a51794/testng-7.7.1.jar:/home/francois/.gradle/caches/modules-2/files-2.1/org.slf4j/slf4j-api/1.7.36/6c62681a2f655b49963a5983b8b0950a6120ae14/slf4j-api-1.7.36.jar:/home/francois/.gradle/caches/modules-2/files-2.1/org.apache.logging.log4j/log4j-core/2.19.0/3b6eeb4de4c49c0fe38a4ee27188ff5fee44d0bb/log4j-core-2.19.0.jar:/home/francois/.gradle/caches/modules-2/files-2.1/org.apache.logging.log4j/log4j-api/2.19.0/ea1b37f38c327596b216542bc636cfdc0b8036fa/log4j-api-2.19.0.jar:/home/francois/.gradle/caches/modules-2/files-2.1/org.apache.commons/commons-csv/1.10.0/8669bee353424c3223c93723291b5c3753260c1c/commons-csv-1.10.0.jar:/home/francois/.gradle/caches/modules-2/files-2.1/ml.dmlc/xgboost4j_2.12/1.7.3/6afde777a1b8b62ad1ea8376e4affaf9c9542a1f/xgboost4j_2.12-1.7.3.jar:/home/francois/.gradle/caches/modules-2/files-2.1/commons-logging/commons-logging/1.2/4bfc12adfe4842bf07b657f0369c4cb522955686/commons-logging-1.2.jar:/home/francois/.gradle/caches/modules-2/files-2.1/com.microsoft.ml.lightgbm/lightgbmlib/3.2.110/f6c85e5d7cc44d49c4544240ea5c96004680007b/lightgbmlib-3.2.110.jar:/home/francois/.gradle/caches/modules-2/files-2.1/com.microsoft.onnxruntime/onnxruntime/1.14.0/fb150fd72c1d2fbeea7bd53affd7c266930e3f98/onnxruntime-1.14.0.jar:/home/francois/.gradle/caches/modules-2/files-2.1/com.google.code.gson/gson/2.10.1/b3add478d4382b78ea20b1671390a858002feb6c/gson-2.10.1.jar:/home/francois/.gradle/caches/modules-2/files-2.1/net.java.dev.jna/jna/5.13.0/1200e7ebeedbe0d10062093f32925a912020e747/jna-5.13.0.jar:/home/francois/.gradle/caches/modules-2/files-2.1/org.apache.commons/commons-compress/1.22/691a8b4e6cf4248c3bc72c8b719337d5cb7359fa/commons-compress-1.22.jar:/home/francois/.gradle/caches/modules-2/files-2.1/com.beust/jcommander/1.82/a7c5fef184d238065de38f81bbc6ee50cca2e21/jcommander-1.82.jar:/home/francois/.gradle/caches/modules-2/files-2.1/org.webjars/jquery/3.6.1/d08df6250157cd2db3d9b01b11b76e9b7225083a/jquery-3.6.1.jar:/home/francois/dev/opensource/djl/engines/te
nsorflow/tensorflow-api/build/libs/tensorflow-api-0.22.0-SNAPSHOT.jar:/home/francois/.gradle/caches/modules-2/files-2.1/org.tensorflow/tensorflow-core-api/0.5.0/6dfb7f13a9d96e6c4bd0705f122bd00d3b596b0d/tensorflow-core-api-0.5.0.jar:/home/francois/.gradle/caches/modules-2/files-2.1/org.bytedeco/javacpp/1.5.8/a28ca7c27abaae8054060b963fbd667b4be72677/javacpp-1.5.8.jar:/home/francois/.gradle/caches/modules-2/files-2.1/com.google.protobuf/protobuf-java/3.21.9/ed1240d9231044ce6ccf1978512f6e44416bb7e7/protobuf-java-3.21.9.jar:/home/francois/.gradle/caches/modules-2/files-2.1/org.tensorflow/ndarray/0.4.0/7ab74f002dbec93944b7feb38de013afe8d4e8de/ndarray-0.4.0.jar
java.vm.vendor: Oracle Corporation
sun.arch.data.model: 64
user.variant: 
java.vendor.url: https://openjdk.java.net/
user.timezone: Europe/Paris
java.vm.specification.version: 17
os.name: Linux
sun.java.launcher: SUN_STANDARD
user.country: US
sun.boot.library.path: /usr/lib/jvm/java-17-openjdk/lib:/usr/lib/jvm/java-17-openjdk/lib
sun.java.command: ai.djl.integration.util.DebugEnvironment
jdk.debug: release
sun.cpu.endian: little
user.home: /home/francois
org.gradle.appname: gradlew
user.language: en
java.specification.vendor: Oracle Corporation
java.version.date: 2023-01-17
java.home: /usr/lib/jvm/java-17-openjdk
ai.djl.logging.level: debug
org.gradle.internal.http.connectionTimeout: 60000
file.separator: /
java.vm.compressedOopsMode: Zero based
line.separator: 

java.vm.specification.vendor: Oracle Corporation
java.specification.name: Java Platform API Specification
sun.management.compiler: HotSpot 64-Bit Tiered Compilers
java.runtime.version: 17.0.6+10
user.name: francois
path.separator: :
os.version: 6.2.8-arch1-1
java.runtime.name: OpenJDK Runtime Environment
file.encoding: UTF-8
java.vm.name: OpenJDK 64-Bit Server VM
java.vendor.url.bug: https://bugreport.java.com/bugreport/
java.io.tmpdir: /tmp
org.gradle.internal.http.socketTimeout: 120000
java.version: 17.0.6
user.dir: /home/francois/dev/opensource/djl/integration
os.arch: amd64
java.vm.specification.name: Java Virtual Machine Specification
native.encoding: UTF-8
java.library.path: /usr/java/packages/lib:/usr/lib64:/lib64:/lib:/usr/lib
java.vm.info: mixed mode
java.vendor: N/A
java.vm.version: 17.0.6+10
sun.io.unicode.encoding: UnicodeLittle
library.jansi.path: /home/francois/.gradle/native/jansi/1.18/linux64
java.class.version: 61.0
org.gradle.internal.publish.checksums.insecure: true

--------- Environment Variables ---------
PATH: /usr/local/bin:/usr/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl
NVD_BACKEND: direct
LC_MEASUREMENT: fr_FR.UTF-8
INVOCATION_ID: dce38acf51ab4a49ac1ecfd1760c7de1
XAUTHORITY: /run/user/1000/.mutter-Xwaylandauth.I46W21
GDMSESSION: gnome
XDG_DATA_DIRS: /home/francois/.local/share/flatpak/exports/share:/var/lib/flatpak/exports/share:/usr/local/share/:/usr/share/
LC_TIME: fr_FR.UTF-8
MOTD_SHOWN: pam
DBUS_SESSION_BUS_ADDRESS: unix:path=/run/user/1000/bus
XDG_ACTIVATION_TOKEN: gnome-shell/IntelliJ IDEA Community Edition/1762-1-francois-numind_TIME5224353
XDG_CURRENT_DESKTOP: GNOME
JOURNAL_STREAM: 8:38327
MAIL: /var/spool/mail/francois
LC_PAPER: fr_FR.UTF-8
USERNAME: francois
SESSION_MANAGER: local/francois-numind:@/tmp/.ICE-unix/1744,unix/francois-numind:/tmp/.ICE-unix/1744
LOGNAME: francois
MANAGERPID: 1663
PWD: /home/francois/dev/opensource/djl
GJS_DEBUG_TOPICS: JS ERROR;JS LOG
SHELL: /usr/bin/fish
GIO_LAUNCHED_DESKTOP_FILE: /usr/share/applications/idea.desktop
OLDPWD: /home/francois/dev/opensource/djl
VISUAL: vim
TERM_SESSION_ID: e1a144d9-7545-44fc-8f53-74a13d226b6a
SYSTEMD_EXEC_PID: 1762
GNOME_SETUP_DISPLAY: :1
LS_COLORS: rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=00:su=37;41:sg=30;43:ca=00:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.zst=01;31:*.tzst=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.wim=01;31:*.swm=01;31:*.dwm=01;31:*.esd=01;31:*.avif=01;35:*.jpg=01;35:*.jpeg=01;35:*.mjpg=01;35:*.mjpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.webp=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=00;36:*.au=00;36:*.flac=00;36:*.m4a=00;36:*.mid=00;36:*.midi=00;36:*.mka=00;36:*.mp3=00;36:*.mpc=00;36:*.ogg=00;36:*.ra=00;36:*.wav=00;36:*.oga=00;36:*.opus=00;36:*.spx=00;36:*.xspf=00;36:*~=00;90:*#=00;90:*.bak=00;90:*.old=00;90:*.orig=00;90:*.part=00;90:*.rej=00;90:*.swp=00;90:*.tmp=00;90:*.dpkg-dist=00;90:*.dpkg-old=00;90:*.ucf-dist=00;90:*.ucf-new=00;90:*.ucf-old=00;90:*.rpmnew=00;90:*.rpmorig=00;90:*.rpmsave=00;90:
XDG_SESSION_DESKTOP: gnome
SHLVL: 1
LIBVA_DRIVER_NAME: iHD
LC_MONETARY: fr_FR.UTF-8
TERM: xterm-256color
LANG: en_US.UTF-8
XDG_SESSION_TYPE: wayland
DISPLAY: :0
IDEA_JDK: /usr/lib/jvm/java-17-openjdk/
VDPAU_DRIVER: va_gl
WAYLAND_DISPLAY: wayland-0
XDG_SESSION_CLASS: user
GDM_LANG: en_US.UTF-8
IDEA_CLASSPATH: /usr/lib/jvm/java-17-openjdk//lib/*:/usr/lib/jvm/java-17-openjfx//lib/*
DESKTOP_SESSION: gnome
USER: francois
DESKTOP_STARTUP_ID: gnome-shell/IntelliJ IDEA Community Edition/1762-1-francois-numind_TIME5224353
XDG_MENU_PREFIX: gnome-
GIO_LAUNCHED_DESKTOP_FILE_PID: 8499
TERMINAL_EMULATOR: JetBrains-JediTerm
LC_NUMERIC: fr_FR.UTF-8
GJS_DEBUG_OUTPUT: stderr
SSH_AUTH_SOCK: /run/user/1000/keyring/ssh
EDITOR: vim
XDG_RUNTIME_DIR: /run/user/1000
HOME: /home/francois

-------------- Directories --------------
temp directory: /tmp
DJL cache directory: /home/francois/.djl.ai
Engine cache directory: /home/francois/.djl.ai

------------------ CUDA -----------------
GPU Count: 1
CUDA: 121
ARCH: 86
GPU(0) memory used: 205062144 bytes

----------------- Engines ---------------
DJL version: 0.22.0-SNAPSHOT
[WARN ] - No matching cuda flavor for linux found: cu121mkl/sm_86.
[DEBUG] - Using cache dir: /home/francois/.djl.ai/mxnet/1.9.1-mkl-linux-x86_64
[INFO ] - Downloading libgfortran.so.3 ...
[INFO ] - Downloading libgomp.so.1 ...
[INFO ] - Downloading libquadmath.so.0 ...
[INFO ] - Downloading libopenblas.so.0 ...
[INFO ] - Downloading libmxnet.so ...
[DEBUG] - Loading mxnet library from: /home/francois/.djl.ai/mxnet/1.9.1-mkl-linux-x86_64/libmxnet.so
[WARN ] - No matching cuda flavor for linux found: cu121mkl/sm_86.
Default Engine: MXNet:1.9.0, capabilities: [
        CPU_SSE,
        SIGNAL_HANDLER,
        LAPACK,
        BLAS_OPEN,
        CPU_SSE2,
        DIST_KVSTORE,
        CPU_SSE3,
        OPENMP,
        OPENCV,
        MKLDNN,
]
MXNet Library: /home/francois/.djl.ai/mxnet/1.9.1-mkl-linux-x86_64/libmxnet.so
Default Device: cpu()
PyTorch: 2
MXNet: 0
XGBoost: 10
LightGBM: 10
OnnxRuntime: 10
TensorFlow: 3

--------------- Hardware --------------
Available processors (cores): 20
Byte Order: LITTLE_ENDIAN
Free memory (bytes): 25151488
Maximum memory (bytes): 8342470656
Total memory available to JVM (bytes): 58720256
Heap committed: 58720256
Heap nonCommitted: 42532864
GCC: 
gcc (GCC) 12.2.1 20230201
Copyright (C) 2022 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

frankfliu commented 1 year ago

@larochef

I tried on macOS and Windows, and both work fine. I'm not familiar with JavaFX; when I tried on Ubuntu, I got a "Cannot open display" error.

larochef commented 1 year ago

Did you have a graphical environment running on Ubuntu?

What's interesting about JavaFX is that it binds to the system toolkit to create the window and its components. On Linux, for instance, it uses GTK for rendering.
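For anyone trying to reproduce this on a Linux box, here is a minimal sketch of a pre-flight check. It assumes the "Cannot open display" error comes from the toolkit not finding an X11 display, which has not been confirmed in this thread. `DisplayCheck` and `hasX11Display` are hypothetical names used for illustration only; they are not part of DJL or JavaFX. The check reads the standard `DISPLAY` environment variable, which also appears in the debugEnv output above.

```java
import java.util.Map;

/** Hypothetical helper for illustration; not part of DJL or JavaFX. */
final class DisplayCheck {

    private DisplayCheck() {}

    /**
     * Returns true if the given environment has a non-blank DISPLAY value.
     * This only tells us that a display is configured, not that the
     * display server is actually reachable.
     */
    static boolean hasX11Display(Map<String, String> env) {
        String display = env.get("DISPLAY");
        return display != null && !display.trim().isEmpty();
    }

    public static void main(String[] args) {
        if (hasX11Display(System.getenv())) {
            System.out.println("DISPLAY is set; a GUI toolkit can try to connect.");
        } else {
            System.out.println("DISPLAY is not set; a GUI app will likely fail to open a window.");
        }
    }
}
```

If this reports that `DISPLAY` is not set (for example over a plain SSH session or in a container), the reproduction will probably need a desktop session or a virtual display before the JavaFX window can open.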

It works fine for us on Windows and macOS too; sorry for not mentioning that in the issue.