sonatype / nexus-public

Sonatype Nexus Repository Open-source codebase mirror
Eclipse Public License 1.0

Nexus is leaking file descriptors (files are being opened and not closed) #170

Open bom-d-van opened 1 year ago

bom-d-van commented 1 year ago
╰─$ k --cluster prod-k8s-cluster -n project-nexus exec pod/nexus-5f78ffdf5b-vklcb -it -- bash
bash-4.4$ ps -efwww
UID         PID   PPID  C STIME TTY          TIME CMD
nexus         1      0 21 Jun12 ?        1-17:12:02 /usr/lib/jvm/java-1.8.0-openjdk- -server -Dinstall4j.jvmDir=/usr/lib/jvm/java-1.8.0-openjdk- -Dexe4j.moduleName=/opt/sonatype/nexus/bin/nexus -XX:+UnlockDiagnosticVMOptions -Dinstall4j.launcherId=245 -Dinstall4j.swt=false -Di4jv=0 -Di4jv=0 -Di4jv=0 -Di4jv=0 -Di4jv=0 -Xms8g -Xmx8g -XX:+UseG1GC -XX:MaxDirectMemorySize=105158M -XX:+PrintGC -Djava.util.prefs.userRoot=/nexus-data -Xms2703m -Xmx2703m -XX:MaxDirectMemorySize=2703m -XX:+UnlockDiagnosticVMOptions -XX:+LogVMOutput -XX:LogFile=../sonatype-work/nexus3/log/jvm.log -XX:-OmitStackTraceInFastThrow -Dkaraf.home=. -Dkaraf.base=. -Dkaraf.etc=etc/karaf -Djava.util.logging.config.file=etc/karaf/ -Dkaraf.log=../sonatype-work/nexus3/log -Dkaraf.startLocalConsole=false -Djdk.tls.ephemeralDHKeySize=2048 -Djava.endorsed.dirs=lib/endorsed -Di4j.vpt=true -classpath /opt/sonatype/nexus/.install4j/i4jruntime.jar:/opt/sonatype/nexus/lib/boot/nexus-main.jar:/opt/sonatype/nexus/lib/boot/activation-1.1.1.jar:/opt/sonatype/nexus/lib/boot/jakarta.xml.bind-api-2.3.3.jar:/opt/sonatype/nexus/lib/boot/jaxb-runtime-2.3.3.jar:/opt/sonatype/nexus/lib/boot/txw2-2.3.3.jar:/opt/sonatype/nexus/lib/boot/istack-commons-runtime-3.0.10.jar:/opt/sonatype/nexus/lib/boot/org.apache.karaf.main-4.3.6.jar:/opt/sonatype/nexus/lib/boot/osgi.core-7.0.0.jar:/opt/sonatype/nexus/lib/boot/org.apache.karaf.specs.activator-4.3.6.jar:/opt/sonatype/nexus/lib/boot/org.apache.karaf.diagnostic.boot-4.3.6.jar:/opt/sonatype/nexus/lib/boot/org.apache.karaf.jaas.boot-4.3.6.jar com.install4j.runtime.launcher.UnixLauncher run 9d17dc87 0 0
nexus      3914      0  0 Jun13 pts/0    00:00:00 bash
nexus      5835      0  0 Jun13 pts/1    00:00:00 bash
nexus     48430      0  1 08:57 pts/2    00:00:00 bash
nexus     48435  48430  0 08:57 pts/2    00:00:00 ps -efwww
bash-4.4$ ls /proc/1/fd | wc -l

bom-d-van commented 1 year ago

The files are mostly from the tmp folder:

java    50073      200 *866w      REG  0,917       721  226090353 /nexus-data/tmp/.nfs000000000d79dd710000f1a3
lsof: no pwd entry for UID 200
java    50073      200 *867w      REG  0,917       837  226090354 /nexus-data/tmp/.nfs000000000d79dd720000f1a2
lsof: no pwd entry for UID 200
java    50073      200 *868w      REG  0,917    245453  226090370 /nexus-data/tmp/.nfs000000000d79dd820000f1a7
lsof: no pwd entry for UID 200
java    50073      200 *869w      REG  0,917    188166  226090371 /nexus-data/tmp/.nfs000000000d79dd830000f1a6
lsof: no pwd entry for UID 200
java    50073      200 *870r      REG  0,917     14328  231871778 /nexus-data/elasticsearch/nexus/nodes/0/indices/87ee0e1541af7dd4f6acd31436f9bb3d72176671/0/index/_e.cfs
lsof: no pwd entry for UID 200
java    50073      200 *871w      REG  0,917       558  226090355 /nexus-data/tmp/6874685871877237425
lsof: no pwd entry for UID 200
java    50073      200 *872w      REG  0,917   1657749  226090362 /nexus-data/tmp/.nfs000000000d79dd7a0000f1b1
lsof: no pwd entry for UID 200
java    50073      200 *873w      REG  0,917       371  226090356 /nexus-data/tmp/.nfs000000000d79dd740000f1a1
lsof: no pwd entry for UID 200
java    50073      200 *874w      REG  0,917       405  226090357 /nexus-data/tmp/.nfs000000000d79dd750000f1a0
lsof: no pwd entry for UID 200
java    50073      200 *875w      REG  0,917   1269598  226090380 /nexus-data/tmp/.nfs000000000d79dd8c0000f1b0
lsof: no pwd entry for UID 200
java    50073      200 *876w      REG  0,917    822282  226090372 /nexus-data/tmp/74981174112778502
lsof: no pwd entry for UID 200
java    50073      200 *877w      REG  0,917    234540  226090373 /nexus-data/tmp/.nfs000000000d79dd850000f1ad
lsof: no pwd entry for UID 200
java    50073      200 *878w      REG  0,917       488  226090358 /nexus-data/tmp/6537272694809463009
lsof: no pwd entry for UID 200
java    50073      200 *879w      REG  0,917       328  226090359 /nexus-data/tmp/.nfs000000000d79dd770000f19f
lsof: no pwd entry for UID 200
java    50073      200 *880w      REG  0,917       362  226090361 /nexus-data/tmp/.nfs000000000d79dd790000f19e
lsof: no pwd entry for UID 200
java    50073      200 *881w      REG  0,917    165123  226090374 /nexus-data/tmp/.nfs000000000d79dd860000f1ac
lsof: no pwd entry for UID 200
java    50073      200 *882w      REG  0,917   1127757  226090408 /nexus-data/tmp/.nfs000000000d79dda80000f1cf
lsof: no pwd entry for UID 200
java    50073      200 *883w      REG  0,917   6496158  226091169 /nexus-data/tmp/102739330735246238
lsof: no pwd entry for UID 200
java    50073      200 *884w      REG  0,917    769336  226091547 /nexus-data/tmp/.nfs000000000d79e21b0000f373
lsof: no pwd entry for UID 200
java    50073      200 *885w      REG  0,917       488  226090375 /nexus-data/tmp/117278840800635037
lsof: no pwd entry for UID 200
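
For anyone who wants to summarize a listing like the one above, here is a minimal sketch that tallies open regular files per directory and counts the `.nfs…` entries. Those names are typically what an NFS client leaves behind when a file is deleted while a process still holds it open, which would be consistent with temp files being removed without being closed. The function name `tally_open_files` and the regular expression are my own, and the column layout is assumed from the rows shown here, so treat it as a starting point rather than a general `lsof` parser.

```python
import os
import re
from collections import Counter

# One lsof row for a regular file, with columns as they appear in the listing
# above: COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME.
# The layout is an assumption based on that listing, not a full lsof grammar.
LSOF_ROW = re.compile(
    r"^\S+\s+\d+\s+\S+\s+\*?\d+[rwu]?\s+REG\s+\S+\s+\d+\s+\d+\s+(?P<path>/.*)$"
)


def tally_open_files(lsof_text):
    """Return (Counter of open files per directory, count of .nfs* files)."""
    per_dir = Counter()
    nfs_placeholders = 0
    for line in lsof_text.splitlines():
        match = LSOF_ROW.match(line.strip())
        if match is None:
            # Skips noise such as "lsof: no pwd entry for UID 200".
            continue
        path = match.group("path")
        per_dir[os.path.dirname(path)] += 1
        if os.path.basename(path).startswith(".nfs"):
            nfs_placeholders += 1
    return per_dir, nfs_placeholders
```

Saving the `lsof` output to a file and passing its contents to `tally_open_files` gives a per-directory breakdown, which makes it easier to confirm that the growth is concentrated in `/nexus-data/tmp`.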

mrprescott commented 1 year ago

Thanks for filing this. Is there anything particular about your workload or environment you could share? Which format(s) are you primarily using?

bom-d-van commented 1 year ago

Hi @mrprescott, our workload is quite simple. The issue seems to be easily reproducible by uploading apt artifacts (deb packages) using the official Docker image:

cdbear07 commented 1 year ago

Hey @mrprescott, I can replicate this error simply by uploading apt packages as well. This can be an annoying issue on file systems that limit how many files can be open at one time, like AWS EFS. The files can be closed by restarting Nexus, but this bug practically guarantees that a restart is needed every week.
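
Until the leak itself is fixed, a small watchdog can at least make the restart predictable. The sketch below counts a process's open descriptors the same way `ls /proc/<pid>/fd | wc -l` does and reports when the count crosses a chosen fraction of a limit. The names `count_open_fds` and `needs_restart`, the `headroom` default, and the `limit` value are all hypothetical; the real limit depends on your file system and has to be supplied by you. This is only a workaround and does not address the leak.

```python
import os


def count_open_fds(pid, proc_root="/proc"):
    """Count entries under <proc_root>/<pid>/fd, one per open descriptor."""
    return len(os.listdir(os.path.join(proc_root, str(pid), "fd")))


def needs_restart(pid, limit, headroom=0.8, proc_root="/proc"):
    """Return True once the descriptor count reaches `headroom` of `limit`.

    `limit` is whatever ceiling applies in your environment (an assumption
    you must fill in); `headroom` leaves room to restart before hitting it.
    """
    return count_open_fds(pid, proc_root) >= limit * headroom
```

Inside the container, where the Nexus JVM is PID 1 as in the `ps` output earlier in this thread, `needs_restart(1, limit=...)` could be polled from a cron job or a liveness check. The `proc_root` parameter exists only so the functions can be exercised against a fake directory tree.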

github-actions[bot] commented 4 months ago

This issue is stale because it has been open for 60 days with no activity.

bom-d-van commented 4 months ago

This issue needs some love from our maintainers. (Thanks for the open-sourcing effort nonetheless. I'm not a Java guy; otherwise it would be a rather interesting problem to fix.) :(

github-actions[bot] commented 2 months ago

This issue is stale because it has been open for 60 days with no activity.