dCache / dcache

dCache - a system for storing and retrieving huge amounts of data, distributed among a large number of heterogenous server nodes, under a single virtual filesystem tree with a variety of standard access methods
https://dcache.org

JAVA heap memory limits not respected after 9.2.6 upgrade #7507

Open cfgamboa opened 8 months ago

cfgamboa commented 8 months ago

Dear all,

A 13% increase in memory usage on the pools has been observed with dCache 9.2.6. This is seen on several pools across the system. Here is an example:

Screenshot 2024-02-05 at 10 03 14 AM

The pool is configured with the following values, both before and after the upgrade:

dcache.java.memory.heap=6144m
dcache.java.memory.direct=6144m

[${host.name}eightDomain]
[${host.name}eightDomain/pool]
pool.name=${host.name}_8
pool.tags=hostname=BCF
pool.path=/data/dcache_pool_8/pool
pool.wait-for-files=${pool.path}/data

When JMX is enabled to monitor the JVM, the following is observed:

Screenshot 2024-02-05 at 10 28 25 AM

Other OS-level tools also show that the actual memory usage is higher than it was before 9.2.6 (on 8.2.X).

It is not clear to me why the limits are not respected. Please note that there are no relevant changes on the OS, and the Java version is the same (Java 11).

All the best, Carlos

DmitryLitvintsev commented 8 months ago

Carlos,

Please check the domain log. On startup it prints the parameters passed to the JVM:

2023-05-02 19:49:03 Launching /usr/bin/java -server -Xmx512m -XX:MaxDirectMemorySize=512m ....

Do you see your setting 6144m passed there?
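
For example, something like this should show the most recent launch line (a minimal sketch, assuming the default log location under /var/log/dcache and substituting your pool domain's name; adjust to your layout):

grep 'Launching' /var/log/dcache/<domainName>.log | tail -1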

cfgamboa commented 8 months ago

Hello,

Indeed.

2024-02-05 10:28:48 Launching /usr/bin/java -server -Xmx6144m -XX:MaxDirectMemorySize=6144m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=20000,25000 -Dorg.dcache.dcap.port=0 -Dorg.dcache.ftp.log-aborted-transfers=true -Dorg.dcache.net.tcp.portrange=33115:33315 -Djava.security.krb5.realm= -Djava.security.krb5.kdc= -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Dcontent.types.user.table=/etc/dcache/content-types.properties -Dzookeeper.sasl.client=false -Dcurator-dont-log-connection-problems=true -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/dc235eightDomain-oom.hprof -XX:+ExitOnOutOfMemoryError -XX:+StartAttachListener -XX:+UseCompressedOops -javaagent:/usr/share/dcache/classes/aspectjweaver-1.9.9.1.jar -Djava.net.preferIPv6Addr

cfgamboa commented 7 months ago

FYI: for security reasons, the RT ticket [www.dcache.org #10590] "Re: JAVA heap memory limits not respected after 9.2.6 upgrade (GitHub issue #7507)" is now being used, as the dCache developers have requested Java-related dumps.

DmitryLitvintsev commented 7 months ago

Hi Carlos,

following up on this:

  1. When you say "JAVA heap memory limits not respected", do you mean that the JVM uses more memory than specified by -Xmx6144m -XX:MaxDirectMemorySize=6144m?
  2. Or that pool memory usage is up 10%?

Which is it, (1), (2), or both? If it is (1), then this must be a Java issue. If it is (2), this must be a consequence of creating transfer buffers on startup.
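
One way to tell (1) and (2) apart is to compare what the JVM itself has committed with what the OS reports for the process. A rough sketch with the stock JDK 11 tools (the native-memory breakdown only works if the domain was started with -XX:NativeMemoryTracking=summary added to its JVM options, which is not the case by default):

# heap size and usage as seen by the JVM
jcmd <pid> GC.heap_info

# breakdown into heap, metaspace, thread stacks, GC structures, code cache, etc.
# (requires -XX:NativeMemoryTracking=summary at startup)
jcmd <pid> VM.native_memory summary

# resident set size as seen by the OS, in KiB
ps -o rss= -p <pid>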

cfgamboa commented 7 months ago

Yes, it uses more than what is specified. See an example here for running pools:

Screenshot 2024-03-06 at 9 13 24 AM

See the process

root     271305  0.0  0.0 112808   976 pts/0    S+   09:16   0:00          \_ grep --color=auto 52204
root      52204 14.9  2.8 49247004 11100572 ?   Ssl  Feb09 5496:11 /usr/bin/java -server -Xmx6144m -XX:MaxDirectMemorySize=6144m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=20000,25000 -Dorg.dcache.dcap.port=0 -Dorg.dcache.ftp.log-aborted-transfers=true -Dorg.dcache.net.tcp.portrange=33115:33315 -Djava.security.krb5.realm= -Djava.security.krb5.kdc= -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Dcontent.types.user.table=/etc/dcache/content-types.properties -Dzookeeper.sasl.client=false -Dcurator-dont-log-connection-problems=true -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/dc241eightDomain-oom.hprof -XX:+ExitOnOutOfMemoryError -XX:+StartAttachListener -XX:+UseCompressedOops -javaagent:/usr/share/dcache/classes/aspectjweaver-1.9.9.1.jar -Djava.net.preferIPv6Addresses=system --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.text=ALL-UNNAMED --add-opens=java.sql/java.sql=ALL-UNNAMED --add-opens=java.base/java.math=ALL-UNNAMED --add-opens java.base/sun.nio.fs=ALL-UNNAMED -Djava.awt.headless=true -DwantLog4jSetup=n -Dorg.bouncycastle.dh.allow_unsafe_p_value=true -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start dc241eightDomain
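
For reference, the RES column above is in KiB, so this process sits at roughly 10.6 GiB resident (a quick conversion, using the PID from the listing above), i.e. well above the 6 GiB heap limit on its own, though still below the 12 GiB heap + direct sum:

ps -o rss= -p 52204 | awk '{printf "%.1f GiB\n", $1/1024/1024}'
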
kofemann commented 7 months ago

The issue might be addressed by commit 8f3b984f

cfgamboa commented 7 months ago

Hello,

Thank you. 9.2.14 was deployed on the pools of the integration instance:

Screenshot 2024-03-08 at 1 23 57 PM

So far it looks promising.

lemora commented 7 months ago

Hi Carlos. Interestingly, the potential fix mentioned by Tigran is not yet in 9.2.14; it still needs to be merged.

cfgamboa commented 7 months ago

Hi, thank you for clarifying.

For now, let us see how the memory usage evolves over time.

All the best, Carlos

cfgamboa commented 6 months ago

I took one of the recently decommissioned pool hosts and upgraded it from 9.2.6 to 9.2.15.

It looks better:

image

cfgamboa commented 5 months ago

Upgraded a production host with 15 pools to dCache 9.2.15:

image

kofemann commented 3 months ago

@cfgamboa Are there any updates on this? Do you still have this issue, and did any pools fail with out-of-memory exceptions?

cfgamboa commented 3 months ago

Hello,

No, the pools did not crash; out-of-memory errors have not been observed so far.

Nothing has changed with respect to what was already reported.

All the best, Carlos

DmitryLitvintsev commented 3 months ago

If the pools don't crash, why is this an issue?

cfgamboa commented 3 months ago

The issue is the large increase in memory usage after deploying 9.2. There is a 13% increase in memory use, as initially reported at the beginning of the ticket. Since then, in some cases we have observed less than 3% free memory available to the system.

While the pools are not crashing, the memory used by dCache increased substantially after the upgrade. This comment https://github.com/dCache/dcache/issues/7507#issuecomment-1980971444 is an example of the memory used on the server versus the memory limits enabled for the process.

All the best, Carlos

lemora commented 3 months ago

Hi Carlos.

Following up from the Tier 1, is it fair to say that you would be satisfied with a specific explanation from our side of why the required pool memory has increased? And that the reason for your concern is a matter of storage budget, meaning that you want us to acknowledge that we are aware of the change? Of course it would be preferable to everyone if the memory increase went away, but the situation is difficult to analyse and is unlikely to be resolved with a quick fix.

Kind regards, Lea

cfgamboa commented 3 months ago

Hi Lea,

If the cause of the increase in memory usage is not known yet and is under review by you, this is fair enough; at least I know you are aware of it. If this is expected, then of course we will need to take it into account for future requisitions.

Currently, the criticality is not high as the service is not disrupted by out-of-memory errors.

All the best, Carlos

vingar commented 3 months ago

With these parameters

dcache.java.memory.heap=6144m
dcache.java.memory.direct=6144m

The maximum memory usage for a Java process should be the sum of the two, I think:

6144 MB (heap) + 6144 MB (direct) = 12288 MB ≈ 12 GB

One of our data servers shows the following per pool process:

Domain                Memory usage
dc268_1-Domain        13.047431945800781 GB
dc268_10-Domain       3.1094741821289062 GB
dc268_11-Domain       1.9457817077636719 GB
dc268_12-Domain       3.3018569946289062 GB
dc268_2-Domain        12.515079498291016 GB
dc268_3-Domain        13.274280548095703 GB
dc268_4-Domain        12.742572784423828 GB
dc268_5-Domain        12.708694458007812 GB
dc268_6-Domain        12.807502746582031 GB
dc268_7-Domain        11.783977508544922 GB
dc268_8-Domain        12.699535369873047 GB
dc268_9-Domain        12.95401382446289 GB

In some cases, the memory usage exceeds the combined heap and direct memory settings. This is what Carlos reported, and it was not the case before the 9.* release series.

Memory usage matters for resource provisioning; more memory seems to be needed for the next purchase.
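
The per-domain numbers above can be cross-checked against the running processes; a small sketch, assuming the pool JVMs are started via org.dcache.boot.BootLoader as in the listings earlier in this thread:

for pid in $(pgrep -f 'org.dcache.boot.BootLoader start'); do
    domain=$(ps -o args= -p "$pid" | awk '{print $NF}')
    ps -o rss= -p "$pid" | awk -v d="$domain" '{printf "%-30s %6.1f GiB\n", d, $1/1024/1024}'
done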

DmitryLitvintsev commented 3 months ago

> With these parameters
>
> dcache.java.memory.heap=6144m
> dcache.java.memory.direct=6144m
>
> The maximum memory usage for a Java process should be the sum of the two, I think:
>
> 6144 MB (heap) + 6144 MB (direct) = 12288 MB ≈ 12 GB
>
> One of our data servers shows the following per pool process:
>
> Domain                Memory usage
> dc268_1-Domain        13.047431945800781 GB
> dc268_10-Domain       3.1094741821289062 GB
> dc268_11-Domain       1.9457817077636719 GB
> dc268_12-Domain       3.3018569946289062 GB
> dc268_2-Domain        12.515079498291016 GB
> dc268_3-Domain        13.274280548095703 GB
> dc268_4-Domain        12.742572784423828 GB
> dc268_5-Domain        12.708694458007812 GB
> dc268_6-Domain        12.807502746582031 GB
> dc268_7-Domain        11.783977508544922 GB
> dc268_8-Domain        12.699535369873047 GB
> dc268_9-Domain        12.95401382446289 GB
>
> In some cases, the memory usage exceeds the combined heap and direct memory settings. This is what Carlos reported, and it was not the case before the 9.* release series.
>
> Memory usage matters for resource provisioning; more memory seems to be needed for the next purchase.

This is more complicated, see: https://stackoverflow.com/questions/15282178/java-using-up-far-more-memory-than-allocated-with-xmx

You seem to be concerned about "what happens if the sum of memory used by all pools reaches 100%", and about how much memory you need to provision in the future.

Regarding the first concern: can you allocate memory to the pools so that the sum is exactly the physical node memory size, then run like that and monitor swap usage? Do you see any swap activity?
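
Swap activity is straightforward to watch while running like that (a minimal sketch with standard tools; the si/so columns, i.e. swap-in/swap-out, should stay at or near zero if the pools really fit in physical memory):

vmstat 5
# or, if the sysstat package is installed:
sar -W 5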

Second, for instance at Fermilab, the pool nodes we buy have more memory than we ever use:

[root@pubstor2301 ~]# free
               total        used        free      shared  buff/cache   available
Mem:       263150780    36593124     2243884       26120   228000980   226557656
Swap:        4194300       26880     4167420

And when the pool nodes are bought we get "whatever is similar to what we have, but current gen."

Just a real quick look:

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                           
1134414 root      20   0   39.3g   6.6g  13416 S 250.0   2.6  46586:59 java                                                                                              
1134551 root      20   0   26.0g   5.1g  15412 S 181.2   2.0  13654:32 java                                                                                              
1134287 root      20   0   39.0g   6.3g  17628 S  12.5   2.5 434:35.21 java                                                                                                                                                                             
1134348 root      20   0   18.6g   2.1g  18384 S   6.2   0.8 202:57.73 java                                                                                              

Each pool has:

dcache.java.memory.heap=4096m
dcache.java.memory.direct=4096m

Are you speaking about virtual memory or resident memory? As we can see above, RSS exceeds 4 GB by 50% in some cases. Does it create any problems? It does not look like it.

Memory has never been a concern on our installation. From time to time we run out of java.memory.heap due to the number of files, and we typically double java.memory.heap when that happens.

vingar commented 3 months ago

> This is more complicated, see: https://stackoverflow.com/questions/15282178/java-using-up-far-more-memory-than-allocated-with-xmx

Thanks

> You seem to be concerned about "what happens if the sum of memory used by all pools reaches 100%", and about how much memory you need to provision in the future.

Yes, the increase of memory with the latest release triggered some discussion about the configuration and the specification of the pool server.

> Regarding the first concern: can you allocate memory to the pools so that the sum is exactly the physical node memory size, then run like that and monitor swap usage? Do you see any swap activity?

> Second, for instance at Fermilab, the pool nodes we buy have more memory than we ever use:
>
> [root@pubstor2301 ~]# free
>                total        used        free      shared  buff/cache   available
> Mem:       263150780    36593124     2243884       26120   228000980   226557656
> Swap:        4194300       26880     4167420

For the mentioned host, I have:

[root@dc268 ~]# free
              total        used        free      shared  buff/cache   available
Mem:      263009792   245740016    10360080     3972192     6909696    10684272
Swap:       8388604     8019584      369020

> And when the pool nodes are bought we get "whatever is similar to what we have, but current gen."

> Just a real quick look:
>
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
> 1134414 root      20   0   39.3g   6.6g  13416 S 250.0   2.6  46586:59 java
> 1134551 root      20   0   26.0g   5.1g  15412 S 181.2   2.0  13654:32 java
> 1134287 root      20   0   39.0g   6.3g  17628 S  12.5   2.5 434:35.21 java
> 1134348 root      20   0   18.6g   2.1g  18384 S   6.2   0.8 202:57.73 java
>
> Each pool has:
>
> dcache.java.memory.heap=4096m
> dcache.java.memory.direct=4096m

> Are you speaking about virtual memory or resident memory? As we can see above, RSS exceeds 4 GB by 50% in some cases. Does it create any problems? It does not look like it.

Correct. The pools run without errors

> Memory has never been a concern on our installation. From time to time we run out of java.memory.heap due to the number of files, and we typically double java.memory.heap when that happens.

I don't think we have observed this

DmitryLitvintsev commented 3 months ago

Also!

The numbers I quoted were for our Public system, which runs 8.2! So even on 8.2, resident memory > sum(heap + direct).

On CMS, which runs 9.2.11 (plus some patches; I sometimes do not wait for official releases):

[root@cmsstor925 ~]# free
               total        used        free      shared  buff/cache   available
Mem:       195994608    44024360     1993640        5008   152738004   151970248
Swap:        4194300           0     4194300
DOMAIN                        STATUS                PID   USER LOG                                               
w-cmsstor925-disk-disk1Domain running (for 4 weeks) 25630 root /var/log/dcache/w-cmsstor925-disk-disk1Domain.log 
w-cmsstor925-disk-disk2Domain running (for 4 weeks) 25640 root /var/log/dcache/w-cmsstor925-disk-disk2Domain.log 
w-cmsstor925-disk-disk3Domain running (for 4 weeks) 25651 root /var/log/dcache/w-cmsstor925-disk-disk3Domain.log 
w-cmsstor925-disk-disk4Domain running (for 4 weeks) 25683 root /var/log/dcache/w-cmsstor925-disk-disk4Domain.log 
[root@cmsstor925 ~]#  ps -o pid,vsz,rss,size -p 25630,25640,25651,25683
  PID    VSZ   RSS  SIZE
25630 28873688 9177608 10473540
25640 28537760 8512420 9990132
25651 28563492 7310116 8837064
25683 28612632 8741532 10224020

dcache.java.memory.heap=5096m
dcache.java.memory.direct=5096m

All nodes (the ones on Public in the previous message) and this one run Alma9. So, as you can see, 8.2 and 9.2 exhibit the same pattern. Is it a problem? I do not think so.

P.S.: what does strike me as curious: CMS shows 0 swap activity, as it should. Why Public shows >0 swap... I have no idea. It should not be swapping at all.

cfgamboa commented 3 months ago

@DmitryLitvintsev thank you for your feedback. What Java version runs there?

DmitryLitvintsev commented 3 months ago

@vingar

You quoted:

> [root@dc268 ~]# free
>               total        used        free      shared  buff/cache   available
> Mem:      263009792   245740016    10360080     3972192     6909696    10684272
> Swap:       8388604     8019584      369020

This actually does not look right to me. 234 GB are used out of 256 GB (in our case only 34 GB is used out of 250 GB). The sum of the memory sizes of the pool processes is 122.89 GB. Where did the other 110 GB go?

In our case, 34 GB is roughly the sum of the java process resident sizes.
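
To see where the resident memory of a single domain actually sits, pmap can help; a sketch (the third column of pmap -x is RSS in KiB, so the largest anonymous mappings are typically the Java heap and the direct/NIO buffers):

# largest mappings by resident size for one pool JVM
pmap -x <pid> | sort -k3 -n | tail -15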

DmitryLitvintsev commented 3 months ago

> @DmitryLitvintsev thank you for your feedback. What Java version runs there?

[root@cmsstor925 ~]# java -version 
openjdk version "11.0.23" 2024-04-16 LTS
OpenJDK Runtime Environment (Red_Hat-11.0.23.0.9-2) (build 11.0.23+9-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-11.0.23.0.9-2) (build 11.0.23+9-LTS, mixed mode, sharing)
[root@cmsstor925 ~]# 

and in Public:

[root@pubstor2301 ~]# java -version
openjdk version "11.0.22" 2024-01-16 LTS
OpenJDK Runtime Environment (Red_Hat-11.0.22.0.7-1) (build 11.0.22+7-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-11.0.22.0.7-1) (build 11.0.22+7-LTS, mixed mode, sharing)
[root@pubstor2301 ~]#

cfgamboa commented 3 months ago

This image shows a server with dCache running and then stopped.

image

cfgamboa commented 3 months ago

Our Java version is the same:

[root@dc264 ~]# java -version
openjdk version "11.0.22" 2024-01-16 LTS
OpenJDK Runtime Environment (Red_Hat-11.0.22.0.7-1) (build 11.0.22+7-LTS)
OpenJDK 64-Bit Server VM (Red_Hat-11.0.22.0.7-1) (build 11.0.22+7-LTS, mixed mode, sharing)

cfgamboa commented 3 months ago

It seems that after the upgrade the system is using the full dcache.java.memory.heap=6144m and dcache.java.memory.direct=6144m,

which is reported as an aggregated 12.2 GB.

I looked into other, less loaded systems running 9.2: with dcache.java.memory.heap=2048m and dcache.java.memory.direct=2048m, the RES is 2.7g:

root      8782  1.2  0.5 16994924 2846512 ?    Ssl  Jun02 435:30 /usr/bin/java -server -Xmx2048m -XX:MaxDirectMemorySize=2048m -Dsun.net.inetaddr.ttl=1800 -Dorg.globus.tcp.port.range=20000,50000 -Dorg.dcache.dcap.port=0 -Dorg.dcache.ftp.log-aborted-transfers=true -Dorg.dcache.net.tcp.portrange=33115:33315 -Djava.security.krb5.realm= -Djava.security.krb5.kdc= -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/etc/dcache/jgss.conf -Dcontent.types.user.table=/etc/dcache/content-types.properties -Dzookeeper.sasl.client=false -Dcurator-dont-log-connection-problems=true -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/log/dcache/dcqos002nineDomain-oom.hprof -XX:+ExitOnOutOfMemoryError -XX:+StartAttachListener -XX:+UseCompressedOops -javaagent:/usr/share/dcache/classes/aspectjweaver-1.9.9.1.jar -Djava.net.preferIPv6Addresses=system --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.text=ALL-UNNAMED --add-opens=java.sql/java.sql=ALL-UNNAMED --add-opens=java.base/java.math=ALL-UNNAMED --add-opens java.base/sun.nio.fs=ALL-UNNAMED -Djava.awt.headless=true -DwantLog4jSetup=n -Dorg.bouncycastle.dh.allow_unsafe_p_value=true -Ddcache.home=/usr/share/dcache -Ddcache.paths.defaults=/usr/share/dcache/defaults org.dcache.boot.BootLoader start dcqos002nineDomain
[root@dcqos002 ~]# top -p 8782

top - 20:33:22 up 568 days,  5:53, 23 users,  load average: 0.39, 0.33, 0.46
Tasks:   1 total,   0 running,   1 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.4 us,  0.2 sy,  0.0 ni, 99.4 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 52808425+total,  1738380 free, 76338176 used, 45000768+buff/cache
KiB Swap:  8388604 total,  7877304 free,   511300 used. 45033398+avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                                                                 
 8782 root      20   0   16.2g   2.7g  23628 S   0.7  0.5 435:31.31 java                                                                                                                    

Even though the allowed memory is 2 GB for heap and 2 GB for direct, the system does not seem to use it all.

In the example reported here the system is using the full 12.2 GB; what is not clear is why, before the upgrade, it did not have to use it all.
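
That behaviour is consistent with the JVM committing heap lazily: it reserves up to -Xmx but only grows the committed heap as load and GC pressure demand it (unless -Xms or -XX:+AlwaysPreTouch force it up front), so a lightly loaded pool can stay well below the ceiling while a busy one eventually reaches it. A quick sketch to check how far a given domain's heap has actually grown, using the stock JDK tools (values in KiB):

# current committed young/old generation capacities (NGC, OGC) vs. their maxima (NGCMX, OGCMX)
jstat -gccapacity <pid>

# heap occupancy as percentages, sampled every 5 seconds, 3 samples
jstat -gcutil <pid> 5000 3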

vingar commented 3 months ago

> @vingar
>
> You quoted:
>
> [root@dc268 ~]# free
>               total        used        free      shared  buff/cache   available
> Mem:      263009792   245740016    10360080     3972192     6909696    10684272
> Swap:       8388604     8019584      369020
>
> This actually does not look right to me. 234 GB are used out of 256 GB (in our case only 34 GB is used out of 250 GB). The sum of the memory sizes of the pool processes is 122.89 GB. Where did the other 110 GB go?
>
> In our case, 34 GB is roughly the sum of the java process resident sizes.

@DmitryLitvintsev you got a good point there. Where does the rest of the memory go...

# ps -eo comm,rss | awk '{array[$1]+=$2} END {for (i in array) print i, array[i]/1024 " MB"}' | sort -k2 -nr
java 125745 MB
pmdaproc 461.539 MB
tuned 87.9258 MB
systemd-journal 73.4453 MB
pmdalinux 70.8945 MB
..

I'll follow up at BNL. Thanks for spotting this