neo4j / neo4j

Graphs for Everyone
http://neo4j.com
GNU General Public License v3.0

ERROR: Unable to create the log scanner for CDC #13493

Open zirkelc opened 1 month ago

zirkelc commented 1 month ago

I'm running Neo4j v5.17.0 via Docker on Ubuntu 22.04.4 LTS.

I'm using the CDC feature and found the following errors in the debug logs:

2024-07-24 00:19:01.643+0000 ERROR [c.n.c.CDCService] [fashionbeauty19qnngn/0a214b04] Unable to create the log scanner for CDC

This error is preceded by lots of warnings like these:

2024-07-23 13:23:37.530+0000 WARN  [o.n.k.i.t.l.f.c.CheckpointFileChannelNativeAccessor] [fashionbeauty19qnngn/0a214b04] Error on attempt to preallocate log file version: 0. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-07-23 13:23:37.540+0000 WARN  [o.n.k.i.t.l.f.c.CheckpointFileChannelNativeAccessor] [fashionbeauty19qnngn/0a214b04] Unable to advise sequential access for transaction log version: 0. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-07-23 13:23:37.540+0000 WARN  [o.n.k.i.t.l.f.c.CheckpointFileChannelNativeAccessor] [fashionbeauty19qnngn/0a214b04] Unable to advise preserve data in cache for transaction log version: 0. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-07-23 13:23:37.541+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [fashionbeauty19qnngn/0a214b04] Unable to advise sequential access for transaction log version: 0. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-07-23 13:23:37.541+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [fashionbeauty19qnngn/0a214b04] Unable to advise preserve data in cache for transaction log version: 0. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-07-23 13:23:37.541+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [fashionbeauty19qnngn/0a214b04] Unable to evict transaction log from cache with version: 0. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
...
2024-07-23 15:59:08.013+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [fashionbeauty19qnngn/0a214b04] Error on attempt to preallocate log file version: 11. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-07-23 15:59:08.030+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [fashionbeauty19qnngn/0a214b04] Unable to evict transaction log from cache with version: 10. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'

I'm not sure whether the warnings are related to the CDC issue, but it looks like something broke the CDC service, maybe because the transaction logs could not be created.

Here is the full debug log filter for the database fashionbeauty19qnngn as gist, because it's too big for this issue: https://gist.github.com/zirkelc/8e74a6830e37d802778a2e41f22184d7
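To see which components emit the warnings and how often, the quoted lines can be tallied by level and logger. The following is a minimal sketch, assuming every line has the shape shown in the excerpts above (timestamp, level, logger in brackets, `database/id` in brackets, message); the regular expression and the helper names `parse_line` and `tally` are illustrative and not part of Neo4j.

```python
import re
from collections import Counter

# Assumed line shape, taken from the excerpts quoted in this issue:
# "<date> <time+zone> <LEVEL> [<logger>] [<database>/<id>] <message>"
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}[+-]\d{4})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\[(?P<db>[^/\]]+)/(?P<db_id>[^\]]+)\]\s+"
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def tally(lines):
    """Count matching lines per (level, logger) pair; non-matching lines are skipped."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record:
            counts[(record["level"], record["logger"])] += 1
    return counts


# Three lines copied from the report above.
SAMPLE = [
    "2024-07-24 00:19:01.643+0000 ERROR [c.n.c.CDCService] [fashionbeauty19qnngn/0a214b04] Unable to create the log scanner for CDC",
    "2024-07-23 13:23:37.530+0000 WARN  [o.n.k.i.t.l.f.c.CheckpointFileChannelNativeAccessor] [fashionbeauty19qnngn/0a214b04] Error on attempt to preallocate log file version: 0. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'",
    "2024-07-23 13:23:37.541+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [fashionbeauty19qnngn/0a214b04] Unable to evict transaction log from cache with version: 0. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'",
]

if __name__ == "__main__":
    for (level, logger), count in tally(SAMPLE).items():
        print(level, logger, count)
```

The same functions can be pointed at the full gist by passing an open file object to `tally` instead of `SAMPLE`.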

neo-ionut commented 1 month ago

Hi @zirkelc, thank you for reporting this. In order to help us understand what is happening, can you provide us with some additional information:

From the logs, it seems we are failing to preallocate a transaction log for the database because we cannot fetch the fileDescriptor. There can be multiple reasons for this, most of them OS/File System/JVM related. Hopefully the answers to the previous questions will help us pinpoint where this is coming from.

zirkelc commented 1 month ago

Hi @neo-ionut

What version of JDK are you using?

openjdk 17.0.10 2024-01-16
OpenJDK Runtime Environment Temurin-17.0.10+7 (build 17.0.10+7)
OpenJDK 64-Bit Server VM Temurin-17.0.10+7 (build 17.0.10+7, mixed mode, sharing)

Do you pass any custom arguments to the jvm when starting Neo4j?

No, I start Neo4j via Docker.

Can you share your config file used? If you cannot, can you share the part of the config that starts with server.jvm.additional=

neo4j.conf

server.jvm.additional=-XX:+ExitOnOutOfMemoryError
server.jvm.additional=--add-opens=java.base/java.nio=ALL-UNNAMED
server.jvm.additional=--add-opens=java.base/java.io=ALL-UNNAMED
server.jvm.additional=--add-opens=java.base/sun.nio.ch=ALL-UNNAMED
server.memory.heap.initial_size=3500m
server.memory.heap.max_size=3500m
server.memory.pagecache.size=1800m
db.tx_log.rotation.retention_policy=3 days 5G

Lastly, do you notice the warning Error on attempt to preallocate log file version for other databases?

Yes, it happens for other databases, but not consistently. Here are the logs for the last 7 days:

root@nebula-neo4j:/neo4j/data# docker exec --interactive --tty neo4j awk -v start="$(date --date='7 days ago' +'%Y-%m-%d %H:%M:%S')" -v end="$(date +'%Y-%m-%d %H:%M:%S')" '$0 >= start && $0 <= end' /logs/debug.log | grep -i 'Error on attempt to preallocate log file version'
2024-07-30 00:17:01.504+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [pullupdipqkfh4/9b319e20] Error on attempt to preallocate log file version: 18. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-07-30 08:27:14.934+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [veynoubmf6z/68073522] Error on attempt to preallocate log file version: 0. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-07-30 08:27:14.946+0000 WARN  [o.n.k.i.t.l.f.c.CheckpointFileChannelNativeAccessor] [veynoubmf6z/68073522] Error on attempt to preallocate log file version: 0. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-07-31 00:13:19.396+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [treazyxvjjp/4ab16d5b] Error on attempt to preallocate log file version: 1. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-07-31 00:14:17.814+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [doaba/7d4fa424] Error on attempt to preallocate log file version: 4. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-07-31 00:14:35.694+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [rc4lh/5579951b] Error on attempt to preallocate log file version: 9. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-08-02 07:41:06.442+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [pullupdipbgrti/ac159bbf] Error on attempt to preallocate log file version: 0. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-08-02 07:41:06.455+0000 WARN  [o.n.k.i.t.l.f.c.CheckpointFileChannelNativeAccessor] [pullupdipbgrti/ac159bbf] Error on attempt to preallocate log file version: 0. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-08-02 07:53:51.834+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [pullupdipbgrti/ac159bbf] Error on attempt to preallocate log file version: 1. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-08-02 07:57:40.729+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [pullupdipbgrti/ac159bbf] Error on attempt to preallocate log file version: 2. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-08-02 08:00:07.780+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [pullupdipbgrti/ac159bbf] Error on attempt to preallocate log file version: 3. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
2024-08-05 00:11:15.713+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [treazyxvjjp/4ab16d5b] Error on attempt to preallocate log file version: 2. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'
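The awk and grep pipeline above can also be expressed as a short script that groups the matches by database, which makes the "not consistently" pattern easier to see. This is a rough sketch, not a replacement for the command: the function name `count_by_database` is hypothetical, it compares only the first 19 characters of each line against the window (so it is slightly more inclusive than the awk string comparison at the end boundary), and it assumes the database name is the text between the second `[` and the following `/`.

```python
from collections import Counter
from datetime import datetime

NEEDLE = "error on attempt to preallocate log file version"
STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def count_by_database(lines, start, end):
    """Count preallocation warnings per database with a timestamp in [start, end]."""
    lo, hi = start.strftime(STAMP_FORMAT), end.strftime(STAMP_FORMAT)
    counts = Counter()
    for line in lines:
        stamp = line[:19]  # "YYYY-MM-DD HH:MM:SS"; sorts correctly as a plain string
        if not (lo <= stamp <= hi):
            continue
        if NEEDLE not in line.lower():  # case-insensitive, like grep -i
            continue
        parts = line.split("[")
        if len(parts) < 3:
            continue
        counts[parts[2].split("/")[0]] += 1
    return counts


# Lines copied from this issue; the first one falls outside the window used below.
SAMPLE = [
    "2024-07-23 15:59:08.013+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [fashionbeauty19qnngn/0a214b04] Error on attempt to preallocate log file version: 11. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'",
    "2024-07-30 00:17:01.504+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [pullupdipqkfh4/9b319e20] Error on attempt to preallocate log file version: 18. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'",
    "2024-08-02 07:41:06.442+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [pullupdipbgrti/ac159bbf] Error on attempt to preallocate log file version: 0. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'",
    "2024-08-02 07:53:51.834+0000 WARN  [o.n.k.i.t.l.f.LogFileChannelNativeAccessor] [pullupdipbgrti/ac159bbf] Error on attempt to preallocate log file version: 1. Error: ErrorCode=-1, errorMessage='Incorrect file descriptor.'",
]

if __name__ == "__main__":
    # The window is an assumed example; adjust it to the period of interest.
    window_start = datetime(2024, 7, 29, 9, 51, 0)
    window_end = datetime(2024, 8, 5, 9, 51, 0)
    for db, count in count_by_database(SAMPLE, window_start, window_end).most_common():
        print(db, count)
```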

After reading this article from Neo4j, I added these three flags:

server.jvm.additional=--add-opens=java.base/java.nio=ALL-UNNAMED
server.jvm.additional=--add-opens=java.base/java.io=ALL-UNNAMED
server.jvm.additional=--add-opens=java.base/sun.nio.ch=ALL-UNNAMED

However, it doesn't look like this helps with the warnings and errors.

When I looked through the /neo4j/data/transactions/* files, I noticed that there are many older neostore.transaction.db.* files. For example, for the database fashionbeauty19qnngn there are files from

transactions/fashionbeauty19qnngn:
total 458892
drwxr-xr-x  2 7474 7474      4096 Jul 27 00:09 .
drwxr-xr-x 25 7474 7474      4096 Aug  5 09:51 ..
-rw-r--r--  1 7474 7474      5928 Aug  5 06:56 checkpoint.0
-rw-r--r--  1 7474 7474 268646770 Jul 23 16:23 neostore.transaction.db.13
-rw-r--r--  1 7474 7474 201228573 Jul 28 00:16 neostore.transaction.db.14

If I'm not mistaken, shouldn't the transaction log files be deleted according to db.tx_log.rotation.retention_policy=3 days 5G?
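A quick way to list the files that look overdue is to compare their modification times with the 3-day window. The sketch below is only a rough approximation and does not reproduce Neo4j's pruning logic: it assumes file modification time is a fair proxy for the age of a log, it never flags the highest-numbered log file, and it ignores the 5G part of the policy. The function names and the year 2024 in the demo are assumptions; the file names and dates come from the listing above.

```python
import os
import tempfile
from datetime import datetime, timedelta
from pathlib import Path

PREFIX = "neostore.transaction.db."


def stale_log_files(directory, max_age, now=None):
    """Return names of transaction log files older than max_age, excluding the newest version."""
    cutoff = (now or datetime.now()) - max_age
    logs = []
    for path in Path(directory).iterdir():
        suffix = path.name[len(PREFIX):]
        if path.name.startswith(PREFIX) and suffix.isdigit():
            logs.append((int(suffix), path))
    logs.sort()
    stale = []
    for _version, path in logs[:-1]:  # the highest version is the active log, skip it
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        if modified < cutoff:
            stale.append(path.name)
    return stale


def demo():
    """Recreate the listing from this issue in a temporary directory (year assumed to be 2024)."""
    listing = [
        ("checkpoint.0", datetime(2024, 8, 5, 6, 56)),
        ("neostore.transaction.db.13", datetime(2024, 7, 23, 16, 23)),
        ("neostore.transaction.db.14", datetime(2024, 7, 28, 0, 16)),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        for name, when in listing:
            path = Path(tmp) / name
            path.touch()
            os.utime(path, (when.timestamp(), when.timestamp()))
        return stale_log_files(tmp, timedelta(days=3), now=datetime(2024, 8, 5, 9, 51))


if __name__ == "__main__":
    print(demo())
```

On a real host the function would be called with the database's directory under the transactions root, for example `stale_log_files("/neo4j/data/transactions/fashionbeauty19qnngn", timedelta(days=3))`.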

neo-ionut commented 1 month ago

This is indeed quite strange. From the looks of it, we are having issues preallocating the log files, but it seems this does not happen every time, only in a few cases? My suspicion was that we can't use sun.nio, but the config looks OK. Can you check the beginning of the debug log to see whether the settings actually get passed correctly? You should be able to find all the server.jvm.additional entries there.

Additionally, do you think it would be possible to enable some additional debug information? We have a flag that enables extra prints for reflection issues. All you need to do is add the following to the config and restart, and then check whether exceptions are being printed in the log:

server.jvm.additional=-Dorg.neo4j.io.pagecache.impl.SingleFilePageSwapper.printReflectionExceptions=true

I am also a bit unsure what is happening with the deletion. The policy should indeed have deleted the older file. Can you confirm that a checkpoint was triggered after the second file was created? (Log files are pruned on checkpointing.)

Last thing - I am not sure how you mount volumes in Docker, but maybe there is something around how Docker mounts disks or some permissions that affects this? (a bit of a shot in the dark)

zirkelc commented 1 month ago

I added server.jvm.additional=-Dorg.neo4j.io.pagecache.impl.SingleFilePageSwapper.printReflectionExceptions=true to the neo4j.conf and restarted it.

Here are the diagnostics from the beginning of debug.log:

```
********************************************************************************
[ System diagnostics ]
********************************************************************************
--------------------------------------------------------------------------------
[ System memory information ]
--------------------------------------------------------------------------------
Total Physical memory: 15.24GiB
Free Physical memory: 5.743GiB
Committed virtual memory: 9.517GiB
Total swap space: 0B
Free swap space: 0B
--------------------------------------------------------------------------------
[ JVM memory information ]
--------------------------------------------------------------------------------
Free memory: 2.608GiB
Total memory: 3.418GiB
Max memory: 3.418GiB
Garbage Collector: G1 Young Generation: [G1 Eden Space, G1 Survivor Space, G1 Old Gen]
Garbage Collector: G1 Old Generation: [G1 Eden Space, G1 Survivor Space, G1 Old Gen]
Memory Pool: CodeHeap 'non-nmethods' (Non-heap memory): committed=2.438MiB, used=1.892MiB, max=5.570MiB, threshold=0B
Memory Pool: Metaspace (Non-heap memory): committed=166.3MiB, used=163.9MiB, max=-1B, threshold=0B
Memory Pool: CodeHeap 'profiled nmethods' (Non-heap memory): committed=17.94MiB, used=17.91MiB, max=117.2MiB, threshold=0B
Memory Pool: Compressed Class Space (Non-heap memory): committed=23.31MiB, used=22.31MiB, max=1.000GiB, threshold=0B
Memory Pool: G1 Eden Space (Heap memory): committed=2.152GiB, used=752.0MiB, max=-1B, threshold=?
Memory Pool: G1 Old Gen (Heap memory): committed=1.264GiB, used=71.97MiB, max=3.418GiB, threshold=0B
Memory Pool: G1 Survivor Space (Heap memory): committed=2.000MiB, used=1.629MiB, max=-1B, threshold=?
Memory Pool: CodeHeap 'non-profiled nmethods' (Non-heap memory): committed=5.875MiB, used=5.846MiB, max=117.2MiB, threshold=0B
--------------------------------------------------------------------------------
[ Operating system information ]
--------------------------------------------------------------------------------
Operating System: Linux; version: 5.15.0-117-generic; arch: amd64; cpus: 8
Max number of file descriptors: 1048576
Number of open file descriptors: 1496
Process id: 7
Byte order: LITTLE_ENDIAN
Local timezone: Etc/UTC
Memory page size: 4096
Unaligned memory access allowed: true
--------------------------------------------------------------------------------
[ JVM information ]
--------------------------------------------------------------------------------
VM Name: OpenJDK 64-Bit Server VM
VM Vendor: Eclipse Adoptium
VM Version: 17.0.10+7
JIT compiler: HotSpot 64-Bit Tiered Compilers
VM Arguments: [-XX:+ExitOnOutOfMemoryError, --add-opens=java.base/java.nio=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, -Dorg.neo4j.io.pagecache.impl.SingleFilePageSwapper.printReflectionExceptions=true, -Dfile.encoding=UTF-8, -Xms3584000k, -Xmx3584000k]
--------------------------------------------------------------------------------
[ Java classpath ]
--------------------------------------------------------------------------------
[classpath] /var/lib/neo4j/lib/neo4j-causal-clustering-5.17.0.jar
[classpath] /var/lib/neo4j/lib/asm-util-9.6.jar
[classpath] /var/lib/neo4j/lib/commons-compress-1.25.0.jar
[classpath] /var/lib/neo4j/lib/shiro-cache-1.13.0.jar
[classpath] /var/lib/neo4j/lib/jetty-security-10.0.17.jar
[classpath] /var/lib/neo4j/lib/netty-tcnative-classes-2.0.61.Final.jar
[classpath] /var/lib/neo4j/lib/arrow-memory-core-14.0.2.jar
[classpath] /var/lib/neo4j/lib/neo4j-csv-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jaxb-runtime-2.3.2.jar
[classpath] /var/lib/neo4j/lib/guava-33.0.0-jre.jar
[classpath] /var/lib/neo4j/lib/neo4j-raft-common-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-logging-5.17.0.jar
[classpath] /var/lib/neo4j/lib/checksums-spi-2.22.11.jar
[classpath] /var/lib/neo4j/lib/netty-tcnative-boringssl-static-2.0.61.Final-windows-x86_64.jar
[classpath] /var/lib/neo4j/lib/asm-tree-9.6.jar
[classpath] /var/lib/neo4j/lib/neo4j-bootcheck-5.17.0.jar
[classpath] /var/lib/neo4j/lib/netty-handler-proxy-4.1.101.Final.jar
[classpath] /var/lib/neo4j/lib/jersey-container-servlet-2.34.jar
[classpath] /var/lib/neo4j/lib/caffeine-3.1.8.jar
[classpath] /var/lib/neo4j/lib/magnolia_2.13-1.1.8.jar
[classpath] /var/lib/neo4j/lib/scala-library-2.13.11.jar
[classpath] /var/lib/neo4j/lib/shiro-core-1.13.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-id-generator-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jetty-alpn-client-10.0.17.jar
[classpath] /var/lib/neo4j/lib/reactor-core-3.6.1.jar
[classpath] /var/lib/neo4j/lib/akka-pki_2.13-2.6.19.jar
[classpath] /var/lib/neo4j/lib/neo4j-notifications-5.17.0.jar
[classpath] /var/lib/neo4j/conf/*
[classpath] /var/lib/neo4j/lib/crt-core-2.22.11.jar
[classpath] /var/lib/neo4j/lib/WMI4Java-1.6.3.jar
[classpath] /var/lib/neo4j/lib/neo4j-protocol-consensus-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-block-storage-engine-5.17.0.jar
[classpath] /plugins/aws-java-sdk-s3-1.12.136.jar
[classpath] /var/lib/neo4j/lib/netty-transport-native-unix-common-4.1.101.Final.jar
[classpath] /var/lib/neo4j/lib/jakarta.xml.bind-api-2.3.2.jar
[classpath] /var/lib/neo4j/lib/jProcesses-1.6.5.jar
[classpath] /var/lib/neo4j/lib/netty-tcnative-boringssl-static-2.0.61.Final.jar
[classpath] /var/lib/neo4j/lib/hk2-locator-2.6.1.jar
[classpath] /var/lib/neo4j/lib/scala-parser-combinators_2.13-1.1.2.jar
[classpath] /var/lib/neo4j/lib/zstd-proxy-5.17.0.jar
[classpath] /var/lib/neo4j/lib/simpleclient_tracer_common-0.16.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-dsl-2023.9.1.jar
[classpath] /var/lib/neo4j/lib/aws-query-protocol-2.22.11.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-javacc-parser-5.17.0.jar
[classpath] /var/lib/neo4j/lib/simpleclient-0.16.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-dbms-api-5.17.0.jar
[classpath] /var/lib/neo4j/lib/eclipse-collections-api-11.1.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-server-enterprise-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-graphdb-api-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jersey-container-servlet-core-2.34.jar
[classpath] /var/lib/neo4j/lib/profiles-2.22.11.jar
[classpath] /var/lib/neo4j/lib/neo4j-monitoring-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jansi-2.4.0.jar
[classpath] /var/lib/neo4j/lib/crdt-library-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jetty-http-10.0.17.jar
[classpath] /var/lib/neo4j/lib/third-party-jackson-core-2.22.11.jar
[classpath] /var/lib/neo4j/lib/neo4j-enterprise-query-router-5.17.0.jar
[classpath] /var/lib/neo4j/lib/netty-transport-classes-epoll-4.1.101.Final.jar
[classpath] /var/lib/neo4j/lib/http-auth-spi-2.22.11.jar
[classpath] /var/lib/neo4j/lib/ipaddress-5.4.0.jar
[classpath] /var/lib/neo4j/lib/shiro-config-core-1.13.0.jar
[classpath] /var/lib/neo4j/lib/jline-terminal-3.21.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-command-line-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-discovery-lighthouse-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-runtime-util-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-cache-5.17.0.jar
[classpath] /var/lib/neo4j/lib/lz4-java-1.8.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-cloud-5.17.0.jar
[classpath] /var/lib/neo4j/lib/asm-analysis-9.6.jar
[classpath] /var/lib/neo4j/lib/commons-logging-1.3.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-rendering-5.17.0.jar
[classpath] /var/lib/neo4j/lib/httpcore-4.4.13.jar
[classpath] /var/lib/neo4j/lib/config-1.4.2.jar
[classpath] /var/lib/neo4j/lib/netty-transport-native-kqueue-4.1.101.Final-osx-x86_64.jar
[classpath] /var/lib/neo4j/lib/jakarta.ws.rs-api-2.1.6.jar
[classpath] /var/lib/neo4j/lib/kiama_2.13-2.5.1.jar
[classpath] /var/lib/neo4j/lib/neo4j-configuration-5.17.0.jar
[classpath] /var/lib/neo4j/lib/grpc-protobuf-lite-1.60.1.jar
[classpath] /var/lib/neo4j/lib/flight-core-14.0.2.jar
[classpath] /var/lib/neo4j/lib/neo4j-kernel-api-5.17.0.jar
[classpath] /var/lib/neo4j/lib/endpoints-spi-2.22.11.jar
[classpath] /var/lib/neo4j/lib/zstd-jni-1.5.5-11.jar
[classpath] /plugins/joda-time-2.10.13.jar
[classpath] /var/lib/neo4j/lib/netty-handler-4.1.101.Final.jar
[classpath] /var/lib/neo4j/lib/istack-commons-runtime-3.0.8.jar
[classpath] /var/lib/neo4j/lib/neo4j-lock-5.17.0.jar
[classpath] /var/lib/neo4j/lib/metrics-jmx-4.2.23.jar
[classpath] /var/lib/neo4j/lib/jetty-servlet-api-4.0.6.jar
[classpath] /var/lib/neo4j/lib/eclipse-collections-11.1.0.jar
[classpath] /var/lib/neo4j/lib/netty-transport-native-kqueue-4.1.101.Final-osx-aarch_64.jar
[classpath] /var/lib/neo4j/lib/neo4j-java-driver-5.17.0.jar
[classpath] /plugins/httpcore-4.4.15.jar
[classpath] /var/lib/neo4j/lib/proto-google-common-protos-2.22.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-util-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-logical-plans-5.17.0.jar
[classpath] /var/lib/neo4j/lib/netty-codec-4.1.101.Final.jar
[classpath] /var/lib/neo4j/lib/json-utils-2.22.11.jar
[classpath] /var/lib/neo4j/lib/asn-one-0.5.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-native-5.17.0.jar
[classpath] /var/lib/neo4j/lib/grpc-netty-1.60.1.jar
[classpath] /var/lib/neo4j/lib/neo4j-browser-5.15.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-schema-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-collections-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-query-router-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-token-api-5.17.0.jar
[classpath] /var/lib/neo4j/lib/gson-2.10.1.jar
[classpath] /var/lib/neo4j/lib/grpc-context-1.60.1.jar
[classpath] /var/lib/neo4j/lib/jakarta.validation-api-2.0.2.jar
[classpath] /var/lib/neo4j/lib/neo4j-bolt-5.17.0.jar
[classpath] /var/lib/neo4j/lib/netty-tcnative-boringssl-static-2.0.61.Final-osx-x86_64.jar
[classpath] /var/lib/neo4j/lib/neo4j-wal-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-storage-engine-util-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jackson-dataformat-cbor-2.16.1.jar
[classpath] /var/lib/neo4j/lib/http2-common-10.0.17.jar
[classpath] /var/lib/neo4j/lib/hk2-utils-2.6.1.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-ast-factory-5.17.0.jar
[classpath] /var/lib/neo4j/lib/stax-ex-1.8.1.jar
[classpath] /var/lib/neo4j/lib/jackson-core-2.16.1.jar
[classpath] /var/lib/neo4j/lib/neo4j-enterprise-kernel-5.17.0.jar
[classpath] /var/lib/neo4j/lib/s3-2.22.11.jar
[classpath] /var/lib/neo4j/lib/txw2-2.3.2.jar
[classpath] /var/lib/neo4j/lib/neo4j-metrics-5.17.0.jar
[classpath] /var/lib/neo4j/lib/commons-lang3-3.14.0.jar
[classpath] /var/lib/neo4j/lib/netty-tcnative-boringssl-static-2.0.61.Final-osx-aarch_64.jar
[classpath] /var/lib/neo4j/lib/log4j-api-2.20.0.jar
[classpath] /var/lib/neo4j/lib/lucene-core-9.8.0.jar
[classpath] /var/lib/neo4j/lib/sdk-core-2.22.11.jar
[classpath] /var/lib/neo4j/lib/neo4j-enterprise-5.17.0.jar
[classpath] /var/lib/neo4j/lib/argparse4j-0.9.0.jar
[classpath] /var/lib/neo4j/lib/log4j-core-2.20.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-physical-planning-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jackson-jaxrs-base-2.16.1.jar
[classpath] /var/lib/neo4j/lib/grpc-api-1.60.1.jar
[classpath] /var/lib/neo4j/lib/netty-buffer-4.1.101.Final.jar
[classpath] /var/lib/neo4j/lib/flatbuffers-java-1.12.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-values-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-import-tool-5.17.0.jar
[classpath] /var/lib/neo4j/lib/server-api-5.17.0.jar
[classpath] /var/lib/neo4j/lib/simpleclient_dropwizard-0.16.0.jar
[classpath] /plugins/httpclient-4.5.13.jar
[classpath] /var/lib/neo4j/lib/neo4j-slf4j-provider-5.17.0.jar
[classpath] /var/lib/neo4j/lib/netty-codec-http2-4.1.101.Final.jar
[classpath] /var/lib/neo4j/lib/agrona-1.20.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-protocol-lighthouse-5.17.0.jar
[classpath] /var/lib/neo4j/lib/grpc-protobuf-1.60.1.jar
[classpath] /var/lib/neo4j/lib/neo4j-backup-5.17.0.jar
[classpath] /var/lib/neo4j/lib/annotations-4.1.1.4.jar
[classpath] /var/lib/neo4j/lib/aws-core-2.22.11.jar
[classpath] /var/lib/neo4j/lib/neo4j-discovery-akka-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-enterprise-fabric-5.17.0.jar
[classpath] /var/lib/neo4j/lib/regions-2.22.11.jar
[classpath] /var/lib/neo4j/lib/neo4j-record-storage-engine-5.17.0.jar
[classpath] /var/lib/neo4j/lib/httpclient-4.5.13.jar
[classpath] /var/lib/neo4j/lib/shiro-crypto-core-1.13.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-enterprise-procedure-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-concurrent-5.17.0.jar
[classpath] /var/lib/neo4j/lib/asm-9.6.jar
[classpath] /var/lib/neo4j/lib/neo4j-index-5.17.0.jar
[classpath] /var/lib/neo4j/lib/http-auth-aws-2.22.11.jar
[classpath] /var/lib/neo4j/lib/lucene-backward-codecs-9.8.0.jar
[classpath] /var/lib/neo4j/lib/shiro-crypto-hash-1.13.0.jar
[classpath] /var/lib/neo4j/lib/protocol-core-2.22.11.jar
[classpath] /var/lib/neo4j/lib/scala-collection-contrib_2.13-0.3.0.jar
[classpath] /var/lib/neo4j/lib/animal-sniffer-annotations-1.23.jar
[classpath] /var/lib/neo4j/lib/http2-server-10.0.17.jar
[classpath] /var/lib/neo4j/lib/jetty-alpn-server-10.0.17.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-5.17.0.jar
[classpath] /var/lib/neo4j/lib/eventstream-1.0.1.jar
[classpath] /var/lib/neo4j/lib/gds-write-service-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jersey-client-2.34.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-pipelined-runtime-5.17.0.jar
[classpath] /var/lib/neo4j/lib/netty-transport-classes-kqueue-4.1.101.Final.jar
[classpath] /var/lib/neo4j/lib/common-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jersey-hk2-2.34.jar
[classpath] /plugins/neo4j-flyweight-extension-1.0.58.jar
[classpath] /var/lib/neo4j/lib/neo4j-spatial-index-5.17.0.jar
[classpath] /var/lib/neo4j/lib/gossip-protocol-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-virtual-database-5.17.0.jar
[classpath] /var/lib/neo4j/lib/akka-distributed-data_2.13-2.6.19.jar
[classpath] /var/lib/neo4j/lib/neo4j-store-management-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jPowerShell-3.0.jar
[classpath] /var/lib/neo4j/lib/aws-crt-0.29.2.jar
[classpath] /var/lib/neo4j/lib/reactive-streams-1.0.4.jar
[classpath] /var/lib/neo4j/lib/netty-tcnative-boringssl-static-2.0.61.Final-linux-aarch_64.jar
[classpath] /var/lib/neo4j/lib/javassist-3.25.0-GA.jar
[classpath] /var/lib/neo4j/lib/arrow-format-14.0.2.jar
[classpath] /var/lib/neo4j/lib/jackson-annotations-2.16.1.jar
[classpath] /var/lib/neo4j/lib/commons-configuration2-2.9.0.jar
[classpath] /var/lib/neo4j/lib/jetty-io-10.0.17.jar
[classpath] /var/lib/neo4j/lib/error_prone_annotations-2.24.1.jar
[classpath] /var/lib/neo4j/lib/aws-crt-client-2.22.11.jar
[classpath] /var/lib/neo4j/lib/akka-stream_2.13-2.6.19.jar
[classpath] /var/lib/neo4j/lib/simpleclient_tracer_otel_agent-0.16.0.jar
[classpath] /var/lib/neo4j/lib/jetty-xml-10.0.17.jar
[classpath] /var/lib/neo4j/lib/picocli-4.7.5.jar
[classpath] /var/lib/neo4j/lib/listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar
[classpath] /var/lib/neo4j/lib/jetty-servlet-10.0.17.jar
[classpath] /var/lib/neo4j/lib/failureaccess-1.0.2.jar
[classpath] /var/lib/neo4j/lib/annotations-24.1.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-front-end-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-resource-5.17.0.jar
[classpath] /var/lib/neo4j/lib/netty-common-4.1.101.Final.jar
[classpath] /var/lib/neo4j/lib/neo4j-dbms-enterprise-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-import-util-5.17.0.jar
[classpath] /var/lib/neo4j/lib/akka-coordination_2.13-2.6.19.jar
[classpath] /var/lib/neo4j/lib/neo4j-discovery-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-protocol-raft-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jersey-common-2.34.jar
[classpath] /var/lib/neo4j/lib/netty-codec-http-4.1.101.Final.jar
[classpath] /var/lib/neo4j/lib/neo4j-procedure-api-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-procedure-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-ssl-5.17.0.jar
[classpath] /var/lib/neo4j/lib/netty-tcnative-boringssl-static-2.0.61.Final-linux-x86_64.jar
[classpath] /var/lib/neo4j/lib/neo4j-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jackson-module-jaxb-annotations-2.16.1.jar
[classpath] /var/lib/neo4j/lib/grpc-stub-1.60.1.jar
[classpath] /var/lib/neo4j/lib/akka-protobuf-v3_2.13-2.6.19.jar
[classpath] /var/lib/neo4j/lib/neo4j-diagnostics-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jetty-webapp-10.0.17.jar
[classpath] /var/lib/neo4j/lib/neo4j-exceptions-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-cluster-common-5.17.0.jar
[classpath] /var/lib/neo4j/lib/netty-transport-native-epoll-4.1.101.Final-linux-x86_64.jar
[classpath] /var/lib/neo4j/lib/lighthouse-assembler-5.17.0.jar
[classpath] /var/lib/neo4j/lib/protobuf-java-3.23.1.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-planner-5.17.0.jar
[classpath] /var/lib/neo4j/lib/metrics-core-4.2.23.jar
[classpath] /var/lib/neo4j/lib/jna-5.14.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-security-enterprise-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-enterprise-record-storage-engine-5.17.0.jar
[classpath] /var/lib/neo4j/lib/lucene-analysis-common-9.8.0.jar
[classpath] /var/lib/neo4j/lib/s3-transfer-manager-2.22.11.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-interpreted-runtime-5.17.0.jar
[classpath] /var/lib/neo4j/lib/commons-text-1.11.0.jar
[classpath] /var/lib/neo4j/lib/jackson-datatype-jsr310-2.15.1.jar
[classpath] /var/lib/neo4j/lib/cypher-literal-interpreter-5.17.0.jar
[classpath] /var/lib/neo4j/lib/cdc-5.17.0.jar
[classpath] /var/lib/neo4j/lib/http-auth-2.22.11.jar
[classpath] /var/lib/neo4j/lib/jctools-core-4.0.2.jar
[classpath] /var/lib/neo4j/lib/jline-terminal-jansi-3.21.0.jar
[classpath] /var/lib/neo4j/lib/hk2-api-2.6.1.jar
[classpath] /var/lib/neo4j/lib/neo4j-ast-5.17.0.jar
[classpath] /var/lib/neo4j/lib/metrics-spi-2.22.11.jar
[classpath] /var/lib/neo4j/lib/shiro-lang-1.13.0.jar
[classpath] /var/lib/neo4j/lib/cypher-shell-5.17.0.jar
[classpath] /var/lib/neo4j/lib/netty-transport-4.1.101.Final.jar
[classpath] /var/lib/neo4j/lib/utils-2.22.11.jar
[classpath] /var/lib/neo4j/lib/grpc-core-1.60.1.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-ir-5.17.0.jar
[classpath] /var/lib/neo4j/lib/shiro-crypto-cipher-1.13.0.jar
[classpath] /var/lib/neo4j/lib/simpleclient_httpserver-0.16.0.jar
[classpath] /var/lib/neo4j/lib/aws-xml-protocol-2.22.11.jar
[classpath] /var/lib/neo4j/lib/neo4j-expressions-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jetty-util-10.0.17.jar
[classpath] /var/lib/neo4j/lib/jsr305-3.0.2.jar
[classpath] /var/lib/neo4j/lib/cypher-ast-factory-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-lucene-index-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jetty-server-10.0.17.jar
[classpath] /var/lib/neo4j/lib/neo4j-server-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-common-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jakarta.activation-api-1.2.2.jar
[classpath] /var/lib/neo4j/lib/neo4j-data-collector-5.17.0.jar
[classpath] /var/lib/neo4j/lib/akka-remote_2.13-2.6.19.jar
[classpath] /var/lib/neo4j/lib/arrow-memory-netty-14.0.2.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-slotted-runtime-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jettison-1.5.4.jar
[classpath] /var/lib/neo4j/lib/netty-resolver-4.1.101.Final.jar
[classpath] /var/lib/neo4j/lib/neo4j-protocol-dbms-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jakarta.inject-2.6.1.jar
[classpath] /var/lib/neo4j/lib/netty-nio-client-2.22.11.jar
[classpath] /var/lib/neo4j/lib/jackson-databind-2.16.1.jar
[classpath] /var/lib/neo4j/lib/slf4j-api-2.0.9.jar
[classpath] /var/lib/neo4j/lib/netty-transport-native-epoll-4.1.101.Final-linux-aarch_64.jar
[classpath] /var/lib/neo4j/lib/neo4j-seed-providers-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-codegen-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-kernel-5.17.0.jar
[classpath] /var/lib/neo4j/lib/perfmark-api-0.26.0.jar
[classpath] /var/lib/neo4j/lib/identity-spi-2.22.11.jar
[classpath] /var/lib/neo4j/lib/checksums-2.22.11.jar
[classpath] /var/lib/neo4j/lib/neo4j-query-logging-5.17.0.jar
[classpath] /var/lib/neo4j/lib/lucene-queryparser-9.8.0.jar
[classpath] /var/lib/neo4j/lib/jline-reader-3.21.0.jar
[classpath] /var/lib/neo4j/lib/akka-cluster_2.13-2.6.19.jar
[classpath] /var/lib/neo4j/lib/scala-java8-compat_2.13-1.0.0.jar
[classpath] /var/lib/neo4j/lib/FastInfoset-1.2.16.jar
[classpath] /var/lib/neo4j/lib/neo4j-unsafe-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-cloud-storage-s3-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-expression-evaluator-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-layout-5.17.0.jar
[classpath] /plugins/aws-java-sdk-core-1.12.136.jar
[classpath] /var/lib/neo4j/lib/shiro-event-1.13.0.jar
[classpath] /var/lib/neo4j/lib/arrow-vector-14.0.2.jar
[classpath] /var/lib/neo4j/lib/neo4j-rewriting-5.17.0.jar
[classpath] /var/lib/neo4j/lib/grpc-util-1.60.1.jar
[classpath] /var/lib/neo4j/lib/jetty-alpn-java-server-10.0.17.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-macros-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-graph-algo-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-fabric-5.17.0.jar
[classpath] /var/lib/neo4j/lib/simpleclient_tracer_otel-0.16.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-security-5.17.0.jar
[classpath] /var/lib/neo4j/lib/scala-reflect-2.13.11.jar
[classpath] /var/lib/neo4j/lib/neo4j-arrow-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-planner-spi-5.17.0.jar
[classpath] /var/lib/neo4j/lib/http2-hpack-10.0.17.jar
[classpath] /var/lib/neo4j/lib/jakarta.annotation-api-1.3.5.jar
[classpath] /var/lib/neo4j/lib/neo4j-fulltext-index-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-config-5.17.0.jar
[classpath] /var/lib/neo4j/lib/simpleclient_common-0.16.0.jar
[classpath] /var/lib/neo4j/lib/jetty-client-10.0.17.jar
[classpath] /var/lib/neo4j/lib/auth-2.22.11.jar
[classpath] /var/lib/neo4j/lib/metrics-graphite-4.2.23.jar
[classpath] /var/lib/neo4j/lib/neo4j-cypher-compiled-expressions-5.17.0.jar
[classpath] /var/lib/neo4j/lib/netty-codec-socks-4.1.101.Final.jar
[classpath] /var/lib/neo4j/lib/neo4j-consistency-check-5.17.0.jar
[classpath] /var/lib/neo4j/lib/jackson-jaxrs-json-provider-2.16.1.jar
[classpath] /var/lib/neo4j/lib/neo4j-io-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-auth-plugin-api-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-dbms-5.17.0.jar
[classpath] /var/lib/neo4j/lib/annotations-2.22.11.jar
[classpath] /var/lib/neo4j/lib/arns-2.22.11.jar
[classpath] /var/lib/neo4j/lib/ssl-config-core_2.13-0.4.3.jar
[classpath] /var/lib/neo4j/lib/neo4j-enterprise-cypher-5.17.0.jar
[classpath] /var/lib/neo4j/lib/commons-io-2.15.1.jar
[classpath] /var/lib/neo4j/lib/apache-client-2.22.11.jar
[classpath] /var/lib/neo4j/lib/neo4j-capabilities-5.17.0.jar
[classpath] /var/lib/neo4j/lib/neo4j-protocol-catchup-5.17.0.jar
[classpath] /var/lib/neo4j/lib/log4j-layout-template-json-2.20.0.jar
[classpath] /var/lib/neo4j/lib/annotations-5.17.0.jar
[classpath] /var/lib/neo4j/lib/http-client-spi-2.22.11.jar
[classpath] /var/lib/neo4j/lib/jersey-server-2.34.jar
[classpath] /var/lib/neo4j/lib/jose4j-0.9.4.jar
[classpath] /var/lib/neo4j/lib/neo4j-raft-5.17.0.jar
[classpath] /var/lib/neo4j/lib/akka-actor_2.13-2.6.19.jar
[classpath] /var/lib/neo4j/lib/neo4j-push-to-cloud-5.17.0.jar
[classpath] /plugins/apoc.jar
--------------------------------------------------------------------------------
[ Library path ]
--------------------------------------------------------------------------------
/usr/java/packages/lib
/usr/lib64
/lib64
/lib
/usr/lib
--------------------------------------------------------------------------------
[ System properties ]
--------------------------------------------------------------------------------
NEO4J_CONF = /var/lib/neo4j/conf
sun.jnu.encoding = ANSI_X3.4-1968
sun.arch.data.model = 64
user.timezone = Etc/UTC
sun.java.launcher = SUN_STANDARD
user.country = US
sun.boot.library.path = /opt/java/openjdk/lib
sun.java.command = com.neo4j.server.enterprise.EnterpriseEntryPoint --home-dir=/var/lib/neo4j --config-dir=/var/lib/neo4j/conf --console-mode
jdk.debug = release
sun.cpu.endian = little
user.home = /var/lib/neo4j
user.language = en
file.separator = /
sun.management.compiler = HotSpot 64-Bit Tiered Compilers
user.name = neo4j
path.separator = :
file.encoding = UTF-8
jnidispatch.path = /var/lib/neo4j/.cache/JNA/temp/jna17727024796111676824.tmp
org.neo4j.io.pagecache.impl.SingleFilePageSwapper.printReflectionExceptions = true
jna.platform.library.path = /usr/lib/x86_64-linux-gnu:/lib/x86_64-linux-gnu:/lib64:/usr/lib:/lib
jna.loaded = true
jetty.git.hash = af15f12297adf5c5083e1f2f8f4c5974438bca25
user.dir = /var/lib/neo4j
native.encoding = ANSI_X3.4-1968
sun.io.unicode.encoding = UnicodeLittle
--------------------------------------------------------------------------------
[ (IANA) TimeZone database version ]
--------------------------------------------------------------------------------
TimeZone version: 2023c (available for 603 zone identifiers)
--------------------------------------------------------------------------------
[ Network information ]
--------------------------------------------------------------------------------
Interface eth0:
    address: 172.17.0.2
Interface lo:
    address: 0:0:0:0:0:0:0:1%lo
    address: 127.0.0.1
--------------------------------------------------------------------------------
[ Native access information ]
--------------------------------------------------------------------------------
Native access details: Linux native access is available.
--------------------------------------------------------------------------------
[ DBMS config ]
--------------------------------------------------------------------------------
DBMS provided settings:
db.tx_log.rotation.retention_policy=3 days 5G
dbms.max_databases=1000
dbms.security.procedures.unrestricted=apoc.*
internal.dbms.block_size.array_properties=120
internal.dbms.block_size.labels=56
internal.dbms.block_size.strings=120
server.cluster.advertised_address=bbb771329c30:6000
server.cluster.raft.advertised_address=bbb771329c30:7000
server.default_listen_address=0.0.0.0
server.directories.logs=/logs
server.directories.neo4j_home=/var/lib/neo4j
server.directories.plugins=/plugins
server.discovery.advertised_address=bbb771329c30:5000
server.jvm.additional=-XX:+ExitOnOutOfMemoryError --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED -Dorg.neo4j.io.pagecache.impl.SingleFilePageSwapper.printReflectionExceptions=true
server.memory.heap.initial_size=3.42GiB
server.memory.heap.max_size=3.42GiB
server.memory.pagecache.size=1.76GiB
Directories in use:
dbms.kubernetes.namespace=/var/run/secrets/kubernetes.io/serviceaccount/namespace
dbms.kubernetes.token=/var/run/secrets/kubernetes.io/serviceaccount/token
server.directories.cluster_state=/var/lib/neo4j/data/cluster-state
server.directories.data=/var/lib/neo4j/data
server.directories.dumps.root=/var/lib/neo4j/data/dumps
server.directories.import=/var/lib/neo4j
server.directories.lib=/var/lib/neo4j/lib
server.directories.licenses=/var/lib/neo4j/licenses
server.directories.logs=/logs
server.directories.metrics=/var/lib/neo4j/metrics
server.directories.neo4j_home=/var/lib/neo4j
server.directories.plugins=/plugins
server.directories.run=/var/lib/neo4j/run
server.directories.script.root=/var/lib/neo4j/data/scripts
server.directories.transaction.logs.root=/var/lib/neo4j/data/transactions
```

The VM arguments look okay from what I can tell:

VM Arguments: [-XX:+ExitOnOutOfMemoryError, --add-opens=java.base/java.nio=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, -Dorg.neo4j.io.pagecache.impl.SingleFilePageSwapper.printReflectionExceptions=true, -Dfile.encoding=UTF-8, -Xms3584000k, -Xmx3584000k]
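To check this without eyeballing the line, a small script along these lines could help. It is only a sketch and not part of Neo4j: the function name `missing_add_opens` and the `REQUIRED` set are hypothetical, and it assumes the `VM Arguments: [...]` format shown in the line above.

```python
import re

# The three flags discussed in this thread, copied from the VM Arguments line above.
REQUIRED = {
    "--add-opens=java.base/java.nio=ALL-UNNAMED",
    "--add-opens=java.base/java.io=ALL-UNNAMED",
    "--add-opens=java.base/sun.nio.ch=ALL-UNNAMED",
}


def missing_add_opens(line: str) -> set[str]:
    """Return the required flags that are absent from a 'VM Arguments: [...]' line."""
    match = re.search(r"VM Arguments:\s*\[(.*)\]", line)
    if match is None:
        raise ValueError("not a 'VM Arguments' line")
    # Arguments are comma-separated inside the brackets; strip surrounding spaces.
    present = {arg.strip() for arg in match.group(1).split(",")}
    return REQUIRED - present
```

An empty result means all three flags were picked up by the JVM; anything else lists the flags that are missing.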

Additionally, do you think it would be possible to enable some additional debug information? We have a flag that enables extra prints for reflection issues. All you need to do is add the following to the config, restart, and then check whether exceptions are being printed in the log: server.jvm.additional=-Dorg.neo4j.io.pagecache.impl.SingleFilePageSwapper.printReflectionExceptions=true

The debug.log doesn't contain any new errors or exceptions yet, but they usually appear at night, when we load data from third-party systems into Neo4j. I will follow up with new data tomorrow.

I am also a bit unsure what is happening with the deletion. Indeed, the policy should have deleted the older file. Can you confirm that a checkpoint was triggered after the second file was created? (Log files are pruned on checkpointing.)

I checked the transaction log when the database fashionbeauty19qnngn was restarted:

[fashionbeauty19qnngn/0a214b04] --------------------------------------------------------------------------------
[fashionbeauty19qnngn/0a214b04]                               [ Transaction log ]
[fashionbeauty19qnngn/0a214b04] --------------------------------------------------------------------------------
[fashionbeauty19qnngn/0a214b04] Transaction log files stored on file store:  device - `Volume          `, type - `Unknown', max io - '1.250MiB'.
[fashionbeauty19qnngn/0a214b04] Transaction log metadata:
[fashionbeauty19qnngn/0a214b04]  - current kernel version used in transactions: V5_15
[fashionbeauty19qnngn/0a214b04]  - last committed transaction id: 37220
[fashionbeauty19qnngn/0a214b04] Transaction log files:
[fashionbeauty19qnngn/0a214b04]  - existing transaction log versions: 13-14
[fashionbeauty19qnngn/0a214b04]  - oldest transaction 32992 found in log with version 13
[fashionbeauty19qnngn/0a214b04]  - files: (filename : creation date - size)
[fashionbeauty19qnngn/0a214b04]      neostore.transaction.db.13: 2024-07-23 16:23:59.289+0000 - 256.2MiB
[fashionbeauty19qnngn/0a214b04]      neostore.transaction.db.14: 2024-07-28 00:16:04.420+0000 - 191.9MiB
[fashionbeauty19qnngn/0a214b04]  - total size of files: 448.1MiB
[fashionbeauty19qnngn/0a214b04] Checkpoint log files:
[fashionbeauty19qnngn/0a214b04]  - existing checkpoint log versions: 0-0
[fashionbeauty19qnngn/0a214b04]  - last checkpoint: CheckpointInfo[transactionLogPosition=LogPosition{logVersion=14, byteOffset=201228573}, storeId=StoreId{creationTime=1721741017484, random=-8796547720763615136, storageEngineName='record', formatName='ali>
[fashionbeauty19qnngn/0a214b04] 

It shows the two transaction log files from the file system. I then created a lot of dummy changes to force Neo4j to create a new log file, neostore.transaction.db.15. Shortly afterwards, a checkpoint was triggered, which should have pruned old log files, but nothing was removed:

2024-08-05 14:04:44.233+0000 INFO  [o.n.k.d.Database] [fashionbeauty19qnngn/0a214b04] Rotated to transaction log [/data/transactions/fashionbeauty19qnngn/neostore.transaction.db.15] version=14, last transaction in previous log=37227, rotation took 48 millis.
2024-08-05 14:06:14.016+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] [fashionbeauty19qnngn/0a214b04] Checkpoint triggered by "Scheduled checkpoint for every 15 minutes threshold" @ txId: 37227 checkpoint started...
2024-08-05 14:06:15.825+0000 INFO  [o.n.k.i.t.l.c.CheckPointerImpl] [fashionbeauty19qnngn/0a214b04] Checkpoint triggered by "Scheduled checkpoint for every 15 minutes threshold" @ txId: 37227 checkpoint completed in 1s 808ms. Checkpoint flushed 2033 pages (0% of total available pages), in 2018 IOs. Checkpoint performed with IO limit: 600, paused in total 14 times( 1238 millis).
2024-08-05 14:06:15.850+0000 INFO  [o.n.k.i.t.l.p.LogPruningImpl] [fashionbeauty19qnngn/0a214b04] No log version pruned. The strategy used was '3 days 5368709120 size'. 

The file system still shows the old file neostore.transaction.db.13:

/neo4j/data/transactions/fashionbeauty19qnngn: ls -al
total 792660
drwxr-xr-x  2 7474 7474      4096 Aug  5 14:04 .
drwxr-xr-x 25 7474 7474      4096 Aug  5 09:51 ..
-rw-r--r--  1 7474 7474      6392 Aug  5 14:06 checkpoint.0
-rw-r--r--  1 7474 7474 268646770 Jul 23 16:23 neostore.transaction.db.13
-rw-r--r--  1 7474 7474 274574039 Aug  5 14:04 neostore.transaction.db.14
-rw-r--r--  1 7474 7474 268435456 Aug  5 14:04 neostore.transaction.db.15

Could it be (this is an assumption) that checkpoint.0 acts as a kind of state for Neo4j (like a pointer on a timeline), which it updates every time it prunes the log files? If so, perhaps Neo4j tried to remove the older transaction log files but failed for some reason, didn't handle that failure properly, and still moved checkpoint.0 forward in time. Then checkpoint.0 would now be ahead of the old neostore.transaction.db.13 file, so that file would no longer be in scope for the transaction log pruning algorithm?

Last thing - I am not sure how you mount volumes in Docker, but maybe there is something around how Docker mounts disks or some permissions that affects this? (a bit of a shot in the dark)

I will look into this!

neo-ionut commented 1 month ago

Regarding the checkpointing and log pruning: in 5.10 we changed what we use as the reference for when a file should be deleted. Since that version, we use the first timestamp from the following log version as an approximation of the timestamp of the last entry in the current log version. The reason is that we have sometimes seen a log being created with no activity for a while afterwards, and this led to some edge cases where incremental backup would fail.

So it's not the creation timestamp of the file that you should check, but the first transaction in the next one :) I hope this makes a bit of sense.
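To make the rule concrete, here is a minimal sketch of the time-based part only. It is not Neo4j's actual code: the function name `prunable_versions`, its parameters, and the timestamps in the usage note are hypothetical, and the size part of the retention policy (the `5G`) is deliberately left out.

```python
from datetime import datetime, timedelta


def prunable_versions(first_tx_time: dict[int, datetime],
                      now: datetime,
                      retention: timedelta) -> list[int]:
    """Return the log versions that fall outside the retention window.

    first_tx_time maps a log version to the timestamp of the first
    transaction in that file. The last entry of version N is approximated
    by the first entry of version N + 1, so the newest file, which has no
    following version, is never returned.
    """
    cutoff = now - retention
    versions = sorted(first_tx_time)
    prunable = []
    for current, following in zip(versions, versions[1:]):
        if first_tx_time[following] < cutoff:
            prunable.append(current)
        else:
            # Files are ordered in time, so every later version is newer too.
            break
    return prunable
```

With made-up first-transaction times of Jan 1, Jan 5 and Jan 9 for versions 1 to 3, a current time of Jan 10 and a 3-day retention, only version 1 is returned: version 2's first entry (Jan 5) is older than the cutoff (Jan 7), while version 3's first entry (Jan 9) is not.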

zirkelc commented 1 month ago

It looks like there weren't any issues for Incorrect file descriptor tonight. Maybe the three JVM args --add-opens=java.base/* did solve it?

However, I noticed two other errors yesterday:

2024-08-05 14:20:03.445+0000 ERROR [o.n.k.i.u.w.DefaultFileDeletionEventListener] '.checkpoint.0.swp' which belongs to the 'fashionbeauty19qnngn' database was deleted while it was running.
2024-08-05 16:58:48.764+0000 ERROR [o.n.k.i.u.w.DefaultFileDeletionEventListener] '.neostore.transaction.db.13.swp' which belongs to the 'fashionbeauty19qnngn' database was deleted while it was running.

Guessing from the timestamp, it looks like I opened these files with the editor and created a swap file which was deleted automatically afterwards.

So it's not the creation timestamp of the file that you should check, but the first transaction in the next one :) I hope this makes a bit of sense.

Does this mean the oldest transactions file neostore.transaction.db.13 from July 23 could contain transactions from the last 3 days (retention policy = 3 days 5G)?

In which encoding or format are these files encoded? I opened it with nano but it mostly shows gibberish.

neo-ionut commented 1 month ago

It looks like there weren't any issues for Incorrect file descriptor tonight. Maybe the three JVM args --add-opens=java.base/* did solve it?

That is great to hear, I do think it did solve it.

However, I noticed two other errors yesterday:

Yes, we watch the folder and issue a warning if we see anything disappear. I do think you created those files yourself, so those warnings can be ignored.

Does this mean the oldest transactions file neostore.transaction.db.13 from July 23 could contain transactions from the last 3 days (retention policy = 3 days 5G)?

Not really. It means we use the FIRST timestamp from 14 as the "prune date" for 13. Why? Because it's faster to do that than to use the last one from 13, as we would need to parse the whole file to get to it.

In which encoding or format are these files encoded?

These are binary files. We don't officially document their format (or support it), but the code is available as part of our community edition, so it should be easy-ish to figure it out from there.

zirkelc commented 1 month ago

The logs have not shown any warnings or errors for the last couple of days, so I guess the VM arguments were the missing piece. Maybe it would make sense to add the three arguments from this article to the official docs? The article says it affects Neo4j versions before v5.14, but I'm already running v5.17.