chris-allan opened this issue 1 year ago
Hi @chris-allan, thanks for opening the story and providing a reproduction. We are investigating.
Thanks, @ihnorton.
Just an FYI for the team: I've been running the test case a few more times, trying to pick apart what might be going wrong, and so far about half the time things block on `Query.submit()` for a very long time. So you may or may not get those native code exceptions and a clean exit, depending on how patient you are.
At the time of the original submission we were at chris-allan/tiledb-java-torture@d2e71cb. I've just pushed chris-allan/tiledb-java-torture@009b39c, which does more logging and explains the intent of each instance-variable configuration option. While making these improvements and doing a few runs I noticed, just once, what appears to be a non-fatal log line printed during execution:
```
...
Read rectangle: [0, 1000]; status: TILEDB_COMPLETED
Read rectangle: [1000, 1000]; status: TILEDB_COMPLETED
[2023-07-28 11:47:11.266] [Process: 370504] [error] [1690541212777298500-Global] [TileDB::Task] Error: Caught std::exception: device or resource busy: device or resource busy
[2023-07-28 11:47:11.266] [Process: 370504] [error] [1690541212777298500-Global] C API: TileDB Internal, std::exception; device or resource busy: device or resource busy
Read rectangle: [2000, 1000]; status: TILEDB_COMPLETED
Read rectangle: [4000, 1000]; status: TILEDB_COMPLETED
...
```
This occurred when processing a single channel.
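For context on where in the loop that error surfaced, the read path looks roughly like the sketch below. This is a simplified reconstruction rather than the actual torture-test code: readBlock() is a hypothetical helper standing in for the layout, range, and buffer setup (which differs between TileDB-Java versions), and the array URI is a placeholder.

```java
import io.tiledb.java.api.Array;
import io.tiledb.java.api.Context;
import io.tiledb.java.api.Query;
import io.tiledb.java.api.QueryStatus;
import io.tiledb.java.api.QueryType;
import io.tiledb.java.api.TileDBError;

public class ReadLoopSketch {

  public static void main(String[] args) throws TileDBError {
    Context ctx = new Context();
    Array array = new Array(ctx, "torture_array", QueryType.TILEDB_READ); // placeholder URI
    int blockSize = 1000;
    for (int x = 0; x < 20000; x += blockSize) {
      QueryStatus status = readBlock(array, x, blockSize);
      // Produces the "Read rectangle" lines shown in the log above.
      System.out.printf("Read rectangle: [%d, %d]; status: %s%n", x, blockSize, status);
    }
    array.close();
    ctx.close();
  }

  // Hypothetical helper: sets the range and buffers for one rectangle,
  // submits the read query, and returns its final status.
  static QueryStatus readBlock(Array array, int x, int blockSize) throws TileDBError {
    Query query = new Query(array, QueryType.TILEDB_READ);
    try {
      // ... set layout, the [x, x + blockSize) range, and data buffers here ...
      query.submit();
      return query.getQueryStatus();
    } finally {
      query.close();
    }
  }
}
```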
Hi @chris-allan, just an update that we are still actively working on this -- @DimitrisStaratzis has reproduced the issue and is working through it with @KiterLuc.
Thanks, @ihnorton. I'll be out of the office for a couple weeks starting Monday but will try to keep an eye on things and answer any questions that come up as quickly as I can.
Hi @chris-allan, we've identified the root cause of your issue, and we've prepared a pull request to address it. We anticipate being able to merge the solution within the next couple of days. We'll keep you updated on our progress.
Hi @chris-allan, this release includes the fix: https://github.com/TileDB-Inc/TileDB/releases/tag/2.16.3. Let us know if it works for you. Thanks, Luc.
TileDB-Java 0.18.0 has been released to support the latest core version.
Thanks to everyone for all the work that went into TileDB-Inc/TileDB#4256! We've successfully upgraded our codebase to 0.18.0, but have noticed a ~10x performance degradation vs. 0.10.1 in our production codebase during the copy phase: a job which takes ~18 minutes with 0.10.1 takes over 3 hours with 0.18.0. The slowdown seems to be localized to invoking `Query.close()`, which in some cases can take longer than the read or write itself; this is particularly noticeable where we have operations that have to complete serially.
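At a high level the write path has the shape sketched below. This is a simplification rather than our actual code: writeBlock() here is illustrative, the buffer and subarray setup is elided because it differs between 0.10.1 and 0.18.0, and the perf4j stopwatch tags simply mirror the ones used for the timings further down.

```java
import io.tiledb.java.api.Array;
import io.tiledb.java.api.Query;
import io.tiledb.java.api.QueryType;
import io.tiledb.java.api.TileDBError;
import org.perf4j.LoggingStopWatch;
import org.perf4j.StopWatch;

public class WriteTimingSketch {

  // Sketch of one block write; the layout/subarray/buffer setup is elided.
  static void writeBlock(Array array) throws TileDBError {
    StopWatch writeWatch = new LoggingStopWatch("TileDB.writeBlock()");
    Query query = new Query(array, QueryType.TILEDB_WRITE);
    // ... set layout, subarray/ranges, and data buffers for the block here ...
    query.submit();
    writeWatch.stop();

    // Timed separately: with 0.18.0 this close() alone can rival the write above.
    StopWatch closeWatch = new LoggingStopWatch("Query.close()");
    query.close();
    closeWatch.stop();
  }
}
```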
To try and illustrate this for everyone, I have updated the torture example with logging and some stopwatches around the places where we are seeing issues, as of chris-allan/tiledb-java-torture@4211931. You will need to switch `debug` to `true` if you want the timings. Here is a summary run on Linux against an NVMe SSD:
`callan@behemoth:~/code/tiledb-java-torture$ java -jar build/install/*/lib/perf4j-0.9.16.jar debug.log`

Performance Statistics 2023-08-23 11:50:00 - 2023-08-23 11:50:30

Tag | Avg(ms) | Min | Max | Std Dev | Count |
---|---|---|---|---|---|
Query.close() | 0.1 | 0 | 1 | 0.2 | 1373 |
Subarray.close() | 0.0 | 0 | 0 | 0.0 | 1373 |
TileDB.writeBlock() | 45.2 | 11 | 185 | 37.1 | 1374 |

Performance Statistics 2023-08-23 11:50:30 - 2023-08-23 11:51:00

Tag | Avg(ms) | Min | Max | Std Dev | Count |
---|---|---|---|---|---|
Query.close() | 1.3 | 0 | 89 | 3.6 | 3905 |
Subarray.close() | 0.0 | 0 | 0 | 0.0 | 3904 |
TileDB.writeBlock() | 24.3 | 11 | 112 | 4.7 | 3904 |

Performance Statistics 2023-08-23 11:51:00 - 2023-08-23 11:51:30

Tag | Avg(ms) | Min | Max | Std Dev | Count |
---|---|---|---|---|---|
DestinationQuery.close() | 2.3 | 0 | 28 | 1.9 | 1540 |
DestinationSubarray.close() | 0.0 | 0 | 1 | 0.0 | 1540 |
Query.close() | 0.5 | 0 | 2 | 0.6 | 442 |
SourceQuery.close() | 5.8 | 0 | 97 | 5.8 | 1540 |
SourceSubarray.close() | 0.0 | 0 | 1 | 0.0 | 1540 |
Subarray.close() | 0.0 | 0 | 0 | 0.0 | 443 |
TileDB.readBlock() | 13.2 | 4 | 25 | 4.4 | 1540 |
TileDB.writeBlock() | 21.6 | 9 | 62 | 5.9 | 1982 |

Comparing like with like is hard due to the subarray API differences between 0.10.1 and 0.18.0, but I don't think I can recall a time with 0.10.1 where native memory cleanup of `Query` was ever measurable in milliseconds.
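For anyone less familiar with the difference being compared, a rough sketch is below. It is written from memory of the two APIs rather than lifted from either codebase, so the class and method names on the 0.18.0 side (SubArray, addRange, setSubarray) should be treated as approximations.

```java
import io.tiledb.java.api.*;

public class SubarrayApiSketch {

  // 0.10.1 style (approximate): ranges were added directly to the Query, so
  // the Query was the only per-operation native handle to release afterwards.
  static void legacyRead(Query query) throws TileDBError {
    query.addRange(0, 0L, 999L);
    query.submit();
    query.close();
  }

  // 0.18.0 style (approximate): ranges live on a separate subarray object,
  // which carries its own native state and is closed in addition to the
  // Query -- which is why Subarray.close() shows up in the timings above.
  static void currentRead(Context ctx, Array array, Query query) throws TileDBError {
    SubArray subarray = new SubArray(ctx, array);
    subarray.addRange(0, 0L, 999L, null);
    query.setSubarray(subarray);
    query.submit();
    subarray.close();
    query.close();
  }
}
```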
I can open another issue if that's preferred. Happy to provide any further information that's helpful.
Thanks for the update @chris-allan. We will start our investigation and see where the time is being spent.
Hi @chris-allan, could you please give us some more details about the system you are using?
(^ especially the number of CPUs)
Ubuntu 22.04. 64GB of DDR4. NVMe.
```
callan@behemoth:~$ lscpu
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Address sizes: 48 bits physical, 48 bits virtual
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Vendor ID: AuthenticAMD
Model name: AMD Ryzen 9 5950X 16-Core Processor
CPU family: 25
Model: 33
Thread(s) per core: 2
Core(s) per socket: 16
Socket(s): 1
Stepping: 2
Frequency boost: enabled
CPU max MHz: 5083.3979
CPU min MHz: 2200.0000
BogoMIPS: 6787.90
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate ssbd mba ibrs ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif v_spec_ctrl umip pku ospke vaes vpclmulqdq rdpid overflow_recov succor smca fsrm
Virtualization features:
Virtualization: AMD-V
Caches (sum of all):
L1d: 512 KiB (16 instances)
L1i: 512 KiB (16 instances)
L2: 8 MiB (16 instances)
L3: 64 MiB (2 instances)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-31
Vulnerabilities:
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Spectre v2: Mitigation; Retpolines, IBPB conditional, IBRS_FW, STIBP always-on, RSB filling
Srbds: Not affected
Tsx async abort: Not affected
```
Hi @chris-allan,
We've attempted to replicate the issue, but so far we haven't been able to reproduce such a slowdown across versions. We have been conducting tests using your code under various scenarios and with datasets of different sizes.
Here is one representative experiment. (These statistics are for the entire runtime, not for time ranges like yours.)

Tag | Avg(ms) | Min | Max | Std Dev | Count |
---|---|---|---|---|---|
SourceQuery.close() | 29.8 | 9 | 83 | 7.5 | 13,200 |
TileDB.readBlock() | 88.4 | 48 | 139 | 9.9 | 13,200 |
DestinationQuery.close() | 17.9 | 2 | 50 | 8.6 | 13,200 |
Query.close() | 5.1 | 0 | 363 | 9.6 | 51,480 |
TileDB.writeBlock() | 23.2 | 13 | 387 | 6.1 | 64,680 |

Tag | Avg(ms) | Min | Max | Std Dev | Count |
---|---|---|---|---|---|
SourceSubarray.close() | 0.0 | 0 | 12 | 0.4 | 13,200 |
SourceQuery.close() | 32.1 | 16 | 217 | 6.9 | 13,200 |
TileDB.readBlock() | 109.8 | 59 | 451 | 12.7 | 13,200 |
DestinationSubarray.close() | 0.0 | 0 | 8 | 0.3 | 13,200 |
DestinationQuery.close() | 18.8 | 4 | 48 | 7.6 | 13,200 |
Query.close() | 3.8 | 0 | 279 | 7.9 | 51,480 |
TileDB.writeBlock() | 21.7 | 11 | 378 | 6.1 | 64,680 |
Subarray.close() | 0.0 | 0 | 0 | 0.0 | 51,480 |

Moreover, we've created a modified version of your code that works with TileDB-Java 0.10.1, and the runtime remains very similar. Our stopwatches also confirm that the `Query.close()` method consistently takes a few milliseconds in both versions.
It's important to note that all of our experiments were conducted on an Ubuntu EC2 instance with the exact specifications you provided.
To further assist us in addressing this matter, could you share some additional information to help us reproduce the issue (e.g., how many workers do you have in production)?
We are happy to get on a call to discuss in more detail.
Best, Dimitris
Thanks to all for the quick turnaround resolving #301.
Unfortunately we've hit a much deeper snag when upgrading, starting with 0.11.0: nearly as soon as we re-open an Array in read mode that we had previously been writing to, we get native code errors or what appears to be a deadlock. This only happens after writing many, many overlapping chunks, and it does not happen with 0.10.1.
With our production code this is the behaviour:
TileDB-Java 0.11.0 (TileDB 2.9.0)
TileDB-Java 0.13.0 (TileDB 2.11.0)
TileDB-Java 0.14.1 (TileDB 2.12)
TileDB-Java 0.15.2 (TileDB 2.13.2)
Hang or deadlock. Worker stack traces (collected via jstack) are:
TileDB-Java 0.16.1 (TileDB 2.14.1)
Works for a while, dies later.
TileDB-Java 0.17.8 (TileDB 2.15.4)
I've put together a limited example which reproduces this:
It fails like this:
The above output snippet is from Windows 10; Linux behaves similarly but not identically.
The code reflects the pattern from our production code that relies on TileDB fairly well. Twenty channels is about right to produce the errors (~11000 fragments); if less data is processed, things proceed as normal. The issue occurs with or without consolidation.
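To make the shape of that pattern concrete, here is a distilled sketch; the actual reproduction lives in the tiledb-java-torture repository. writeOverlappingBlock() and readBlock() are hypothetical helpers standing in for the subarray and buffer setup, the URI is a placeholder, and the array schema is assumed to have been created beforehand.

```java
import io.tiledb.java.api.Array;
import io.tiledb.java.api.Context;
import io.tiledb.java.api.Query;
import io.tiledb.java.api.QueryType;
import io.tiledb.java.api.TileDBError;

public class WriteThenReadSketch {

  public static void main(String[] args) throws TileDBError {
    Context ctx = new Context();
    String uri = "torture_array"; // placeholder; schema assumed to exist already

    // Write phase: 20 channels x ~550 overlapping blocks -> roughly 11000 fragments.
    Array writeArray = new Array(ctx, uri, QueryType.TILEDB_WRITE);
    for (int channel = 0; channel < 20; channel++) {
      for (int block = 0; block < 550; block++) {
        writeOverlappingBlock(writeArray, channel, block);
      }
    }
    writeArray.close();

    // (Consolidation, when used, would happen here; the failure occurs either way.)

    // Read phase: re-opening the array for reading shortly after the writes is
    // where the native errors or the apparent deadlock show up with 0.11.0+.
    Array readArray = new Array(ctx, uri, QueryType.TILEDB_READ);
    readBlock(readArray, 0, 0);
    readArray.close();
    ctx.close();
  }

  // Hypothetical helper: one overlapping write for the given channel/block.
  static void writeOverlappingBlock(Array array, int channel, int block) throws TileDBError {
    Query query = new Query(array, QueryType.TILEDB_WRITE);
    // ... set layout, an overlapping subarray for (channel, block), and data buffers ...
    query.submit();
    query.close();
  }

  // Hypothetical helper: one read against the freshly written data.
  static void readBlock(Array array, int channel, int block) throws TileDBError {
    Query query = new Query(array, QueryType.TILEDB_READ);
    // ... set layout, a subarray for (channel, block), and data buffers ...
    query.submit();
    query.close();
  }
}
```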