bytedance / terarkdb

A RocksDB compatible KV storage engine with better performance
Apache License 2.0

How to build the JNI jar to make TerarkDB usable for Flink? #235

Closed zhougit86 closed 2 years ago

zhougit86 commented 2 years ago

[Enhancement]

Problem

Is there any instruction on how to build the JNI jar so that TerarkDB can be used with Flink?

Solution

yapple commented 2 years ago

https://github.com/yapple/terarkdb/tree/flink-terark-1.0

yapple commented 2 years ago

We have done some work on TerarkDB for Flink and saw an obvious benefit on some large state backends by using KV separation. You can get this work in the flink-terark-1.0 branch; most of the instructions are similar to RocksDB's.
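As a rough, hedged illustration of why KV separation helps with large state values (all sizes and the level count below are illustrative assumptions, not TerarkDB measurements): a plain LSM rewrites key and value at every compaction level, while KV separation writes the large value once to a blob log and only rewrites the small keys.

```shell
# Back-of-envelope only; all numbers are illustrative assumptions.
KEY=16        # bytes per key
VALUE=1024    # bytes per value
LEVELS=6      # roughly how many times compaction rewrites each entry

# Plain LSM: key + value rewritten at every level.
plain=$(( (KEY + VALUE) * LEVELS ))
# KV separation: value written once to a blob log, only keys rewritten.
separated=$(( KEY * LEVELS + VALUE ))

echo "plain LSM bytes written per entry: $plain"
echo "KV-separated bytes written per entry: $separated"
```

The gap widens as values grow, which matches the observation that the benefit shows up on large state backends.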

zhougit86 commented 2 years ago

Thanks for the reply! Is there a tutorial for compiling the JNI package?

yapple commented 2 years ago

```shell
git clone https://github.com/yapple/terarkdb.git -b flink-terark-1.0
cd terarkdb && make fterark -j10
```

Most of the other instructions are similar to RocksDB's.

zhougit86 commented 2 years ago

Thanks. Is it possible to share a WeChat contact? We are from a Shanghai e-commerce company and hope to get more insight into TerarkDB.

yapple commented 2 years ago

You can list some pain points you hit when using Flink, such as average value size, write amplification, or CPU usage problems.

yapple commented 2 years ago

Any problem, we can talk on GitHub, or you can join our Slack.

zhougit86 commented 2 years ago

Yes, actually we found some issues when using RocksDB as the Flink state backend. The read/iterate operations of the JNI occupy about 90% of the flame graph, but CPU utilization is not high and disk throughput is not high either. Reducing the level target size alleviates the problem a little. Any suggestions for debugging this kind of issue?
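A hedged sketch of why shrinking the level target size can matter here (the base size and multiplier below are illustrative, not values from this deployment): each LSM level is typically about 10x the previous one, so a smaller level-1 target keeps data in smaller levels, which changes how much an iterator has to merge across.

```shell
# Illustrative only: per-level capacity given an assumed level-1 target and 10x multiplier,
# mirroring RocksDB's max_bytes_for_level_base / max_bytes_for_level_multiplier idea.
BASE_MB=256   # assumed level-1 target size in MB
MULT=10       # assumed level size multiplier
for L in 1 2 3 4; do
  echo "L$L capacity: $(( BASE_MB * MULT ** (L - 1) )) MB"
done
```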

zhougit86 commented 2 years ago

```
Caused by: java.nio.channels.ClosedByInterruptException
	at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202) ~[?:1.8.0_181]
	at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:216) ~[?:1.8.0_181]
	at java.nio.channels.Channels.writeFullyImpl(Channels.java:78) ~[?:1.8.0_181]
	at java.nio.channels.Channels.writeFully(Channels.java:101) ~[?:1.8.0_181]
	at java.nio.channels.Channels.access$000(Channels.java:61) ~[?:1.8.0_181]
	at java.nio.channels.Channels$1.write(Channels.java:174) ~[?:1.8.0_181]
	at java.nio.file.Files.copy(Files.java:2909) ~[?:1.8.0_181]
	at java.nio.file.Files.copy(Files.java:3027) ~[?:1.8.0_181]
	at org.rocksdb.NativeLibraryLoader.loadLibraryFromJarToTemp(NativeLibraryLoader.java:113) ~[flink-dist_2.11-du-1.13-SNAPSHOT.jar:du-1.13-SNAPSHOT]
	at org.rocksdb.NativeLibraryLoader.loadLibraryFromJar(NativeLibraryLoader.java:78) ~[flink-dist_2.11-du-1.13-SNAPSHOT.jar:du-1.13-SNAPSHOT]
	at org.rocksdb.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:56) ~[flink-dist_2.11-du-1.13-SNAPSHOT.jar:du-1.13-SNAPSHOT]
	at org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend.ensureRocksDBIsLoaded(EmbeddedRocksDBStateBackend.java:860) ~[flink-dist_2.11-du-1.13-SNAPSHOT.jar:du-1.13-SNAPSHOT]
```

Have you met this problem when Flink tries to copy the .so file to /tmp?
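For context, the trace shows Flink's `NativeLibraryLoader` extracting the RocksDB `.so` into the JVM temp dir before loading it. `ClosedByInterruptException` usually means the task thread was interrupted (e.g. cancelled or restarted) while that copy was in flight, but an unwritable or full temp dir can surface related failures. A quick hedged sanity check (paths assumed):

```shell
# Check that the JVM temp dir (default /tmp) is writable and has free space,
# since the native library is extracted there before being loaded.
TMP="${TMPDIR:-/tmp}"
if [ -w "$TMP" ]; then
  echo "temp dir $TMP is writable"
fi
df -P "$TMP" | awk 'NR==2 {print "free KB:", $4}'
```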

yapple commented 2 years ago

> Yes, actually we found some issues when using RocksDB as the Flink state backend. The read/iterate operations of the JNI occupy about 90% of the flame graph, but CPU utilization is not high and disk throughput is not high either. Reducing the level target size alleviates the problem a little. Any suggestions for debugging this kind of issue?

1. Can you attach your flame graph and your RocksDB LOG?
2. I think you can focus on read/write throughput and latency to analyze the real bottleneck of your system.
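The LOG check in point 1 can be sketched as follows. Stall and compaction lines in a RocksDB-format LOG are a quick way to correlate flame-graph reads with engine-side behavior; the sample lines written below are fabricated for illustration, not taken from this issue.

```shell
# Write a tiny fabricated RocksDB-style LOG sample, then count write-stall warnings in it.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
2022/01/01-00:00:01.000 7f0 [WARN] [db/column_family.cc:778] [default] Stalling writes because we have 21 level-0 files
2022/01/01-00:00:02.000 7f0 [db/db_impl_compaction_flush.cc:2188] [default] Compaction start summary: ...
EOF
stalls=$(grep -c "Stalling writes" "$LOG")
echo "write-stall warnings: $stalls"
rm -f "$LOG"
```

Against a real LOG, a nonzero stall count points at compaction pressure rather than the JNI layer itself.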
zhougit86 commented 2 years ago

> 1. Can you attach your flame graph and your RocksDB LOG?
> 2. I think you can focus on read/write throughput and latency to analyze the real bottleneck of your system.

Thanks for the reply. It turned out that Flink had turned off the pinning of high-priority blocks, which caused these issues.
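For reference, a hedged flink-conf.yaml fragment touching the relevant knobs (these option names exist in recent Flink versions, but verify against your version; the ratio value shown is simply Flink's documented default):

```yaml
# Let Flink manage RocksDB memory and reserve a high-priority pool so
# index/filter blocks can stay pinned in the block cache.
state.backend.rocksdb.memory.managed: true
# Fraction of the cache reserved for high-priority (index/filter) blocks.
state.backend.rocksdb.memory.high-prio-pool-ratio: 0.1
```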

zhougit86 commented 2 years ago

One more thing: does ByteDance use TerarkDB as the Flink state backend?

yapple commented 2 years ago

Yes, this month we published a release version for Flink.

rovboyko commented 1 year ago

Hi, @yapple !

Could you please advise where I can get the terarkdbjni.jar, or how to build it?

I tried to build it from source (from https://github.com/bytedance/terarkdb/tree/dev.1.4 and from https://github.com/yapple/terarkdb/tree/flink-terark-1.0), but every time I got a lot of compilation errors.

I just want to benchmark TerarkDB for Flink inside our infrastructure. It would also be great if you already have benchmark results for comparison.

zhougit86 commented 1 year ago

@yapple Could you leave a Feishu (Lark) contact so we can chat? I'd like to ask you about some TerarkDB details.

zhougit86 commented 1 year ago

@yapple I found two problems: TerarkDB fluctuates more than regular RocksDB when running some small jobs, and the LOG file under the TerarkDB directory has no content.

yapple commented 1 year ago

You can open a new issue to discuss what you found; please add more contextual information so we can understand it. We have fixed some minor problems in delete-intensive situations; if you have the same problem, please let me know.

Please make sure to compare with RocksDB in medium and large KV scenarios

yapple commented 1 year ago

@rovboyko

Sorry for the late reply, is the problem solved now?

> And it would be great if you already have any benchmark results for comparison?

Yes, we shared this work and our benchmark results at FFA 2023; on average, we reduced average CPU cost by 30%.