[Question] When executing OLTP with a large amount of data, if the vertex ID type is String, an OOM exception will occur

LiJie20190102 commented 4 months ago

Bug Type

other exception / error

Before submit

[X] I have searched the existing issues and FAQ and confirmed that there is no identical or duplicate problem

Environment

Server Version: 1.0.0 (Apache Release Version)
Backend: RocksDB, 1 node, HDD
OS: 32 CPUs, xx G RAM, CentOS 7.6
Data Size: 60000000 vertices, 1800000000 edges

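Expected & Actual behavior

My vertex ID type is String, but the vertex IDs are all numeric values that were stored as strings. When I run OLTP operations, an OOM occurs; if I change the vertex ID type to Number, no OOM occurs. Looking at the earlier code change history, was there a similar problem before, which is why numeric types are handled separately? See: https://github.com/apache/incubator-hugegraph/pull/1623#issuecomment-1971180435

Vertex/Edge example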

No response

Schema [VertexLabel, EdgeLabel, IndexLabel]

No response

dosubot[bot] commented 4 months ago

🤖

The Out Of Memory (OOM) exception you're experiencing when executing OLTP with a large amount of data using a String vertex ID type, but not with a Number vertex ID type, is likely due to the way String IDs are handled compared to Number IDs in Apache HugeGraph.

The SplicingIdGenerator class in the HugeGraph codebase generates String IDs by concatenating various parts with specific separators and possibly escaping certain characters. This process can significantly increase the memory footprint when dealing with a large number of vertices, as each String ID consumes more memory than its numeric counterpart.

String IDs, especially when concatenated and escaped, are inherently larger and more complex than numeric IDs. They require more memory for storage and processing, which can lead to OOM exceptions under heavy load or with large datasets. The memory overhead comes from:

The Java object overhead for each String object.
The internal character array used by String (char[] on Java 8, byte[] on Java 9 and later), which consumes more space than a simple numeric value.
Additional processing required for escaping and concatenating parts of the ID, which can increase CPU and memory usage.
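
The overhead listed above can be put into rough numbers. The following is a minimal sketch, not a measurement of HugeGraph's own ID classes: the object-layout figures are assumptions for a 64-bit HotSpot JVM with compressed oops and compact strings (JDK 9+), the class name `IdFootprintEstimate` is hypothetical, and the 8-digit ID length is made up for illustration. It also assumes every ID is held on the heap at once, which a traversal does not necessarily do.

```java
// Rough per-ID heap estimate. All layout numbers are ASSUMPTIONS for a 64-bit
// HotSpot JVM with compressed oops and compact strings (JDK 9+); they are not
// measurements of HugeGraph's own ID classes.
final class IdFootprintEstimate {

    // HotSpot rounds object sizes up to a multiple of 8 bytes by default.
    static long alignTo8(long bytes) {
        return (bytes + 7) / 8 * 8;
    }

    // java.lang.String holding `length` Latin-1 characters:
    // the String object (12-byte header + 4-byte array reference + 4-byte hash
    // + up to 2 bytes of flags) plus its backing byte[] (16-byte header
    // + 1 byte per character).
    static long stringBytes(int length) {
        long stringObject = alignTo8(12 + 4 + 4 + 2);
        long backingArray = alignTo8(16 + length);
        return stringObject + backingArray;
    }

    // A primitive long always occupies 8 bytes.
    static long primitiveLongBytes() {
        return Long.BYTES;
    }

    public static void main(String[] args) {
        long vertices = 60_000_000L; // vertex count reported in this issue
        int digits = 8;              // hypothetical ID length, e.g. "12345678"
        System.out.printf("String IDs: ~%d MB%n",
                vertices * stringBytes(digits) / 1_000_000);
        System.out.printf("long IDs:   ~%d MB%n",
                vertices * primitiveLongBytes() / 1_000_000);
    }
}
```

Under these assumptions the sketch prints about 2880 MB for String IDs and 480 MB for primitive long IDs. Real numbers depend on the JVM, its flags, and the collections that hold the IDs, so a heap profiler remains the authoritative source.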

To resolve this issue, consider the following approaches:

Optimize String ID Usage: If String IDs are essential, review the ID generation scheme to minimize the length and complexity of the generated IDs. Avoid unnecessary concatenations or use shorter separators and identifiers.
Switch to Numeric IDs: If possible, switch to using numeric IDs for vertices. Numeric IDs are more memory-efficient and can help avoid OOM exceptions.
Increase Memory Allocation: Increase the Java Virtual Machine (JVM) memory allocation for the HugeGraph server. This can be done by adjusting the -Xmx and -Xms parameters in the JVM startup arguments to allocate more heap space.
Optimize HugeGraph Configuration: Review and optimize HugeGraph and RocksDB configurations for better memory management. This includes tuning cache sizes, batch sizes, and other parameters related to memory usage.
Upgrade Hardware: If the dataset is expected to grow, consider upgrading the server hardware, specifically by adding more RAM or moving to a solid-state drive (SSD), which can alleviate some of the I/O bottlenecks associated with HDDs under heavy load.
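
When adjusting -Xmx and -Xms, it can help to confirm which heap limits the JVM actually received. The following is a minimal sketch that uses only the standard `java.lang.Runtime` API; `HeapReport` is a hypothetical class name and is not part of HugeGraph, and how the HugeGraph server start script passes JVM flags is not covered here.

```java
// Prints the heap limits of the JVM this class runs in.
// HeapReport is a hypothetical helper, not part of HugeGraph.
final class HeapReport {

    // Converts a byte count to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // Upper bound the heap may grow to (governed by -Xmx).
        System.out.println("max heap MiB:       " + toMiB(rt.maxMemory()));
        // Heap currently reserved from the operating system.
        System.out.println("committed heap MiB: " + toMiB(rt.totalMemory()));
        // Portion of the committed heap that is in use right now.
        System.out.println("used heap MiB:      "
                + toMiB(rt.totalMemory() - rt.freeMemory()));
    }
}
```

Running it with the same flags intended for the server, for example `java -Xms512m -Xmx512m HeapReport.java` on Java 11 or later, shows whether the settings took effect.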

It's important to profile and monitor the application to identify the exact cause of the OOM exception and to verify the effectiveness of the applied solutions.

For more details, see the SplicingIdGenerator class in the HugeGraph codebase (path listed under Sources).

Sources

hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/backend/id/SplicingIdGenerator.java

LiJie20190102 commented 4 months ago

After testing, I found that more Map objects are indeed created: with String vertex IDs many more Map objects are created, whereas with Number vertex IDs there are fewer.

I encountered the above issue while executing 'AllShortestPath' (org.apache.hugegraph.api.traverser.AllShortestPathsAPI#get) with these parameters: capacity: 1000000, degree: 10000, skipDegree: 0, direction: BOTH, maxDepth: 3
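
For anyone trying to reproduce this, a request with these parameters can be sketched as follows. This is a sketch under stated assumptions, not a verified reproduction: the query-parameter names (`max_depth`, `max_degree`, `skip_degree`, `capacity`) and the convention that String IDs are written in double quotes while numeric IDs are written bare are taken from the HugeGraph REST documentation as I understand it, and the host, graph name, and vertex IDs are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Builds an all-shortest-paths request URL. Parameter names, host, graph name
// and vertex IDs are ASSUMPTIONS for illustration only.
final class AllShortestPathsRequest {

    // Assumed convention: String IDs are quoted, numeric IDs are written bare.
    static String idLiteral(Object id) {
        return (id instanceof Number) ? id.toString() : "\"" + id + "\"";
    }

    // Percent-encodes a query-string value as UTF-8.
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    static String buildUrl(String base, String graph, Object source, Object target) {
        return base + "/graphs/" + graph + "/traversers/allshortestpaths"
                + "?source=" + encode(idLiteral(source))
                + "&target=" + encode(idLiteral(target))
                + "&direction=BOTH&max_depth=3&max_degree=10000"
                + "&skip_degree=0&capacity=1000000";
    }

    public static void main(String[] args) {
        String base = "http://localhost:8080"; // hypothetical server address
        // Same logical vertices, once as String IDs and once as Number IDs.
        System.out.println(buildUrl(base, "hugegraph", "1001", "2002"));
        System.out.println(buildUrl(base, "hugegraph", 1001L, 2002L));
    }
}
```

Issuing both variants against two otherwise identical graphs, one loaded with String IDs and one with Number IDs, would isolate the ID type as the only difference between the runs.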

LiJie20190102 commented 4 months ago

@imbajin

imbajin commented 4 months ago

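> My vertex ID type is String, but the vertex IDs are all numeric values that were stored as strings. When I run OLTP operations, an OOM occurs; if I change the vertex ID type to Number, no OOM occurs. I looked at the earlier code change history; was there a similar problem before, which is why numeric types are handled separately: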

I'm not quite sure what your specific request is. A few points to consider:

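If you are sure the IDs are numeric (Number), storing them as String will obviously increase their space usage, and it is also worse for compression, computation, sorting, and so on.
Even for String IDs, if you know they are strings such as UUIDs, you should declare them as the dedicated UUID type, which has its own optimizations and fixed-length computation.
Switching to Number probably has little direct relation to the OOM; it is mainly a performance optimization. Prefer numeric types whenever they can be used (graph computing also commonly maps IDs to numbers to save time and space).
When the data volume really is large and a single node's memory is limited, consider using multiple servers plus the native distributed version #2265, which provides computation pushdown and multi-node task scheduling and avoids gathering too much data on a single server (to be released in 1.5; available for early testing).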
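
The space argument in the first two points can be illustrated with plain JDK classes. This is a minimal sketch, not HugeGraph's internal encoding: `CompactIdEncoding` and its methods are hypothetical names, and the sample IDs are made up. It shows a numeric string collapsing into one 8-byte long and a 36-character UUID string packing into a fixed 16 bytes.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Illustrates why typed IDs are more compact than their text form.
// Hypothetical helper; this is NOT how HugeGraph encodes IDs internally.
final class CompactIdEncoding {

    // A numeric ID stored as text can be held in a single 8-byte long.
    static long parseNumericId(String text) {
        return Long.parseLong(text);
    }

    // A UUID always packs into exactly 16 bytes (two longs).
    static byte[] uuidToBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    // Reverses uuidToBytes: the first 8 bytes are the high half.
    static UUID uuidFromBytes(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        long high = buf.getLong();
        long low = buf.getLong();
        return new UUID(high, low);
    }

    public static void main(String[] args) {
        String numericText = "12345678"; // made-up numeric ID stored as text
        System.out.println(numericText.length() + " chars as text, "
                + Long.BYTES + " bytes as long: " + parseNumericId(numericText));

        String uuidText = "123e4567-e89b-12d3-a456-426614174000"; // made-up UUID
        byte[] packed = uuidToBytes(UUID.fromString(uuidText));
        System.out.println(uuidText.length() + " chars as text, "
                + packed.length + " bytes packed");
    }
}
```

Fixed-length encodings also make comparison and sorting cheaper than variable-length string comparison, which matches the remark about sorting and fixed-length computation.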

You can also summon dosu-bot for more information and context. @dosu-bot please update your suggestions based on the new replies.

github-actions[bot] commented 3 months ago

Due to the lack of activity, the current issue is marked as stale and will be closed after 20 days. Any update will remove the stale label.
