Closed Github-Polaris closed 7 months ago
The core of HugeGraph's designation as a graph database lies not in its storage mechanism but in how it models, stores, and queries data. HugeGraph uses a graph data model, representing data as vertices (nodes) and edges (relationships), which fundamentally differs from the tabular model of relational databases. This model facilitates direct and efficient querying of relationships and patterns within the data, leveraging graph-specific operations and algorithms.
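To make the contrast between the two models concrete, here is a minimal sketch in plain Python. It is not HugeGraph code; the sample data and the helper names (`neighbors_by_scan`, `within_hops`) are made up for illustration. The first helper answers a relationship question by scanning flat rows, the second walks an adjacency structure directly.

```python
from collections import defaultdict, deque

# Tabular view: relationships live in a flat list of (source, target) rows.
edge_rows = [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("alice", "erin")]


def neighbors_by_scan(rows, vertex):
    """Tabular style: scan every row to find the edges leaving `vertex`."""
    return [dst for src, dst in rows if src == vertex]


# Graph view: each vertex maps directly to the vertices it points at.
adjacency = defaultdict(list)
for src, dst in edge_rows:
    adjacency[src].append(dst)


def within_hops(adj, start, max_hops):
    """Breadth-first traversal: vertices reachable from `start` in <= max_hops."""
    seen = {start}
    frontier = deque([(start, 0)])
    reached = []
    while frontier:
        vertex, depth = frontier.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in adj.get(vertex, []):
            if nxt not in seen:
                seen.add(nxt)
                reached.append(nxt)
                frontier.append((nxt, depth + 1))
    return reached


print(neighbors_by_scan(edge_rows, "alice"))
print(within_hops(adjacency, "alice", 2))
```

In the tabular version every extra hop means another pass over the rows (or another join), while the traversal only touches the edges of the vertices it actually visits.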
HugeGraph supports various backends like MySQL, RocksDB, Cassandra, and HBase, not as a limitation to relational models but as a flexible foundation for storing graph data. On top of these storage systems, HugeGraph implements graph-specific features such as schema management, indexing strategies for efficient query processing, and graph traversal optimizations. These features enable HugeGraph to provide the advantages of graph databases, such as improved search speeds and efficient data relationship exploration, regardless of the underlying storage system.
The use of traditional databases for storage allows HugeGraph to leverage their robustness, scalability, and persistence capabilities while abstracting their complexities away from the user. This abstraction is achieved through a unified graph processing interface that translates graph operations into database-specific queries and updates, ensuring that the graph model's benefits are fully realized.
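The idea of one graph interface over interchangeable backends can be sketched roughly as follows. This is a simplified illustration, not HugeGraph's actual backend API: `EdgeStore`, `DictStore`, `SqliteStore`, and `Graph` are hypothetical names, and SQLite stands in for a relational backend only because it ships with Python.

```python
import sqlite3
from abc import ABC, abstractmethod


class EdgeStore(ABC):
    """Hypothetical backend interface: the graph layer only sees these two calls."""

    @abstractmethod
    def put_edge(self, src, label, dst): ...

    @abstractmethod
    def out_edges(self, src, label): ...


class DictStore(EdgeStore):
    """KV-style backend: (src, label) is the key, the neighbor list is the value."""

    def __init__(self):
        self._data = {}

    def put_edge(self, src, label, dst):
        self._data.setdefault((src, label), []).append(dst)

    def out_edges(self, src, label):
        return list(self._data.get((src, label), []))


class SqliteStore(EdgeStore):
    """Relational backend: the same two operations expressed as SQL."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute("CREATE TABLE edges (src TEXT, label TEXT, dst TEXT)")
        self._conn.execute("CREATE INDEX idx_out ON edges (src, label)")

    def put_edge(self, src, label, dst):
        self._conn.execute("INSERT INTO edges VALUES (?, ?, ?)", (src, label, dst))

    def out_edges(self, src, label):
        rows = self._conn.execute(
            "SELECT dst FROM edges WHERE src = ? AND label = ? ORDER BY rowid",
            (src, label),
        )
        return [row[0] for row in rows]


class Graph:
    """Graph layer: traversal logic is written once, independent of the backend."""

    def __init__(self, store):
        self._store = store

    def add_edge(self, src, label, dst):
        self._store.put_edge(src, label, dst)

    def two_hop(self, start, label):
        result = []
        for mid in self._store.out_edges(start, label):
            result.extend(self._store.out_edges(mid, label))
        return result


def demo(store):
    g = Graph(store)
    g.add_edge("alice", "knows", "bob")
    g.add_edge("bob", "knows", "carol")
    g.add_edge("bob", "knows", "dave")
    return g.two_hop("alice", "knows")


print(demo(DictStore()))
print(demo(SqliteStore()))
```

Both calls return the same result: the caller works with vertices and edges, and only the store decides whether that becomes a dictionary lookup or a SQL query.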
In essence, HugeGraph's ability to serve as a true graph database is derived from its implementation of graph-specific operations, optimizations, and its support for graph query languages, rather than the nature of its storage backend. This approach allows HugeGraph to offer the unique advantages of graph databases, including efficient handling of complex queries and relationships, which are central to the graph data model.
Simply put, in addition to the points made by @dosu-bot, HugeGraph's main storage design relies on a binary KV structure on disk and a graph structure in memory. It is also compatible with relational databases such as MySQL, which makes it easier for ordinary users to understand and migrate to a graph. (Note that HG does not depend on relational databases, and a relational database is not equivalent to KV storage.)
From an abstract perspective, the essence of a storage system is to provide KV/Map semantics, just as the data structure of a graph is ultimately built from arrays or linked lists. Building a graph system on stable, reliable KV storage is therefore a mainstream approach, and it also makes it easier to extend to distributed environments.
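As a rough illustration of how a graph can sit on top of KV/Map semantics, the sketch below stores each edge as one key in an ordered KV map and finds a vertex's neighbors with a prefix scan. The key layout (`E`, source, label, target joined by a zero byte) and the `OrderedKV` class are assumptions made for this example. They are not HugeGraph's real binary format, and the layout assumes IDs and labels never contain a zero byte.

```python
import bisect


class OrderedKV:
    """Tiny stand-in for an ordered KV store: sorted byte keys plus prefix scans."""

    def __init__(self):
        self._keys = []    # keys kept in sorted order
        self._values = {}  # key -> value

    def put(self, key: bytes, value: bytes):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan_prefix(self, prefix: bytes):
        # Keys sharing a prefix are contiguous in sorted order, so jump to the
        # first candidate and stop at the first key that no longer matches.
        i = bisect.bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            yield self._keys[i], self._values[self._keys[i]]
            i += 1


SEP = b"\x00"  # assumed separator; IDs and labels must not contain it


def edge_key(src: str, label: str, dst: str) -> bytes:
    return SEP.join([b"E", src.encode(), label.encode(), dst.encode()])


def out_neighbors(kv: OrderedKV, src: str, label: str):
    prefix = SEP.join([b"E", src.encode(), label.encode()]) + SEP
    return [key.split(SEP)[3].decode() for key, _ in kv.scan_prefix(prefix)]


kv = OrderedKV()
kv.put(edge_key("alice", "knows", "erin"), b"")
kv.put(edge_key("alice", "knows", "bob"), b"")
kv.put(edge_key("alice", "likes", "tea"), b"")
kv.put(edge_key("al", "knows", "zed"), b"")

print(out_neighbors(kv, "alice", "knows"))
```

Because all outgoing edges of a vertex share a key prefix, a neighbor lookup reads one contiguous key range instead of scanning the whole store, which is the kind of access pattern an ordered KV engine handles well.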
The discussion around HugeGraph's use of backend storage systems like MySQL, RocksDB, Cassandra, and HBase for graph data storage raises important points about the nature and capabilities of graph databases. Of these, only MySQL is a relational/SQL database; RocksDB is an embedded key-value store, and Cassandra and HBase are wide-column stores. The essence of HugeGraph being classified as a graph database is not strictly about the storage mechanism it employs but rather about how it models, manages, and queries data. By representing data as vertices and edges, HugeGraph adopts a graph data model that is fundamentally different from the tabular model used in relational databases. This model is optimized for efficiently querying relationships and patterns within the data, which is a core advantage of graph databases.
HugeGraph's support for various backend systems is a strategic choice that allows it to leverage the robustness, scalability, and persistence features of these established databases. On top of these systems, HugeGraph implements graph-specific functionalities such as schema management, indexing for efficient query processing, and optimizations for graph traversal. These capabilities enable HugeGraph to offer the key benefits associated with graph databases, such as enhanced search speeds and more effective data relationship exploration, irrespective of the underlying storage system.
The clarification provided by the maintainer, @imbajin, further enriches our understanding by highlighting that HugeGraph's design is centered around a binary key-value (KV) structure on disk and a graph structure in memory. This design does not depend on relational databases but is compatible with them, which makes the graph easier to understand and migrate to for users familiar with relational models. It's important to note that a relational database is not equivalent to KV storage; it is simply one of the compatible backend options.
From a broader perspective, the choice to build a graph system on top of stable and reliable KV storage is a common and mainstream approach in the industry. This strategy not only ensures the reliability and efficiency of the storage system but also simplifies expansion into distributed environments. The underlying KV/Map semantics provided by the storage system serve as a foundational layer upon which the graph data model is implemented, allowing HugeGraph to fully leverage the unique advantages of graph databases for handling complex queries and relationships.
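One reason KV-backed graphs extend naturally to distributed setups is that keys can be routed to partitions by a hash of the vertex ID. The sketch below shows that general idea only; it is an assumption for illustration, not a description of how HugeGraph partitions data, and `partition_for` and `PartitionedEdges` are hypothetical names.

```python
import zlib


def partition_for(vertex_id: str, num_partitions: int) -> int:
    """Route a vertex to a partition using a stable hash of its ID."""
    return zlib.crc32(vertex_id.encode()) % num_partitions


class PartitionedEdges:
    """Each partition is an independent KV map; edges live with their source vertex."""

    def __init__(self, num_partitions: int):
        self._parts = [dict() for _ in range(num_partitions)]

    def put_edge(self, src: str, dst: str):
        part = self._parts[partition_for(src, len(self._parts))]
        part.setdefault(src, []).append(dst)

    def out_neighbors(self, src: str):
        # Only the partition that owns `src` has to be consulted.
        part = self._parts[partition_for(src, len(self._parts))]
        return list(part.get(src, []))

    def partitions_holding(self, src: str):
        return [i for i, part in enumerate(self._parts) if src in part]


store = PartitionedEdges(num_partitions=4)
store.put_edge("alice", "bob")
store.put_edge("alice", "erin")
store.put_edge("bob", "carol")

print(store.out_neighbors("alice"))
print(store.partitions_holding("alice"))
```

Since every outgoing edge of a vertex lands in the same partition, a one-hop lookup touches a single partition; multi-hop traversals may cross partitions, which is where real systems need additional machinery.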
Thank you! The part of your sentence that says "making it easier for ordinary users to understand/migrate to the graph. (It should be noted that HG does not rely on relational databases" answered my question.
I'm glad the explanation provided clarity on HugeGraph's approach to storage and its classification as a graph database. Understanding that HugeGraph's design is not strictly tied to relational databases but rather utilizes a binary KV structure for disk storage and a graph structure in memory helps highlight its flexibility and efficiency in handling graph data. This design choice facilitates easier migration for users familiar with relational databases while fully leveraging the advantages of graph databases for complex queries and relationship exploration. If you have any more questions or need further clarification on HugeGraph or graph databases in general, feel free to ask!
Problem Type
None
Before submit
Environment
Your Question
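HugeGraph's backends are databases such as MySQL, RocksDB, Cassandra, and HBase, so even the lowest-level data storage has not moved away from relational databases. Why can it be called a Graph DB? As I understand it, the so-called Graph DB amounts to a graphical interface layered on top of a relational database that displays the relational data as a graph, while the structure and format of the underlying storage remain those of a relational database. How are the claimed advantages, such as retrieval speed, actually realized?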
Vertex/Edge example
No response
Schema [VertexLabel, EdgeLabel, IndexLabel]
No response