hugegraph / hugegraph

A graph database that supports 100+ billion data records with high performance and scalability (includes the OLTP engine, REST API, and backends)
https://hugegraph.apache.org
Apache License 2.0

[Question] rocksdb read write procedure of module store #6

Open sheli00 opened 5 months ago

sheli00 commented 5 months ago

Problem Type (问题类型)

struct / logic (架构 / 逻辑设计问题)

Before submit

Environment (环境信息)

Your Question (问题描述)

When I send a data storage request to the org.apache.hugegraph.server, can you tell me how the org.apache.hugegraph.store module receives the request and then executes the storage command? It would be better if you could provide a reference process/sequence diagram/call chain.

Vertex/Edge example (问题点 / 边数据举例)

No response

Schema [VertexLabel, EdgeLabel, IndexLabel] (元数据结构)

No response

dosubot[bot] commented 5 months ago

Hey @sheli00! Great to see you again. Hope all is going well!

The process of how the org.apache.hugegraph.store module receives a data storage request from the org.apache.hugegraph.server and executes the storage command involves several steps and components, primarily through gRPC services.

The HgStoreSessionImpl class is annotated with @GRpcService, making it a gRPC service endpoint. This allows it to receive RPC calls from clients, which in this case could be the org.apache.hugegraph.server. For each type of request (e.g., get2, clean, batchGet2, batch, table), there is a corresponding method in HgStoreSessionImpl that handles the request. These methods are overrides of the methods defined in the generated gRPC service base class (HgStoreSessionGrpc.HgStoreSessionImplBase).

Each method processes the incoming request, which includes extracting information from the request object, such as graph name, table name, keys, etc. It then performs operations by calling methods on HgStoreWrapperEx or other components, which interact with the underlying storage engine (e.g., RocksDB) or perform operations like cleaning data, batch reading/writing, etc.

For operations that modify data (clean, batch, table), the code demonstrates the use of a Raft consensus protocol to ensure consistency across distributed nodes. This involves adding tasks to a Raft queue (storeService.addRaftTask(...)) and waiting for the operation to complete.
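The "add a task to a queue and wait for it to complete" pattern described above can be illustrated with a self-contained sketch. Note that the class and method names below are hypothetical stand-ins, not the actual store API; a real Raft implementation also handles log replication and leader election, which this toy omits. The sketch only shows how mutations are serialized through a single ordered applier, with the submitter blocking until its task has been applied:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

// Minimal sketch of serializing writes through a single-consumer task queue,
// the way a Raft log applier does: tasks are applied strictly in submission
// order, and callers block until their own task has been applied.
public class RaftTaskQueueSketch {
    private final BlockingQueue<Runnable> tasks = new LinkedBlockingQueue<>();

    public RaftTaskQueueSketch() {
        Thread applier = new Thread(() -> {
            try {
                while (true) {
                    tasks.take().run();   // apply tasks one at a time, in order
                }
            } catch (InterruptedException ignored) {
                // shut down quietly when interrupted
            }
        });
        applier.setDaemon(true);
        applier.start();
    }

    // Submit a task and block until it has been applied, mirroring the
    // addRaftTask(...) + wait-for-completion flow described above.
    public void submitAndWait(Runnable task) throws InterruptedException {
        CountDownLatch done = new CountDownLatch(1);
        tasks.add(() -> {
            task.run();
            done.countDown();
        });
        done.await();
    }
}
```

Because there is exactly one consumer thread, every node applying the same task sequence reaches the same state, which is the consistency property the Raft queue provides in the real store.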

After processing the request, each method constructs a response object (FeedbackRes) indicating the status of the operation (success, failure, etc.) and any requested data. This response is then sent back to the client through the StreamObserver parameter.

The code uses synchronization blocks and atomic variables (e.g., AtomicInteger) to ensure thread safety, particularly in methods that might be accessed by multiple threads concurrently.
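In isolation, the AtomicInteger pattern mentioned here looks like the following sketch (the class name is invented for illustration). The point is that incrementAndGet() is a lock-free atomic read-modify-write, so concurrent updates are never lost, unlike a plain `int++`:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Counting completed sub-operations across threads without locks:
// AtomicInteger.incrementAndGet() is atomic, so no update is lost
// even when many request-handler threads increment concurrently.
public class AtomicCounterDemo {
    public static int countConcurrently(int threads, int incrementsPerThread)
            throws InterruptedException {
        AtomicInteger counter = new AtomicInteger(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.incrementAndGet();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            w.join();   // wait for all increments to finish
        }
        return counter.get();
    }
}
```

With a plain non-volatile `int` counter, the same experiment would typically lose increments under contention; the atomic variant always returns the exact total.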

The methods include error handling logic, catching exceptions, logging errors, and setting appropriate error messages in the response.

While the provided code snippet does not explicitly show the interaction between org.apache.hugegraph.server and org.apache.hugegraph.store, it demonstrates how the HgStoreSessionImpl class, part of the org.apache.hugegraph.store module, defines endpoints for handling different types of storage requests via gRPC. The actual call to these endpoints from org.apache.hugegraph.server would be done using gRPC client stubs that target these service methods, passing in the required request objects and handling the responses.

You can refer to the source code of these classes in the HugeGraph repository:

The org.apache.hugegraph.server module sends a data storage request to the org.apache.hugegraph.store module through the HugeGraph class's methods for creating and dropping graphs, among other operations. Specifically, the createGraph and dropGraph methods in the GraphManager class are directly involved in handling graph storage requests.

These methods demonstrate how the org.apache.hugegraph.server module communicates with the org.apache.hugegraph.store module to manage graph data storage. The HugeGraph class, which is part of the org.apache.hugegraph package, plays a central role in this process, acting as the bridge between the server module and the storage module.

You can refer to the source code of these classes in the HugeGraph repository:

I hope this helps! If you have any more questions or need further clarification, feel free to ask.

Sources

#### About Dosu

This response is meant to be useful and save you time. It is **not meant to be a precise solution**, but rather a starting point for your own research. Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant. If you want to continue the conversation, start your reply with @dosu-bot.

sheli00 commented 5 months ago

@dosu-bot. Another question about rocksdb write procedure. On branch master, I change org.apache.hugegraph.unit.rocksdb.RocksDBSessionTest like below

    @Test
    public void testInsert() {
        for (int i = 0; i < 5000000; i++) {
            testPutAndGet();
        }
    }
    @Test
    public void testPutAndGet() {
        String value = getString(this.rocks.session().get(TABLE, getBytes("person:1gname")));

        this.rocks.session().put(TABLE, getBytes("person:1gname"), getBytes(UUID.randomUUID().toString()));
        this.rocks.session().put(TABLE, getBytes("person:1gage"),
                                 getBytes(new Random().nextInt(30)));
        this.rocks.session().put(TABLE, getBytes("person:1gcity"), getBytes(UUID.randomUUID().toString()));
        this.commit();
    }

Run testPutAndGet() once, and the RocksDB directory looks like this:

    ├── 000005.log
    ├── CURRENT
    ├── IDENTITY
    ├── LOCK
    ├── LOG
    ├── MANIFEST-000004
    ├── OPTIONS-000009
    └── OPTIONS-000011

Run testInsert, and the RocksDB directory looks like this:

    ├── 000027.log
    ├── 000029.log
    ├── CURRENT
    ├── IDENTITY
    ├── LOCK
    ├── LOG
    ├── LOG.old.1710939849840968
    ├── MANIFEST-000013
    ├── OPTIONS-000018
    └── OPTIONS-000031

Why are no SST files generated in the end?


dosubot[bot] commented 5 months ago

The behavior you're observing is due to how RocksDB handles its write procedure. When you're running the testInsert() method, you're performing a large number of write operations. However, these operations are not immediately flushed to the SST files. Instead, they are first written to a log file (the .log files you're seeing in the catalog).

RocksDB uses a structure called a MemTable to store these write operations in memory. When the MemTable is full, it is flushed to disk as an SST file. The size of the MemTable is controlled by the write_buffer_size configuration option in RocksDB. If you're not seeing SST files, it's likely that the MemTable has not yet been filled and thus not flushed to an SST file.

In your case, the testInsert() method is likely not filling up the MemTable, hence no SST files are being created. The additional log files you're seeing (000027.log, 000029.log) are a result of the write operations being stored in new log files once the previous one reaches a certain size (also a configurable option in RocksDB).
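The relationship between the MemTable, write_buffer_size, and SST files can be made concrete with a toy model. This is not RocksDB code, just a self-contained sketch with invented names that mimics the write path: puts accumulate in a sorted in-memory buffer, and an "SST file" appears only when the buffer crosses its size budget or is flushed explicitly:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of RocksDB's write path: writes land in an in-memory MemTable
// (kept sorted, like a skiplist) and only become an immutable, sorted SST
// "file" once the MemTable exceeds its size budget -- the analogue of the
// write_buffer_size option -- or when a flush is requested explicitly.
public class MemTableSketch {
    private final int writeBufferSize;                         // flush threshold in bytes
    private final TreeMap<String, String> memTable = new TreeMap<>();
    private int memTableBytes = 0;
    private final List<TreeMap<String, String>> sstFiles = new ArrayList<>();

    public MemTableSketch(int writeBufferSize) {
        this.writeBufferSize = writeBufferSize;
    }

    public void put(String key, String value) {
        memTable.put(key, value);
        memTableBytes += key.length() + value.length();
        if (memTableBytes >= writeBufferSize) {
            flush();                                           // automatic flush when full
        }
    }

    // Explicit flush, analogous to calling flush() on the DB by hand.
    public void flush() {
        if (memTable.isEmpty()) {
            return;
        }
        sstFiles.add(new TreeMap<>(memTable));                 // emit one sorted "SST"
        memTable.clear();
        memTableBytes = 0;
    }

    public int sstFileCount() {
        return sstFiles.size();
    }
}
```

In this model, writes that never push memTableBytes past writeBufferSize produce zero SST files no matter how many times they run, which is exactly the behavior observed in the test: the data sits in the MemTable (and, in real RocksDB, in the write-ahead .log files for durability) until a flush happens.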

If you want to force a flush to create an SST file, you can manually call the flush() method on the RocksDB instance. However, please note that frequent manual flushing can degrade the performance of RocksDB, as it is designed to handle flushing automatically in a way that optimizes for both read and write performance.

For more information on how RocksDB handles write operations, you can refer to the RocksDB Wiki.

I hope this helps clarify the behavior you're seeing. If you have any more questions, feel free to ask.

Sources

sheli00 commented 5 months ago

@dosu-bot. Referring to the file org.apache.hugegraph.unit.rocksdb.RocksDBSessionTest, can you help me write an example that generates an SST file?

dosubot[bot] commented 5 months ago

To generate an SST file by filling up the MemTable or manually flushing it in RocksDB, you can modify the testPutAndGet() method in the RocksDBSessionTest class. This can be done by using the flush() method provided by RocksDB's API. Here's an example of how you could modify the testPutAndGet() method:

@Test
public void testPutAndGet() throws RocksDBException {
    // Assuming rocks.session() provides access to the underlying RocksDB instance
    // or exposes a method to flush the MemTable.

    String value = getString(this.rocks.session().get(TABLE, getBytes("person:1gname")));
    Assert.assertNull(value);

    this.rocks.session().put(TABLE, getBytes("person:1gname"), getBytes("James"));
    this.rocks.session().put(TABLE, getBytes("person:1gage"), getBytes(19));
    this.rocks.session().put(TABLE, getBytes("person:1gcity"), getBytes("Beijing"));
    this.commit();

    // Manually flush the MemTable to generate an SST file.
    // This assumes there's a method to flush or you have direct access to the RocksDB instance.
    this.rocks.session().flush();

    value = getString(this.rocks.session().get(TABLE, getBytes("person:1gname")));
    Assert.assertEquals("James", value);

    long age = getLong(this.rocks.session().get(TABLE, getBytes("person:1gage")));
    Assert.assertEquals(19, age);

    value = getString(this.rocks.session().get(TABLE, getBytes("person:1gcity")));
    Assert.assertEquals("Beijing", value);
}

Please note that the flush() method call is shown conceptually. The actual implementation depends on how the RocksDB instance is managed within your rocks.session() object. If rocks.session() does not directly expose a flush() method, you may need to add such functionality to your session management code, ensuring it properly calls flush() on the underlying RocksDB instance.

This modification manually triggers a flush of the MemTable, which should result in the creation of an SST file. However, keep in mind that frequently flushing manually may affect performance and is typically managed automatically by RocksDB based on its configuration settings.

Regarding the testInsert() method: it is not present in the context provided to me, so I cannot give details or code for it directly. However, the same approach shown above can be applied to any method where you want to trigger a flush operation.

For more details, you can refer to the RocksDBSessionTest.java file in the HugeGraph repository.

Sources
