ByteStorage / FlyDB

The high-performance kv storage engine based on bitcask paper made in golang
Apache License 2.0
1.25k stars 100 forks source link

Insert error when using client for different data structures or restarting server #294

Open Uncle-Justice opened 1 year ago

Uncle-Justice commented 1 year ago
  1. I cant use various data structures as a single client
uj001@uj001:~/flydb$ ./bin/flydb-client 127.0.0.1:8999

              ______    __             ____     ____ 
             / ____/   / /   __  __   / __ \   / __ )
            / /       / /   / / / /  / / / /  / /_/ /
           / /_      / /   / / / /  / / / /  / __ |
          / __/     / /   / /_/ /  / / / /  / / / / 
         / /       / /    \__, /  / /_/ /  / /_/ /  
        /_/       /_/    ,__/ /  /_____/  /_____/  
                        /____/                                              

flydb $> Sadd s 1
SAdd data success
flydb $> Sadd s 112
SAdd data success
flydb $> Sadd s1 112
SAdd data success
flydb $> put dsd 1
put data error:  client put failed: rpc error: code = Unknown desc = exceed file max content length
error: client put failed: rpc error: code = Unknown desc = exceed file max content length
  1. after i kill(ctrl+c) the server and restart, client shows the same error
github-actions[bot] commented 1 year ago

Thank you very much for your issue and we will discuss it.

qishenonly commented 1 year ago

The current utilization of the Mmap index remains relatively unstable. Switching to the new data structure is not always prompt, and reserved space may not be trimmed in a timely fashion when abnormal blocks are encountered. Consequently, this leads to errors exceeding the established file threshold.

We recommend using FileIO for the corresponding deployment and development.

Uncle-Justice commented 1 year ago

thanks for your reply, but im still little confused about your description

  1. about mmap index. In db.go, func NewDB loads datafiles into memory in a mmap way, then it just walks through all the datafiles or hintfile to build the index in memory(loadIndexFromDataFiles, loadIndexFromHintFile). So index seems to be totally born and raised as ART in memory instead of mmapping any "index file". I'm not sure if there's something I didn't understand clearly or missed about this.
  2. if you are actually talking about activefile, what's the difference about structures in the perspective of activefile? For example, if i wanna use Sadd, I just get the whole set from the db files by a key(checkAndGetSet), add sth into it, and Put the whole set back into the activefile as an appendlog(setSetToDB). So do other structures. Bitcask seems to just take bytes as value(db.Put), no matter how it means.
  3. about abnormal blocks, could you please be more specific about this situation?
  4. i did tried FileIO option and it seemed to work. I wonder whether this design(mmapIO vs fileIO) is for security reasons or for performance reasons.
qishenonly commented 1 year ago

Thank you for your questions, and I would be delighted to provide clarifications and answers to address your concerns:

  1. In terms of indexing issues, FlyDB was originally designed to have more efficient read and write speed and data persistence security. So we chose to use the in-memory index in conjunction with the disk index. Mmapcan map index files and data files to memory, and then access data efficiently through ART. ART is only a simple memory index, and cannot directly access the corresponding data index from the disk.
  2. I suppose you are confused about the different data structures stored in bytes in activefile? At this point, it is actually obvious that in actual files, we use bytecode as a form of storage for the database. db.Put is the lowest level interface, the various methods of the data structure is the interface of the upper layer, by calling db.Put to convert the serialized data structure into the form of bytes to store, and then deserialize when reading. If I misunderstood your question, please explain in more detail!
  3. Regarding abnormal blocks, my understanding is that data blocks may have become corrupted, lost, or compromised, especially on the storage side, which could lead to data loss, data corruption, or the inability to open files. At the hardware level, there are factors beyond our direct control and consideration. However, on the software side, we can take measures to address and prevent these issues. In the early conceptualization of FlyDB, we have already taken these issues into account and implemented certain security measures. For instance, we have added CRCchecks to the data to promptly detect and report errors in cases of data loss or tampering, ensuring data integrity. Furthermore, we have plans for future development to introduce data backup and recovery mechanisms, which will better address issues like data loss and data corruption. Backup and recovery will assist us in swiftly restoring data in the event of abnormal blocks, enhancing data availability and reliability.
  4. Regarding FileIO, this is the fundamental file IO system initially designed by our development team. It offers greater stability and a lower risk of security issues but has a slight performance disadvantage. To address this performance concern, we introduced mmapIO, which provides improved performance. Unfortunately, the development team working on mmapIOfaced challenges in runcate the reserved space before blocking, and as of now, there are no ideal solutions on the table.

If you have any further inquiries or uncertainties, please do not hesitate to ask questions. We are here to assist you!

Uncle-Justice commented 1 year ago

Thank you for your generous reply

  1. mmap index

    So we chose to use the in-memory index in conjunction with the disk index. MMap can map index files and data files to memory, and then access data efficiently through ART.

In the code, I do find data files mmapped in memory, but i dont find index files be mmapped. Of course i can see index in memory as an ART and in disk as hint files, but in the code, they convert to each other(func loadIndexFromHintFile & Merge), instead of being mmapped using the system call mmap. Meanwhile, datafiles do use mmap tech to execute read & write between memory and disk.

As you say, ART and hint file have totally different structures or representation forms, so surely they cannot be mmapped.

It is not a big deal. Maybe it's just a difference in terms between you & me.

  1. different structures

Under my logic, since db does not care about specific structures, we should be able to operate various data structures easily in flydb cli, just like in redis. But in this issue, i failed to do this.

  1. abnormal block

I fully understand it now, tks. So it seems irrelevant to what I'm confusing on right now.

  1. mmapIO vs fileIO

the development team working on mmapIO faced challenges in runcate the reserved space before blocking

im not sure about "reserved space before blocking". Do you mean blocking when Put sth into the active file concurrently? Since i can calculate the size of the target logrecord, what am I truncating reserved space for?

Uncle-Justice commented 1 year ago

At first, I thought you had to use mmapIO to cope with some special property of bitcask. Now after you give me more specific descriptions, i think the core of this issue is how to use mmap properly or some other design issues.

As far as I'm concerned with reading and debugging this project, I wanna start by describing my understanding of the overall code logic to ensure that everyone is on the same page.

In the initial stage of server, we init every structure service(NewStringService, NewHashService, etc.). We new a db instance for each structure. For example, the db instance of string is not the same instance as set or other structures. However, these db instances share the same activefile path.

Since structures have their own db instances, their activefiles and IOManagers are not the same. But the path of their active files are the same, while they dont share the same offset in MMapManager! It seems strange to me.

Then my issue happens, when i execute Sadd, the set's db instance starts to Put kv. This triggers creating an active file for the set's db, according to the default options. After creating an activefile named /tmp/000000000.data, the set's MMapIOManager init its offset to the active file's size, which is 0 now.

Then i execute put s1 111 in cli, the string's db instance starts to Put kv. According to the default options, it finds an activefile named /tmp/000000000.data. Then the string's MMAPIOManager inits its offset to the active file's size, which is 256MB now. string's MMAPIOManager will report the error because the offset has reached the threshold.


In conclusion, i'm confused about the design of service for structures. I think the right design should be one of the following:

  1. every structure's got its own db, own active/older/index files. So at least in cli, i can use different structures straightly just as in redis. But this design may increase the cost of maintaining the whole system, because the directory becomes more complicated.
  2. every strucure's got its own db, but share active/older/index files. And they share the mmap offset (or dynamic writePos) in IOManager, which seems not implemented in the project. This reminds me of the ref count mechanism in std::shared_ptr(C++11). This mechanism seems efficient here as well.
sjcsjc123 commented 1 year ago

Normally, data files with different data structures are separated, at least at the web level. Due to historical legacy reasons, cli has not been updated in a timely manner; In addition, we are considering consolidating all configuration data structure information into the same configuration file, and the cli may need to be changed to a case insensitive approach. Therefore, a major update may be required at the cli level. Mmap still has an issue of being incompatible with Windows.

Uncle-Justice commented 1 year ago

Glad to follow the further developments with committing code.