ligurio / unreliablefs

A FUSE-based fault injection filesystem.
https://ligurio.github.io/unreliablefs/unreliablefs.1.html
MIT License

Test 1-byte reads #86

Open vinipsmaker opened 3 years ago

vinipsmaker commented 3 years ago

Could there be an option where read() only reads 1 byte at a time? It'd be useful to test buffer management routines on parsers.

ligurio commented 3 years ago

Hi! It's a good idea! Could you tell me more about your use case? Which parsers are you testing?

vinipsmaker commented 3 years ago

> Could you tell me more about your use case?

Sure.

I had the same issue in the past, and back then I'd just mock the read routines and force 1-byte reads through the mocks. I don't like this approach because I'm effectively not testing the same code that goes into production (the read routines change).
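For context, the mock approach mentioned above could look roughly like the following. This is only a sketch: `capped_read` is a hypothetical test-only helper (it is not part of unreliablefs or POSIX), and it assumes the production code can be pointed at it instead of calling `read()` directly.

```cpp
#include <unistd.h>

#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical test-only wrapper: behaves like read(2), but never asks the
// kernel for more than `cap` bytes, so each call returns at most `cap` bytes.
ssize_t capped_read(int fd, void* buf, std::size_t count, std::size_t cap = 1) {
    assert(cap > 0);  // a cap of 0 would turn every call into a zero-length read
    return ::read(fd, buf, std::min(count, cap));
}
```

The drawback is the one described above: the code under test has to call `capped_read` instead of `read`, so the tested binary differs from the production one.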

A few hours ago I ran into this issue again and went looking for a new solution. FUSE seemed like a good idea, and I stumbled upon your project, so I thought, “why not ask?” For my current project, I have a routine like the following:

for (;;) {
    // Look for a complete LF-terminated record in the bytes buffered so far.
    std::string_view buffer_view(buffer.data(), buffer_used);
    auto lf = buffer_view.find('\n');
    if (lf != std::string_view::npos) {
        line = buffer_view.substr(0, lf + 1);
        break;
    }

    // The buffer is full and holds no LF yet: grow it before reading more.
    if (buffer.size() == buffer_used)
        buffer.resize(buffer.size() * 2);

    ssize_t nread = read(iobuf->fd,
                         const_cast<char*>(buffer.data()) + buffer_used,
                         buffer.size() - buffer_used);
    if (nread == -1) {
        *errcode = errno;
        return EOF;
    }
    if (nread == 0) {
        // End of file: report EOF, or return the final unterminated record.
        if (buffer_used == 0)
            return EOF;

        line = std::string_view{buffer.data(), buffer_used};
        break;
    }
    buffer_used += nread;
}

In short, I read LF-delimited records. The code is very simple and I'm not worried about it. However, I'll have to change this read loop to use a chunked JSON parser to find the record delimiter. The chunked JSON parser maintains pointers into the stream. If I get this wrong, the parser will keep references to invalid memory. That's a much more complex beast.
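To illustrate the kind of hazard meant here: growing a `std::string` buffer may reallocate its storage, which invalidates any raw pointer or `std::string_view` taken before the resize. One common mitigation, sketched below, is to store offsets and rebuild views on demand. `TokenSpan` and `resolve` are hypothetical names for illustration only; whether a given JSON parser can work with offsets is an assumption, not something stated in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical token record: positions are stored as offsets into the
// buffer rather than as raw pointers, so they survive a reallocation.
struct TokenSpan {
    std::size_t offset;
    std::size_t length;
};

// Rebuild a view from the buffer's current storage each time it is needed.
std::string_view resolve(const std::string& buffer, TokenSpan span) {
    assert(span.offset + span.length <= buffer.size());
    return std::string_view(buffer).substr(span.offset, span.length);
}
```

A span recorded before `buffer.resize(...)` still resolves to the same bytes afterwards, whereas a pointer or view captured before the resize may dangle.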

I'd like to run the tests with every read() call on the target fd/file returning only 1 byte at a time. That would force the parser to stop at every token and ask for more bytes, and it would verify that I handle incomplete streams at every layer of my code (triggering new, correct reads as needed).
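The property such a test would check is that the records produced do not depend on how the input is chunked. A minimal in-memory sketch of that check follows. `split_records` is a hypothetical stand-in for the real read loop, not the project's code, and it does not replace testing the real `read()` path through a filesystem.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the read loop: consumes `input` at most `chunk`
// bytes at a time and returns LF-terminated records. A trailing record
// without an LF is returned as-is, mirroring the EOF branch of the loop.
std::vector<std::string> split_records(std::string_view input, std::size_t chunk) {
    assert(chunk > 0);  // a zero-byte chunk would never make progress
    std::vector<std::string> records;
    std::string pending;  // bytes received so far that do not yet form a record
    for (std::size_t pos = 0; pos < input.size();) {
        std::size_t n = std::min(chunk, input.size() - pos);  // simulated short read
        pending.append(input.substr(pos, n));
        pos += n;
        std::size_t lf;
        while ((lf = pending.find('\n')) != std::string::npos) {
            records.push_back(pending.substr(0, lf + 1));
            pending.erase(0, lf + 1);
        }
    }
    if (!pending.empty())
        records.push_back(pending);  // incomplete final record at end of input
    return records;
}
```

A test can then compare `split_records(input, 1)` against `split_records(input, input.size())` and expect identical results for every chunk size in between.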

ligurio commented 3 years ago

I'll try to include it in the next release.

vinipsmaker commented 3 years ago

Thanks.