etcd-io / bbolt

An embedded key/value database for Go.
https://go.etcd.io/bbolt
MIT License

In-memory read-only database implementation #227

Open jdevelop opened 4 years ago

jdevelop commented 4 years ago

I need to "embed" some key/value data into my Go app (packr/rice etc.). The database is to be used for prefix lookups, read-only.

bbolt requires an *os.File instance to work with. I wonder if it is possible to generalize it to something more abstract, so that I can supply a []byte instead?

As a workaround I create a file in /tmp upon startup if it doesn't exist, then copy the content into that file and then supply this file to bbolt.Open, but that is something I'd rather try to avoid.

ptabor commented 4 years ago

How about a more lightweight solution, like https://github.com/google/btree? Is it important for your use case to keep the data embedded as a serialized buffer?

jdevelop commented 4 years ago

@ptabor thanks for the quick response. Yes, for my use case I need to embed the list of hardware I can work with into the executable. Building it upon startup might not be feasible under some circumstances. It would be ideal if I could just point to some byte buffer and pretend that it is a Bolt database. Range / prefix queries are quite useful to have.
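To make the prefix-query requirement concrete, here is a minimal sketch over a sorted in-memory key list using binary search. It is a stand-in for the idea, not bbolt's API; with bbolt the same pattern would typically be a cursor `Seek` to the prefix followed by `Next` while the key still has the prefix. The function name `prefixScan` and the sample keys are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// prefixScan returns every key in sorted that starts with prefix.
// sorted must be in ascending order.
func prefixScan(sorted []string, prefix string) []string {
	// Binary search for the first key >= prefix.
	start := sort.SearchStrings(sorted, prefix)
	end := start
	// All keys sharing the prefix are contiguous in sorted order.
	for end < len(sorted) && strings.HasPrefix(sorted[end], prefix) {
		end++
	}
	return sorted[start:end]
}

func main() {
	keys := []string{
		"nic/intel/x710",
		"gpu/nvidia/rtx3090",
		"gpu/amd/rx580",
		"gpu/nvidia/gtx1080",
	}
	sort.Strings(keys)
	fmt.Println(prefixScan(keys, "gpu/nvidia/"))
}
```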

xiang90 commented 4 years ago

> As a workaround I create a file in /tmp upon startup if it doesn't exist, then copy the content into that file and then supply this file to bbolt.Open, but that is something I'd rather try to avoid.

That is a fine idea. It is also helpful to accelerate testing. You can probably find some existing solutions that mock the os.File API to begin with.

jdevelop commented 4 years ago

Would be nice to have that openFile func(string, int, os.FileMode) (*os.File, error) function return something like io.ReadWriteSeeker instead of *os.File, which is a pointer to a struct. That way it could easily be configured to work with any random-access storage type (files, byte slices, etc.).

denisvmedia commented 3 years ago

My idea was to use something like the following:

import (
    "os"

    "github.com/spf13/afero"
    bolt "go.etcd.io/bbolt"
)

// ...
    fs := afero.NewMemMapFs()
    options := bolt.Options{
        OpenFile: func(name string, flag int, perm os.FileMode) (*os.File, error) {
            // Does not compile: fs.OpenFile returns an afero.File, not an *os.File.
            f, err := fs.OpenFile(name, flag, perm)
            if err != nil {
                return nil, err
            }
            return f, nil
        },
    }
// ...

But the thing is that afero uses an interface instead of a real *os.File:

type File interface {
    io.Closer
    io.Reader
    io.ReaderAt
    io.Seeker
    io.Writer
    io.WriterAt

    Name() string
    Readdir(count int) ([]os.FileInfo, error)
    Readdirnames(n int) ([]string, error)
    Stat() (os.FileInfo, error)
    Sync() error
    Truncate(size int64) error
    WriteString(s string) (ret int, err error)
}

So I was wondering whether boltdb could switch to the same interface. It seems that *os.File satisfies this interface completely.

UPDATE: this is not actually easily possible, because bbolt uses the Fd() method of the file and does some very specific OS-dependent operations using the real file descriptor...

missinglink commented 3 years ago

It would be pretty simple to add support for MAP_ANONYMOUS (i.e. an mmap not backed by a file).

Without changing the interfaces, this could be achieved by passing nil for *os.File and then putting guard statements in the appropriate places where the db.file.* methods are called.

In the case of a nil file, the correct FD to use is -1 and the flags should be syscall.MAP_PRIVATE | syscall.MAP_ANON. Other than that it should "just work", including calls to Mmap, Madvise, etc. Obviously Truncate and Sync wouldn't apply here.

https://man7.org/linux/man-pages/man2/mmap.2.html

@denisvmedia could you possibly replace afero with a call to shm_open?

denisvmedia commented 3 years ago

@missinglink thanks for your suggestion. It can work, but unfortunately it is not a universal/cross-platform solution.

missinglink commented 3 years ago

Agh, that's true, shm_open isn't portable. It would be nice to add support for in-memory databases, as SQLite does when the string ':memory:' is passed as the file path.

It would also make testing a lot easier, eg https://github.com/etcd-io/bbolt/blob/master/cmd/bbolt/main_test#L265-L277 could be replaced with code which doesn't touch the file system.

missinglink commented 2 years ago

Is this something people are interested in? I can open a PR to add it.

jdevelop commented 2 years ago

@missinglink definitely, I would still need this feature in order to use an in-memory data store. Please go ahead :)

missinglink commented 2 years ago

Okay, I'll look into drafting a PR. It will need a bit of discussion about the specifics, but I believe the use of MAP_ANONYMOUS will make the feature pretty much interoperable with the existing file-attached one without too many code changes.

I'm not familiar with Windows. I know it can work, but it will need additional testing there by someone with a Windows computer.

Worth mentioning that the title for this issue is a bit of a misnomer, as an in-memory database would always need to be writable due to its ephemeral nature (if you don't write to it, there is nothing to read).

denisvmedia commented 2 years ago

> I'm not familiar with Windows. I know it can work, but it will need additional testing there by someone with a Windows computer.

Actually it's possible to test directly on GitHub using GitHub Actions. They support Linux (Ubuntu), macOS, and Windows.

KastenMike commented 2 years ago

+1 for making OpenFile work with something more generic like afero's File so it's possible to use bolt with their in-memory option

mbrt commented 2 years ago

What about providing a higher level interface to separate database level from filesystem level concerns?

This is certainly more work, but the result would be cleaner. Something like:

And leaving the lower level OS specifics to concrete implementations. This would allow an in memory implementation without having to provide low level primitives such as Fd(). The higher level should only care about opening, updating and reading files and locking / unlocking them.

Caveat: I haven't actually tried this approach yet, so it might be infeasible.

missinglink commented 2 years ago

I had another look at this today and made some progress with a new interface called Backing.

The concrete implementation can be either FileBacking or MemoryBacking. These correspond to the two mmap modes MAP_FILE and MAP_ANONYMOUS.

It's not as simple as I would have hoped simply because I think MAP_ANONYMOUS wasn't considered during the initial development, so adding it now without changing method signatures is tricky, but not impossible.

I also had a quick look at afero. It's not suitable as a replacement, since it simply stores the data in a Go map on the heap rather than using the mmap syscall API: https://github.com/spf13/afero/blob/master/memmap.go#L32

missinglink commented 2 years ago

// backing abstracts whatever sits behind the database's memory map.
type backing interface {
    Fd() int
    Fsync() error
    Truncate(int64) error
    Sync() error
    Flock(bool, time.Duration) error
    File() *os.File
    Open(string, os.FileMode, *Options) error
    Close() error
    ShouldInit() (bool, error)
}

// memoryBacking has no file behind it (MAP_ANONYMOUS). Fd returns -1, every
// file operation is a no-op, and ShouldInit always reports true.
type memoryBacking struct{}

func (mb memoryBacking) Fd() int                                        { return -1 }
func (mb memoryBacking) Fsync() error                                   { return nil }
func (mb memoryBacking) Truncate(size int64) error                      { return nil }
func (mb memoryBacking) Sync() error                                    { return nil }
func (mb memoryBacking) Flock(e bool, t time.Duration) error            { return nil }
func (mb memoryBacking) File() *os.File                                 { return nil }
func (mb memoryBacking) Open(p string, m os.FileMode, o *Options) error { return nil }
func (mb memoryBacking) Close() error                                   { return nil }
func (mb memoryBacking) ShouldInit() (bool, error)                      { return true, nil }

// fileBacking wraps a real file (MAP_FILE) and delegates to the file and to
// the existing fdatasync and flock helpers.
type fileBacking struct {
    db   *DB
    file *os.File
}

func (fb fileBacking) Fd() int                             { return int(fb.file.Fd()) }
func (fb fileBacking) Fsync() error                        { return fdatasync(fb.db) }
func (fb fileBacking) Truncate(size int64) error           { return fb.file.Truncate(size) }
func (fb fileBacking) Sync() error                         { return fb.file.Sync() }
func (fb fileBacking) Flock(e bool, t time.Duration) error { return flock(fb.db, e, t) }
func (fb fileBacking) File() *os.File                      { return fb.file }
plus implementations of <Open> and <Close> which are more verbose

denisvmedia commented 2 years ago

Well, one of the use cases for a potential in-memory storage is unit tests. For them we would probably not need any syscall...

missinglink commented 2 years ago

> Well, one of the use cases for a potential in-memory storage is unit tests. For them we would probably not need any syscall...

Yeah, that's true. But people familiar with the mmap syscall API would assume that you can map a segment of memory larger than the available RAM and that the OS will transparently handle this for you. That isn't the case for a solution where the bytes are held in heap memory in-process.

Particularly for testing it would be better to use the same storage engine since otherwise we wouldn't be able to guarantee that the behaviour which was tested would be exactly the same using the native mmap API in production.

denisvmedia commented 2 years ago

> Particularly for testing it would be better to use the same storage engine since otherwise we wouldn't be able to guarantee that the behaviour which was tested would be exactly the same using the native mmap API in production.

In general, yes, but if the app relies on (in other words trusts) the behavior of BoltDB, it probably can assume it won't be 100% accurate in the tests. Anyway, I get your point, and your arguments make sense as well.

Another concern here would be - is it actually cross-platform? Will it run on Windows in your proposal?

missinglink commented 2 years ago

> Another concern here would be - is it actually cross-platform? Will it run on Windows in your proposal?

Yes, the Windows equivalent of mmap is called MapViewOfFile and has an equivalent mode to MAP_ANONYMOUS.

mbrt commented 2 years ago

> Particularly for testing it would be better to use the same storage engine since otherwise we wouldn't be able to guarantee that the behaviour which was tested would be exactly the same using the native mmap API in production.

Yeah true. My only concern would be that the interface could be overly restrictive / unnecessarily complicated for non-test use cases. For example, I was toying with the idea of using cloud storage as a backing store [I know it sounds like a stupid idea :)]. The mmap solution wouldn't work in that case and also seems unnatural for the OP's use case.

I'm also worried about the leaky abstraction of Fd(), where you return -1 and expect the caller to know what to do about it. It's oftentimes a bad sign.

Anyway, just my 2 cents.

missinglink commented 2 years ago

Agh, I see, there are two different approaches here.

I think you're talking about providing a storage adapter pattern which would allow persistence to arbitrary 'file-like things'.

What I'm talking about is just extending the existing implementation to allow different flags and therefore not require the database be backed by an actual file.

The -1 convention for the fd is inherited from Unix IIRC, although it doesn't fit in a uintptr, so maybe I'm wrong about that. I forget where I read it; in a man page somewhere.

The thing about general-purpose storage engines is that they need to be 'block devices' or imitate one, so they need to read/write 4 KB page blocks.

The mmap API is ideal for this since it transparently caches pages in the filesystem cache, meaning that pages are often in RAM rather than on disk, which is a huge performance benefit that isn't available when using other storage engines.

There are some b-tree operations that can be quite heavy on the pager, such as balance/merge operations. I suspect these would be very slow using something like HTTP range requests and PUT requests.
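To illustrate what page-granular access over HTTP would involve, here is a small sketch that converts a page number into a byte range and formats the matching `Range` header value. The 4096-byte page size and the helper names `pageRange` and `rangeHeader` are assumptions made for this example.

```go
package main

import "fmt"

// pageSize is assumed to be 4 KB for this sketch.
const pageSize = 4096

// pageRange returns the inclusive byte range covered by n consecutive pages
// starting at page id.
func pageRange(id, n int64) (first, last int64) {
	first = id * pageSize
	last = first + n*pageSize - 1
	return first, last
}

// rangeHeader formats the value of an HTTP Range header for those pages.
func rangeHeader(id, n int64) string {
	first, last := pageRange(id, n)
	return fmt.Sprintf("bytes=%d-%d", first, last)
}

func main() {
	fmt.Println(rangeHeader(0, 1))
	fmt.Println(rangeHeader(3, 2))
}
```

Every page touched during a rebalance or merge would cost one such round trip, which is where the expected slowness comes from.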

I think the two concerns are actually better implemented at different levels. The mmap block storage adapter can be modified to support anonymous backings while separately the concept of a block storage adapter can also be converted to an adapter pattern.

Sounds like a lot of work, and considering how hard it is to get a typo fix merged on this repo I don't see it being worth the effort 🤷‍♂️

mbrt commented 2 years ago

> I think the two concerns are actually better implemented at different levels.

Fair point.

missinglink commented 2 years ago

I found this commit which seems to implement exactly this functionality:

https://github.com/boltdb/bolt/commit/ed31a3bd0058b926bb17cae9293a8af6e6f1c066

0x0177b11f commented 2 years ago

320 modified from commit https://github.com/boltdb/bolt/commit/ed31a3bd0058b926bb17cae9293a8af6e6f1c066

Added a simulation test; more tests may be needed.

This project needs CI 😭

benma commented 1 year ago

> Would be nice to have that openFile func(string, int, os.FileMode) (*os.File, error) function return something like io.ReadWriteSeeker instead of *os.File, which is a pointer to a struct. That way it could easily be configured to work with any random-access storage type (files, byte slices, etc.).

This would be great. I am interested in this so I can create an overlay filesystem that encrypts the database before writing it to the underlying filesystem.

Elbehery commented 8 months ago

Is anyone assigned to this?

SaintWish commented 6 months ago

I would like to see this feature as well, mainly for speeding up tests. If no one else is willing to take on the task I could attempt it, though I don't know much about writing in-memory stuff.