peterbourgon / diskv

A disk-backed key-value store.
http://godoc.org/github.com/peterbourgon/diskv
MIT License

Suggestion: Improve performance by maintaining structs #43

Status: Closed (WinstonPrivacy closed 2 years ago)

WinstonPrivacy commented 6 years ago

Currently diskv maintains in-memory values in []byte format. It would be excellent from a performance standpoint if arbitrary structs could be maintained instead as this would eliminate the overhead of deserialization. I suspect that the performance improvement would be 1-2 orders of magnitude depending on the application.

peterbourgon commented 6 years ago

I'm not sure how this makes sense. All structures eventually need to be serialized to []byte to make it to disk; there's no avoiding it. By using []byte we let the user pick whatever serialization makes sense for their use case.

WinstonPrivacy commented 6 years ago

The key is that they eventually get serialized to disk. If you are reading and updating the key several hundred times a second or have decoupled the memory and disk (as we've done in our fork), this eliminates a lot of overhead.

peterbourgon commented 6 years ago

I understand. I guess what I'm saying is that your use case may not be a good fit for the original design goals of diskv, which reduces cognitive overhead for users by not allowing this decoupling. But if it's not too disruptive to the API, maybe it can be added. I'll have to make a judgment call :)

Oh, I thought I was replying to the other issue.

Yeah, this would definitely be a step too far. The only way to do this would be to store values as interface{}, or something custom like

type Serializable interface {
    Size() int64
    encoding.BinaryMarshaler
}

and I'm not interested in that.

peterbourgon commented 6 years ago

On second thought, that second interface might work. The degenerate case would be

type SerializableBytes []byte

func (b SerializableBytes) Size() int64 {
    // len returns int, so convert explicitly to match the int64 return type.
    return int64(len(b))
}

func (b SerializableBytes) MarshalBinary() ([]byte, error) {
    return b, nil
}
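
As a rough illustration of why the interface needs Size, here is a minimal sketch of a size-bounded in-memory cache that holds Serializable values instead of []byte. The boundedCache type, its put method, and errTooLarge are hypothetical names invented for this example; they are not part of diskv, and diskv's real cache may behave differently (for instance, by evicting entries rather than refusing new ones).

```go
package main

import (
	"encoding"
	"errors"
	"fmt"
)

// Serializable mirrors the interface proposed in the thread.
type Serializable interface {
	Size() int64
	encoding.BinaryMarshaler
}

// SerializableBytes is the degenerate case: bytes that are already serialized.
type SerializableBytes []byte

func (b SerializableBytes) Size() int64 { return int64(len(b)) }

func (b SerializableBytes) MarshalBinary() ([]byte, error) { return b, nil }

// errTooLarge is a hypothetical error for values that would break the limit.
var errTooLarge = errors.New("value does not fit within the cache size limit")

// boundedCache is a hypothetical cache, not diskv's implementation.
// It tracks the total reported size of its values against maxSize.
type boundedCache struct {
	maxSize int64
	used    int64
	items   map[string]Serializable
}

func newBoundedCache(maxSize int64) *boundedCache {
	return &boundedCache{maxSize: maxSize, items: make(map[string]Serializable)}
}

// put stores v under key if its reported size fits within the limit.
// Any existing entry for key is dropped first, even if v is then rejected.
func (c *boundedCache) put(key string, v Serializable) error {
	if old, ok := c.items[key]; ok {
		c.used -= old.Size()
		delete(c.items, key)
	}
	size := v.Size()
	if c.used+size > c.maxSize {
		return errTooLarge
	}
	c.items[key] = v
	c.used += size
	return nil
}

func main() {
	c := newBoundedCache(8)
	fmt.Println(c.put("a", SerializableBytes("12345"))) // <nil>
	fmt.Println(c.put("b", SerializableBytes("12345"))) // value does not fit within the cache size limit
	fmt.Println(c.used)                                 // 5
}
```

The accounting only stays correct if Size returns the same number when a value is stored and when it is later removed, so a mutable struct held in such a cache would need extra care.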

peterbourgon commented 6 years ago

Could you make your custom structs able to report their serialized size?

WinstonPrivacy commented 6 years ago

Hmmm... I'm not totally sure I'm following you (my fault, not yours). But I'm using GOB encoding, so I think the only way to figure out their size would be to serialize them, which defeats the point.
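
For comparison, an encoding with a predictable layout can report its size from the fields alone, without serializing. The sketch below assumes a hypothetical record type and a hand-rolled length-prefixed format (neither comes from diskv or from the fork discussed here); it is not a claim about how gob behaves.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// record is a hypothetical application struct used only for illustration.
type record struct {
	ID   uint64
	Name string
}

// Size computes the encoded length from the fields:
// 8 bytes for ID, a 4-byte length prefix, then the bytes of Name.
func (r record) Size() int64 {
	return int64(8 + 4 + len(r.Name))
}

// MarshalBinary writes the same layout that Size describes.
func (r record) MarshalBinary() ([]byte, error) {
	buf := make([]byte, r.Size())
	binary.BigEndian.PutUint64(buf[0:8], r.ID)
	binary.BigEndian.PutUint32(buf[8:12], uint32(len(r.Name)))
	copy(buf[12:], r.Name)
	return buf, nil
}

func main() {
	r := record{ID: 7, Name: "diskv"}
	b, err := r.MarshalBinary()
	if err != nil {
		panic(err)
	}
	fmt.Println(r.Size(), len(b)) // 17 17
}
```

The trade-off is that Size and MarshalBinary must be kept in sync by hand, which a self-describing format such as gob avoids.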

peterbourgon commented 6 years ago

We need to know the size to be able to abide by the CacheSizeMax requirement. I'm not aware of a way to get the size of a struct without using unsafe.
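
To make the sizing problem concrete, here is a small sketch of what the standard library can and cannot report. The fixed and withString types and the two helper functions are hypothetical names for this example. binary.Size only handles fixed-size values and returns -1 otherwise, and unsafe.Sizeof is shallow: it measures the struct header, not the data a string or slice field points to.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unsafe"
)

// fixed contains only fixed-size fields.
type fixed struct {
	ID    uint64
	Flags uint32
}

// withString contains a variable-length field.
type withString struct {
	ID   uint64
	Name string
}

// encodedSize reports how many bytes encoding/binary would write for v,
// or -1 if v is not a fixed-size value.
func encodedSize(v interface{}) int {
	return binary.Size(v)
}

// sameShallowSize reports whether unsafe.Sizeof sees a and b as equally large.
func sameShallowSize(a, b withString) bool {
	return unsafe.Sizeof(a) == unsafe.Sizeof(b)
}

func main() {
	fmt.Println(encodedSize(fixed{}))      // 12
	fmt.Println(encodedSize(withString{})) // -1

	short := withString{Name: "a"}
	long := withString{Name: "a much longer name than the other one"}
	fmt.Println(sameShallowSize(short, long)) // true
}
```

Neither call gives the serialized size of an arbitrary struct, which is why the proposed interface asks the value itself to report it.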