basho / bitcask

because you need another a key/value storage engine
1.29k stars 173 forks source link

Refactor handling of file descriptors #199

Open slfritchie opened 10 years ago

slfritchie commented 10 years ago

Here's an issue to yank out of my todo emailbox and into a more difficult to forget forum. Bitcask's handling of large numbers of file descriptors should probably change. From @engelsanchez: "It would be easy to change to a dict or small -> list/big -> dict hybrid. Notice though that my code is reusing the same list manipulation that we use to look up files on a bitcask:get(), so I wouldn't expect any new surprises there."

engelsanchez commented 10 years ago

Thanks for moving from internal TODO to an issue Scott! For the record, Scott is talking about several places in Bitcask that handle file descriptors in lists. For example, on a get to fetch the descriptor for the file that contains the value. That makes all of these operations O(n) when they should be closer to O(1). This fact is probably hidden by any number of other inefficiencies, but there is no reason to keep it the way it is, except time to refactor and re-verify.

Some examples. There might be more: