tarantool / small

Specialized memory allocators

Create a BLOB database on top of it #9

Open olegabr opened 8 years ago

olegabr commented 8 years ago

If I understand correctly, this set of utilities provides a way to manipulate chunks of memory very efficiently.

I need to efficiently store a huge number of small images (about 20 kB each) and give a web server, nginx for example, read access to them.

The problem is that having too many plain files is bad for performance, even if a proper multilevel directory structure is used. On the other hand, nginx supports sendfile, which makes file access very efficient.

So my idea is to join many small image files into one huge, gigabyte-sized file and provide a way to address each individual image inside it. An appropriate nginx module could then be extended to pass an offset to the sendfile OS call. As a result we would get the best of both worlds: sendfile performance and a low count of open file handles, so neither the filesystem nor memory suffers.
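A minimal sketch of that idea in Python, for illustration only. The names `pack_blobs`, `read_blob`, and `send_blob` are hypothetical, the index is kept in a plain in-memory dict, and nothing here uses the `small` allocators or an nginx module. It only shows how an (offset, length) pair is enough to address one image inside a container file and hand it to `os.sendfile`, which is available on Linux and some other Unix systems.

```python
import os
import socket
import tempfile


def pack_blobs(container_path, blobs):
    """Append each blob to one container file; return {key: (offset, length)}."""
    index = {}
    with open(container_path, "ab") as f:
        for key, data in blobs.items():
            # Seek explicitly to the end so the recorded offset is the
            # position where this blob will actually be written.
            offset = f.seek(0, os.SEEK_END)
            f.write(data)
            index[key] = (offset, len(data))
    return index


def read_blob(fd, index, key):
    """Read one blob by offset without moving the shared file position."""
    offset, length = index[key]
    return os.pread(fd, length, offset)


def send_blob(sock, fd, index, key):
    """Send one blob to a socket with sendfile, starting at its offset."""
    offset, length = index[key]
    sent = 0
    while sent < length:
        # sendfile may transfer fewer bytes than requested, so loop.
        n = os.sendfile(sock.fileno(), fd, offset + sent, length - sent)
        if n == 0:
            break  # unexpected end of file
        sent += n
    return sent
```

Only one descriptor stays open for the whole container, and each request needs just the index lookup plus one `sendfile` call with the stored offset. A real store would also have to persist the index and handle deletion and compaction, which this sketch leaves out.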

I'm writing this here because I think these utilities are a good base for such a general-purpose BLOB database, and because there are people here with extensive database-building experience who can assess the value of this idea.

Thank you!

kostja commented 8 years ago

Sure, you can do it, but why not store this stuff in Tarantool in the first place? You can use https://github.com/tarantool/nginx_upstream_module to serve them as files via nginx.

olegabr commented 8 years ago

First, Tarantool is an in-memory database, and I'm thinking about terabytes and petabytes of images. Second, sendfile promises to be very fast because of kernel-level optimizations, primarily avoiding copies of the file data. I'm not sure that can be achieved with the module you mentioned.