quantcast / qfs

Quantcast File System
https://quantcast.atlassian.net
Apache License 2.0

qfs use cases #241

Open jeromew opened 4 years ago

jeromew commented 4 years ago

Hello,

I understand that QFS was developed over the years as a replacement for HDFS, with better characteristics for the MapReduce use case over very large files.

I am looking into QFS for building a clustered file system out of low-end commodity VPSes (e.g. a VPS with 1 GB RAM, 1 vCPU, and a 20 GB disk), and am wondering whether the following use cases are possible and what limitations I should expect.

Thanks for your help. (I hope it is OK to ask this question here on the issue tracker; do not hesitate to tell me if there is a better place for such questions.)

mikeov commented 4 years ago

The chunk header size is 16 KiB. Each non-empty file has at least one chunk, so the disk storage overhead for files smaller than 64 MB is at minimum 16 KiB with replication 1; replication obviously acts as a multiplier. With Reed-Solomon (RS) encoding the overhead depends on the file and stripe sizes, as recovery stripes are padded. For example, a 1-byte file with 6+3 RS would occupy (16 KiB + 1) * 4 bytes, i.e. four chunks (one data plus three recovery), each with a 16 KiB header and one byte of payload. The minimum supported stripe size is 4 KiB; the default is 64 KiB.
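To make the small-file overhead concrete, here is a rough back-of-envelope sketch of the arithmetic above (16 KiB chunk header, replication as a multiplier, one data plus three recovery chunks for a tiny 6+3 RS file). It is an illustrative calculation, not QFS code, and assumes the file fits in a single stripe:

```python
# Rough storage-footprint estimate for small QFS files, based on the
# figures quoted above. Illustrative only.

CHUNK_HEADER = 16 * 1024  # bytes per chunk header

def replicated_footprint(file_size, replication):
    # Each replica of a small (< 64 MB) file carries one chunk header.
    return (CHUNK_HEADER + file_size) * replication

def rs_6_3_tiny_file_footprint(file_size):
    # A file that fits in one stripe with 6+3 RS ends up as one data
    # chunk plus three recovery chunks, each padded to the data size.
    return (CHUNK_HEADER + file_size) * 4

print(replicated_footprint(1, replication=1))  # 16385 bytes for a 1-byte file
print(rs_6_3_tiny_file_footprint(1))           # 65540 bytes, i.e. (16 KiB + 1) * 4
```

So for workloads dominated by very small files, the per-chunk header (times replication or RS chunk count) is the dominant cost, not the file data itself.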

Meta server memory utilization per file, in bytes: 72 + 104 + file_name_length.
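Reading the formula above as a per-file cost (72 + 104 bytes of fixed metadata plus the file name length), a quick estimate of meta server memory for a given file count might look like the sketch below; the average name length is an assumed value chosen purely for illustration:

```python
# Rough meta server memory estimate from the per-file formula above:
# 72 + 104 bytes plus the file name length. avg_name_len is assumed.

def metaserver_bytes(num_files, avg_name_len=32):
    return num_files * (72 + 104 + avg_name_len)

# e.g. 1 million files with ~32-character names:
print(metaserver_bytes(1_000_000) / 2**20, "MiB")  # ~198 MiB
```

On a 1 GB RAM VPS this kind of estimate gives a feel for how many files the meta server could plausibly track before memory becomes the limit.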

Support for random writes is limited and applies only to replicated (non-RS) files. With RS files, only sequential writes are supported.