ColinOOOOOh / fmSystem

graduate design

Is it significant to rewrite HDFS, Hadoop/Spark or HBase in C/C++ to improve computational efficiency? #3

Open DaviesX opened 7 years ago

DaviesX commented 7 years ago

Is it significant to rewrite HDFS, Hadoop/Spark or HBase in C/C++ to improve computational efficiency?

As we all know, Java or Scala is very slow compared to C/C++. I wonder whether it is significant to rewrite the low-level implementation in C/C++ and keep the user interface in a high-level language.
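To make the question concrete, below is a minimal sketch of the split I have in mind (all names are invented for illustration): the performance-critical core is plain C++ exposed through a flat C ABI, and the user-facing API stays in a high-level language that binds to it via JNI or FFI.

```cpp
// Hypothetical sketch: the performance-critical core lives in C++ and is
// exposed through a flat C ABI, while the user-facing API stays in a
// high-level language that binds to it via JNI/FFI.
// All names here are invented for illustration.
#include <cstdint>
#include <vector>

namespace core {
// Low-level block store kept entirely in native code.
class BlockStore {
 public:
  // Stores a copy of the bytes and returns a block id.
  int64_t Write(const uint8_t* data, int64_t len) {
    blocks_.emplace_back(data, data + len);
    return static_cast<int64_t>(blocks_.size()) - 1;
  }

 private:
  std::vector<std::vector<uint8_t>> blocks_;
};
}  // namespace core

// Flat C ABI: the only surface the high-level front end needs to see.
extern "C" {
void* blockstore_create() { return new core::BlockStore(); }
int64_t blockstore_write(void* handle, const uint8_t* data, int64_t len) {
  return static_cast<core::BlockStore*>(handle)->Write(data, len);
}
void blockstore_destroy(void* handle) {
  delete static_cast<core::BlockStore*>(handle);
}
}
```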

Yes.

It is very significant. At MapR we have done exactly this with fairly impressive results. We used C++, restricted to a Google-style subset of the language. Cutting to the final results, the benefits are:

1) much higher performance in terms of bandwidth and throughput

2) much lower and more predictable latency

3) the re-architecture involved in re-implementing HDFS allowed us to rethink many of HDFS's limitations, such as the inherent single point of failure represented by the name-node, the fact that files cannot be updated randomly, and the very limited file-commit rate.

It is impossible to completely disentangle how much of the benefit came from the language choice and how much came from the ground-up re-architecture, but here are some key points that relate to language:

On the other hand, coding at a lower level is more expensive and can be more error-prone if you don't have the right coders. We have the right ones at MapR, but most places don't.

The overall design of the system is that the core file system (which includes the distributed file system and the NoSQL database implementation) is written in C++. Wire protocols are built uniformly on protobufs, and clients are available in C or Java. In some cases, the Java client code uses JNI to access shared memory efficiently.
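As a rough illustration of that JNI technique (a minimal sketch under my own assumptions, not our actual code; the class, method, and segment names are hypothetical), the native side can map a shared-memory segment and hand it back to Java as a direct ByteBuffer:

```cpp
// Minimal sketch of the JNI idea mentioned above -- not production code.
// The Java class/method (com.example.NativeClient.mapSharedRegion) and the
// shared-memory segment name are hypothetical. The native side maps a POSIX
// shared-memory segment and returns it to Java as a direct ByteBuffer, so the
// Java client can read the data without an extra copy.
#include <jni.h>

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

extern "C" JNIEXPORT jobject JNICALL
Java_com_example_NativeClient_mapSharedRegion(JNIEnv* env, jclass,
                                              jstring jname, jlong size) {
  const char* name = env->GetStringUTFChars(jname, nullptr);
  int fd = shm_open(name, O_RDONLY, 0);  // open existing shared-memory segment
  env->ReleaseStringUTFChars(jname, name);
  if (fd < 0) return nullptr;

  void* addr = mmap(nullptr, static_cast<size_t>(size), PROT_READ,
                    MAP_SHARED, fd, 0);
  close(fd);  // the mapping stays valid after the descriptor is closed
  if (addr == MAP_FAILED) return nullptr;

  // Wrap the mapping as a direct ByteBuffer; Java sees the bytes zero-copy.
  return env->NewDirectByteBuffer(addr, size);
}
```

On the Java side this shows up as a `native` method returning a `java.nio.ByteBuffer`; treat the sketch purely as an illustration of the mechanism, not of how our shared memory is actually laid out.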

The results are very impressive. For instance, stock HDFS can commit metadata changes only a few hundred times per second; in a large minute-sort benchmark, a MapR system was clocked at about 15 million metadata updates per second, including file creation, deletion, and so on.

Another area of improvement is overall scalability. Whereas HDFS limits you to around 100 million files on a large cluster, and fewer on smaller clusters, MapR clusters have run with billions of files and no performance degradation.

In production, these benefits typically show up as smaller clusters needed to meet a given throughput target, with noticeably smaller and more predictable latencies.

One primary goal for the development was to present the cluster as a fully functional, conventional storage system so that legacy code can work interchangeably with Hadoop code. This is done by exposing a fully distributed NFS interface alongside the standard HDFS API. In addition, MapR systems implement completely atomic snapshots and mirrors: snapshots capture the instantaneous state of the file system, while mirrors propagate that state to a remote replica.
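To make the legacy-code point concrete, here is a minimal sketch (the mount point and path are hypothetical) of a legacy program reading cluster data through the NFS interface with plain POSIX calls and no Hadoop client library:

```cpp
// Small illustration of the "legacy code works unchanged" point: because the
// cluster is exported over NFS, ordinary POSIX I/O is all a legacy program
// needs -- no Hadoop client library. The mount point and path are hypothetical.
#include <cstdio>

#include <fcntl.h>
#include <unistd.h>

int main() {
  // The same data a MapReduce job reads through the HDFS-style API is visible
  // here as a plain file under the NFS mount.
  int fd = open("/mapr/my.cluster/user/alice/results/part-00000", O_RDONLY);
  if (fd < 0) {
    perror("open");
    return 1;
  }

  char buf[4096];
  ssize_t n;
  while ((n = read(fd, buf, sizeof buf)) > 0) {
    fwrite(buf, 1, static_cast<size_t>(n), stdout);  // stream the file to stdout
  }
  close(fd);
  return 0;
}
```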

This answer is getting kind of long, but if you have specific questions, I can expand it as needed.

DaviesX commented 7 years ago

Quantcast uses a C/C++-based distributed file system, QFS, which is an almost drop-in replacement for HDFS. They seem to be managing a few petabytes of data with QFS. You can refer to their GitHub wiki for a comparison with HDFS. QFS is based on the Kosmos file system.

https://github.com/quantcast/qfs/wiki

qfs-ovsiannikov.pdf