manuzhang / read-it-now

Don't read it later; read it now

Merger of Cloudera and Hortonworks #29

Open manuzhang opened 6 years ago

manuzhang commented 6 years ago

Breaking news from last week: Cloudera and Hortonworks announced a merger, with Cloudera stockholders taking 60% of the equity and Cloudera's CEO Tom Reilly becoming CEO of the combined company. (It seems Cloudera has got a better name 😅)

From Tom Reilly's statement:

By bringing together Hortonworks’ investments in end-to-end data management with Cloudera’s investments in data warehousing and machine learning, we will deliver the industry’s first enterprise data cloud from the Edge to AI

However, John Schroeder, CEO of MapR, didn't think so:

The merger is about cutting costs...The merger announcement says these redundant technologies will be "unified," meaning some will be discontinued, causing customers undue switching cost pain

According to John Schroeder, the two companies' bet on commodity Hadoop fails to meet the demands of advanced AI and analytics, while MapR delivers today:

Simply put, commodity Hadoop falls short of today's customer needs when it comes to AI, analytics, and the cloud, and this makes it hard to find a path to profitability and sustained growth.

Today we already have MapR deployed on oil rigs and in medical devices. A quarter of MapR business is in the public cloud.

Meanwhile, most comments on Twitter are positive.

I take a different view from John Schroeder. The merger cuts costs for the open source community: with no more redundant technologies, engineers from both companies can unite behind the same open source projects, which should then evolve much faster. The big elephant was too slow.

manuzhang commented 6 years ago

A follow-up from Datanami:

Cloudera executives have stated that customers running the latest releases of CDH, HDP, and Hortonworks DataFlow (HDF) will be fully supported for “at least three years.”

Cloudera plans a "unity" release that combines CDH and HDP


The future of the Hadoop platform's current building blocks, YARN and HDFS, looks in doubt:

the market momentum behind Kubernetes is so great that the containerized technology has essentially already been declared the de-facto resource manager of the future

the cost and scalability advantages of object stores has become too great to ignore anymore, so eventually on-premise big data clusters will utilize an object store with an S3 API

Can HDFS evolve to fit cloud usage with efforts like Ozone? Meanwhile, LinkedIn has extended YARN's machine learning capabilities with TonY.