ShifuML / guagua

An iterative computing framework for both Hadoop MapReduce and Hadoop YARN.
https://github.com/ShifuML/guagua/wiki
Apache License 2.0

Guagua

An iterative computing framework on both Hadoop MapReduce and Hadoop YARN.

News

Guagua 0.7.7 has been released with many improvements. See our change log for details.

Conference

QCON Shanghai 2014 Slides

Getting Started

Please visit the Guagua wiki (https://github.com/ShifuML/guagua/wiki) for tutorials.

What is Guagua?

Guagua, a sub-project of Shifu, is a distributed, pluggable and scalable iterative computing framework based on Hadoop MapReduce and YARN.

This graph shows the iterative computing process for Guagua.

[Figure: Guagua Process]

A typical use case for Guagua is distributed machine learning model training on Hadoop. Using Guagua, we implemented a distributed neural network algorithm that can reduce model training time from days to hours on 1 TB data sets. The distributed neural network algorithm is built on Encog and Guagua. For details, please check our example source code.
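
To make the idea of iterative distributed training concrete, here is a minimal, single-process sketch of one common pattern: in every iteration each worker computes a partial result on its own data shard, and a coordinating step combines those results into an updated model. This is an illustration only and does not use the Guagua API. The class `IterativeSketch`, the type `WorkerResult`, and the methods `workerCompute`, `masterCompute`, and `train` are hypothetical names, and the worker/master split itself is an assumption made for the example. The "model" is a single number fitted to the mean of the data by gradient descent.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a single-process simulation of an iterative
// worker/master loop. None of these names come from Guagua.
public class IterativeSketch {

    /** Partial result one worker reports after an iteration (hypothetical type). */
    static final class WorkerResult {
        final double gradientSum; // sum of per-record gradients on the shard
        final int count;          // number of records on the shard

        WorkerResult(double gradientSum, int count) {
            this.gradientSum = gradientSum;
            this.count = count;
        }
    }

    /** Worker step: gradient of 0.5 * (weight - x)^2 summed over the local shard. */
    static WorkerResult workerCompute(double[] shard, double weight) {
        double sum = 0.0;
        for (double x : shard) {
            sum += weight - x;
        }
        return new WorkerResult(sum, shard.length);
    }

    /** Master step: combine all worker results and update the shared model. */
    static double masterCompute(List<WorkerResult> results, double weight, double learningRate) {
        double gradient = 0.0;
        int count = 0;
        for (WorkerResult r : results) {
            gradient += r.gradientSum;
            count += r.count;
        }
        if (count == 0) {
            return weight; // no data, nothing to update
        }
        return weight - learningRate * gradient / count;
    }

    /** Runs the loop: workers compute, master aggregates, repeat. */
    static double train(double[][] shards, int iterations, double learningRate) {
        double weight = 0.0; // initial model
        for (int i = 0; i < iterations; i++) {
            List<WorkerResult> results = new ArrayList<>();
            for (double[] shard : shards) {
                // In a real cluster each shard would be processed on a separate node.
                results.add(workerCompute(shard, weight));
            }
            weight = masterCompute(results, weight, learningRate);
        }
        return weight;
    }

    public static void main(String[] args) {
        double[][] shards = { {1.0, 2.0}, {3.0, 4.0, 5.0} };
        // The model should approach the overall mean of the data, which is 3.
        System.out.println(train(shards, 50, 0.5));
    }
}
```

In this sketch the shards stay in memory across iterations and only small partial results are exchanged per iteration. Keeping the data in place between iterations is the general reason an iterative framework can be faster than launching a separate job for every pass over the data.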

Google Group

Please join the Guagua group if you have questions, bug reports, or anything else to discuss.

Copyright and License

Copyright 2013-2017, PayPal Software Foundation under the Apache License V2.0.