Open lucperkins opened 8 years ago
The simpler architecture allows for a better horizontal scaling. Unlike hadoop it totally avoid locks with can become a headache once a few thousand instances are running in parallel.
This remdinds me, we should of course spawn a new gen_server
for each request in #3
I have been contemplating this issue for a while. Can we solve it with a Map/Reduce framework over the data, or do we need to employ a microbatching solution based on Apache Kafka?
What version of Hadoop do you have in mind?