1086-Maria-Big-Data / JobAdAnalytics


Concurrency #71

Closed willtsoft closed 3 years ago

willtsoft commented 3 years ago

http://www.russellspitzer.com/2017/02/27/Concurrency-In-Spark/ covers concurrency in Spark. It describes "a singleton object that controls the parallelism on a Single Executor JVM", along with some code for Cassandra, which I'm not suggesting we add at this point.
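To make the quoted idea concrete, here is a minimal sketch of a JVM-wide limiter. It is not the code from the linked post: the class name `ExecutorLimiter`, the cap `MAX_CONCURRENT`, and the method `withPermit` are hypothetical names chosen for illustration, and it uses only the Java standard library. The assumption is that all tasks on one executor run as threads inside the same JVM, so static state is shared among them. In Scala, an `object` would play the same role.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/**
 * Hypothetical JVM-wide limiter (illustrative only, not from the linked post).
 * Static state exists once per JVM, so every task thread in that JVM shares it.
 */
final class ExecutorLimiter {
    // Assumed cap on concurrent calls per JVM; tune for the real workload.
    static final int MAX_CONCURRENT = 4;

    // Initialized once when the class is loaded, which makes this a per-JVM singleton.
    private static final Semaphore PERMITS = new Semaphore(MAX_CONCURRENT);

    private ExecutorLimiter() {
        // No instances: all access goes through the static methods.
    }

    /** Runs the work while holding a permit; blocks while MAX_CONCURRENT calls are in flight. */
    static <T> T withPermit(Supplier<T> work) {
        PERMITS.acquireUninterruptibly();
        try {
            return work.get();
        } finally {
            // Always return the permit, even if the work throws.
            PERMITS.release();
        }
    }

    /** Number of permits currently free; useful for checks and logging. */
    static int availablePermits() {
        return PERMITS.availablePermits();
    }
}
```

A task would wrap its expensive call as `ExecutorLimiter.withPermit(() -> doWrite(record))`, where `doWrite` stands in for whatever operation needs throttling. This caps concurrency per JVM only and does not change how many partitions or tasks Spark schedules.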

Repartitioning looks like the best and quickest fix, as is done in IndexUtil.write.

I am trying this on EMR with entryLevel.

vinceecws commented 3 years ago

A good place to start: http://www.russellspitzer.com/2017/02/27/Concurrency-In-Spark/

rustle003 commented 3 years ago

Insufficient time to explore this at the moment.