How do we kill a node while Spark/Hadoop is running? Find the daemon's process by name (e.g. with `top -c`, which shows full command lines) and `kill` it.
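A minimal sketch of the find-and-kill approach. On a real cluster the target would be the node's daemon JVM (e.g. a Spark "Worker" or a Hadoop "DataNode", visible via `jps`, `top -c`, or `pgrep -f Worker`); here a `sleep` child process stands in for it, which is an assumption for illustration only:

```python
import os
import signal
import subprocess

# Stand-in for the node's daemon process (hypothetical; on a cluster this
# would be an already-running JVM such as a Spark Worker).
victim = subprocess.Popen(["sleep", "300"])

# On a cluster you would look the PID up by name, e.g. `pgrep -f Worker`
# or by reading it off `top -c`; here we already have it.
pid = victim.pid

os.kill(pid, signal.SIGTERM)   # escalate to SIGKILL if the process ignores SIGTERM
victim.wait()                  # reap the child; returncode is negative on signal death
print(victim.returncode)       # -15 == terminated by SIGTERM
```

Note that killing the daemon process simulates a node failure only for that service; pulling the plug on the whole machine is a stronger failure mode.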
"Total time spent by all maps" - does it mean t(map1) + t(map2), or the wall-clock span from the start of the earlier task to the end of the later one? (The Hadoop counter "Total time spent by all map tasks (ms)" is the former: per-task durations summed, so with tasks running in parallel it can exceed the wall-clock time.)
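A toy illustration of the difference between the two readings, with two hypothetical overlapping map tasks (the task times are made up for the example):

```python
# (start, end) of each map task, in seconds since job start.
tasks = [(0, 40), (10, 30)]  # task 1 runs 40 s, task 2 runs 20 s

# "Total time spent by all maps": per-task durations summed
# (cumulative across tasks, which is what the Hadoop counter reports).
total_task_time = sum(end - start for start, end in tasks)

# Wall-clock span: earliest start to latest end.
wall_clock = max(end for _, end in tasks) - min(start for start, _ in tasks)

print(total_task_time)  # 60
print(wall_clock)       # 40
```

With fully parallel tasks the summed figure is larger than the wall-clock span, so the two interpretations are easy to tell apart in real job output.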
The glossary is inconsistent - sometimes it uses "node", sometimes "machine". Is that OK?
For PageRank, is it really an average? The article doesn't say. - No need to find out.
One more PageRank experiment: "with Pregel". (What does this mean? Pregel is Google's vertex-centric, bulk-synchronous graph processing model; Spark's GraphX library exposes a Pregel-style API, so this presumably means running PageRank through GraphX's Pregel operator.)
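To make "with Pregel" concrete, here is a minimal vertex-centric PageRank sketch in plain Python (GraphX's actual Pregel API is Scala; this only illustrates the model). In each superstep every vertex sends rank/outdegree along its out-edges, then recomputes its rank from the messages it received. The damping factor 0.85 and the fixed superstep count are assumptions, and the sketch assumes every vertex has at least one out-edge:

```python
def pregel_pagerank(edges, num_supersteps=20, d=0.85):
    """Pregel-style PageRank: alternate message-passing and vertex-update phases."""
    vertices = {v for e in edges for v in e}
    n = len(vertices)
    out_deg = {v: 0 for v in vertices}
    for src, _ in edges:
        out_deg[src] += 1
    rank = {v: 1.0 / n for v in vertices}  # uniform initial rank

    for _ in range(num_supersteps):
        # Superstep, phase 1: each vertex sends rank/outdegree to its out-neighbors.
        inbox = {v: 0.0 for v in vertices}
        for src, dst in edges:
            inbox[dst] += rank[src] / out_deg[src]
        # Superstep, phase 2: each vertex updates its rank from received messages.
        rank = {v: (1 - d) / n + d * inbox[v] for v in vertices}
    return rank

# Tiny 3-node cycle: by symmetry every rank stays at 1/3.
ranks = pregel_pagerank([("a", "b"), ("b", "c"), ("c", "a")])
print(ranks)
```

The "bulk-synchronous" part is the barrier between supersteps: all messages from superstep *k* are delivered before any vertex computes in superstep *k*+1, which the two-phase loop above mimics.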
How do we count repetitions? If different participants each run it, for a total of 3 runs, does that count as 3 repetitions? - No need to find out; we will do 20 repetitions for each.
Watch out, experiment participants!!!! The descriptions were copied from the article - should we add a reference there??? Paraphrasing would also be okay.
The pi programs in Spark and Hadoop use different algorithms (Spark's SparkPi example draws random points; Hadoop's pi example, QuasiMonteCarlo, uses a quasi-random Halton sequence). Which parameters make them comparable?
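The knob to align is most likely the total number of sample points: Hadoop's example takes it as nMaps × nSamples on the command line, while SparkPi derives it from the number of partitions (a fixed per-partition sample count times the slice count, in the stock example). A minimal sketch of the random-sampling variant, to show that the sample count is the only estimation parameter; the seed is an assumption added for reproducibility:

```python
import random

def monte_carlo_pi(num_samples, seed=42):
    """Estimate pi by uniform sampling in the unit square, as SparkPi does.
    (Hadoop's pi example instead walks a deterministic Halton sequence,
    so its estimate differs even for an equal sample count.)"""
    rng = random.Random(seed)
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:   # point falls inside the quarter circle
            inside += 1
    return 4.0 * inside / num_samples

estimate = monte_carlo_pi(100_000)
print(estimate)
```

So for a fair comparison, match the total sample count between the two jobs and report the estimation error separately from the runtime, since the quasi-random sequence converges faster than random sampling at the same count.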
Questions