Inconsistent statistics on SparkUI

@jeyhunkarimov Hi Jeyhun,

I'm doing some benchmarking on the cluster and I'm confused by the information shown in the Spark UI. The setup is as follows:

1) I'm using a textFileStream as the input source.
2) I copy one 1.6GB file to HDFS, and Spark recognizes the new file.
3) However, if I check the "Stages" section, I see two stages, each with 128MB of input.
4) And if I check the "Executors" section, I see that the driver has 268.4MB of input.

I'm confused for two reasons: first, 2 * 128MB != 268.4MB, and second, I was expecting to see 1.6GB of input instead of 268.4MB.

Do you have any idea where I'm going wrong?
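
On the first point, one possible explanation is a unit mismatch rather than missing data. The sketch below assumes (this is not confirmed) that the "128MB" on the Stages page is really 128 MiB (binary, 1024 * 1024 bytes) and that the Executors page shows the same byte count in decimal megabytes (1000 * 1000 bytes). Under that assumption the two numbers agree. The variable names are only illustrative.

```python
# Assumption: the Stages page reports binary units (MiB) labeled as "MB",
# while the Executors page reports decimal megabytes.
MIB = 1024 ** 2  # bytes in one binary megabyte
MB = 1000 ** 2   # bytes in one decimal megabyte

stage_input_bytes = 128 * MIB        # input of one stage, read as 128 MiB
total_bytes = 2 * stage_input_bytes  # the two stages added together
total_decimal_mb = total_bytes / MB  # same byte count in decimal MB

print(total_bytes)                 # 268435456
print(round(total_decimal_mb, 1))  # 268.4
```

This only covers the 2 * 128MB vs. 268.4MB difference. It does not explain why the total is far below the 1.6GB file size.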

cristiprg / BDAPRO.GlobalStateML