Open Harshit22 opened 4 years ago
@Harshit22 it works for local files and directories, assuming the "local" machine doesn't have HDFS set up. Are you running it from one of the cluster machines with HDFS configured? The primary reason for supporting HDFS was so that, when running sparklens alongside a Spark application, one can save the sparklens JSON file to a known S3 or HDFS location. That is useful if one doesn't have ssh access to the machine running the driver.
The EventHistoryToSparklensJson class treats the input events file argument as a local file or directory. However, the EventHistoryReporter class, which it uses internally, reads the same path as an HDFS file.
This makes both local and HDFS events files unusable with EventHistoryToSparklensJson. The documentation says the input file should be a local path.
To work around this, I had to keep the events file in both the local and HDFS filesystems at identical paths.
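
A toy illustration of why the identical-path workaround helps, assuming the mismatch comes from a scheme-less path being resolved against the cluster's default filesystem (as Hadoop does with `fs.defaultFS`). This is not sparklens code: the `qualify` helper, the `namenode:8020` address, and the example path are all hypothetical.

```python
from urllib.parse import urlparse


def qualify(path: str, default_fs: str) -> str:
    """Toy model: qualify an absolute path against a default filesystem URI.

    A path that already carries a scheme (hdfs://, s3://, file://) is
    returned unchanged; a scheme-less path inherits the default filesystem.
    """
    if urlparse(path).scheme:
        return path
    if not path.startswith("/"):
        raise ValueError("this sketch only handles absolute paths")
    fs = urlparse(default_fs)
    return f"{fs.scheme}://{fs.netloc}{path}"


events = "/tmp/spark-events/app-1"  # hypothetical events file path

# A local-file check sees the path on local disk ...
local_view = qualify(events, "file:///")
# ... while a reader on a cluster node (assumed default FS) sees HDFS.
hdfs_view = qualify(events, "hdfs://namenode:8020")

print(local_view)
print(hdfs_view)
```

Under this assumption the same string names two different files, so both steps succeed only when a copy exists at that path in each filesystem.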
Jar used: https://mvnrepository.com/artifact/qubole/sparklens/0.3.1-s_2.11
Environment: Java 8/Scala 2.11/Spark 2.4.3/AWS EMR