sryza / spark-ts-examples

Spark TS Examples
Apache License 2.0

Updates for Spark 1.6.0 #2

Closed pegli closed 8 years ago

pegli commented 8 years ago

I had some trouble with the JavaStocks sample using Spark 1.6. Initially, the sample app would throw an exception while parsing the dates for the DateTimeIndex:

Exception in thread "main" java.time.format.DateTimeParseException: Text '2015-08-03' could not be parsed at index 10
at java.time.format.DateTimeFormatter.parseResolved0(DateTimeFormatter.java:1949)
at java.time.format.DateTimeFormatter.parse(DateTimeFormatter.java:1851)
....
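For context, an error at index 10 on the text '2015-08-03' is what a date-time parser reports when it reaches the end of a date-only string while still expecting a time part. The thread does not show which parser the sample used, so the following is only a minimal sketch of that failure mode and one way around it, not the change made in this PR. The class name `DateOnlyParsing`, the helper `parseDay`, and the choice of UTC are assumptions made for illustration.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeParseException;

public class DateOnlyParsing {
    // Hypothetical helper: parse a date-only string such as "2015-08-03"
    // and anchor it at the start of that day in the given zone.
    static ZonedDateTime parseDay(String text, ZoneId zone) {
        return LocalDate.parse(text).atStartOfDay(zone);
    }

    public static void main(String[] args) {
        ZoneId zone = ZoneId.of("Z"); // assumption: treat the dates as UTC

        try {
            // A date-time parser expects a time part after the date.
            ZonedDateTime.parse("2015-08-03");
        } catch (DateTimeParseException e) {
            System.out.println("date-time parse failed at index " + e.getErrorIndex());
        }

        // Parsing as a plain date first avoids the problem.
        System.out.println(parseDay("2015-08-03", zone));
    }
}
```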

I grabbed the index creation code from the Scala version to fix that, then ran into a problem with the Comparators that are used to get max and min from dwStats:

Exception in thread "main" org.apache.spark.SparkException: Task not serializable
....
Caused by: java.io.NotSerializableException: com.cloudera.tsexamples.JavaStocks$$Lambda$28/96749807

I added a new class that implements both Comparator and Serializable and used that to sort the stats results. At this point, the demo runs but gives min and max results of (AAL,NaN). The Scala demo gives the correct results. I'll update the PR if I can figure out why the Java version isn't working properly.
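The Comparator-plus-Serializable approach described above can be sketched roughly as follows. This is not the class from the PR: `SerializableComparatorSketch`, `ByValue`, and `canSerialize` are hypothetical names, and the comparator works on plain `Double` values rather than the sample's stats tuples. It only illustrates why a plain lambda fails Java serialization (which is what Spark's "Task not serializable" check hits) while a named class or an intersection-cast lambda passes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Comparator;

public class SerializableComparatorSketch {
    // Hypothetical comparator: a named class that is both a Comparator and Serializable.
    static class ByValue implements Comparator<Double>, Serializable {
        private static final long serialVersionUID = 1L;

        @Override
        public int compare(Double a, Double b) {
            return Double.compare(a, b);
        }
    }

    // Returns true if the object survives Java serialization.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) { // NotSerializableException is an IOException
            return false;
        }
    }

    public static void main(String[] args) {
        // A plain lambda targeting Comparator is not Serializable.
        Comparator<Double> plain = (a, b) -> Double.compare(a, b);
        // An intersection cast makes the lambda itself Serializable.
        Comparator<Double> cast =
            (Comparator<Double> & Serializable) (a, b) -> Double.compare(a, b);

        System.out.println("plain lambda: " + canSerialize(plain));
        System.out.println("cast lambda:  " + canSerialize(cast));
        System.out.println("named class:  " + canSerialize(new ByValue()));
    }
}
```

The named class is the more verbose option but is easy to reuse across several calls; the intersection cast keeps the comparator inline.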

sryza commented 8 years ago

@pegli thanks for the contribution! Left a couple of small questions on the code. From a cursory look, it's not obvious to me why the NaNs are being returned. I can merge it without fixing the NaNs though, as it's still an improvement without that fix.

pegli commented 8 years ago

I fixed the style issues you raised but haven't yet had the time to try to figure out the NaN issue.

sryza commented 8 years ago

Looks good, merging this.