sryza / spark-timeseries

A library for time series analysis on Apache Spark
Apache License 2.0

timeSeriesRDDFromCsv broken #134

Open ramyaragh opened 8 years ago

ramyaragh commented 8 years ago

The timeIndex file is expected to be in the same directory as the CSV files. As a result, timeIndex is read into the RDD along with the CSV data, and parsing then fails because its format does not match that of the CSV files. To fix this, either timeIndex should live outside the CSV directory, or the `rdd = sc.textFile(path).map` step should skip timeIndex.
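To illustrate the second option, here is a minimal sketch in plain Python (no Spark) of reading a directory while skipping timeIndex. It is not the library's code: `parse_series_line`, `read_series`, `make_example_dir`, the `part-00000.csv` file name, and the contents written to timeIndex are all hypothetical, and I'm assuming each CSV line has the form `key,v1,v2,...`.

```python
import os
import tempfile

TIME_INDEX_NAME = "timeIndex"  # file name taken from the issue text


def parse_series_line(line):
    """Parse 'key,v1,v2,...' into (key, [floats]).

    Raises ValueError if any value after the key is not numeric.
    The line layout is an assumption made for this sketch.
    """
    tokens = line.strip().split(",")
    return tokens[0], [float(t) for t in tokens[1:]]


def read_series(directory, skip_time_index=True):
    """Parse every non-empty line of every file in `directory`.

    With skip_time_index=True, the timeIndex file is left out so that
    only the CSV data reaches the parser.
    """
    series = []
    for name in sorted(os.listdir(directory)):
        if skip_time_index and name == TIME_INDEX_NAME:
            continue
        with open(os.path.join(directory, name)) as f:
            for line in f:
                if line.strip():
                    series.append(parse_series_line(line))
    return series


def make_example_dir():
    """Create a temp directory holding one CSV file and a timeIndex file.

    The timeIndex contents below are made up; they only need to differ
    from the CSV layout to reproduce the failure.
    """
    directory = tempfile.mkdtemp()
    with open(os.path.join(directory, "part-00000.csv"), "w") as f:
        f.write("a,1.0,2.0\nb,3.0,4.0\n")
    with open(os.path.join(directory, TIME_INDEX_NAME), "w") as f:
        f.write("uniform,2015-04-09T00:00:00Z,2,days 1\n")
    return directory


if __name__ == "__main__":
    example_dir = make_example_dir()
    print(read_series(example_dir))  # only the CSV rows are parsed
    try:
        read_series(example_dir, skip_time_index=False)
    except ValueError as err:
        print("reading timeIndex as CSV fails:", err)
```

In Spark itself, something similar might be possible by passing a glob pattern to `sc.textFile` instead of the bare directory, but that only works if the data files share a name pattern that timeIndex does not match.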

sryza commented 8 years ago

This looks like a legit bug to me.