thunder-project / thunder

scalable analysis of images and time series
http://thunder-project.org
Apache License 2.0

use more generic binary series loader #362

Open freeman-lab opened 8 years ago

freeman-lab commented 8 years ago

Currently, to read distributed binary series data, we use Spark's sc.binaryRecords method. This depends on Hadoop, and the order in which records come back may differ unexpectedly across systems. We should probably just write a method that maps over the list of file names and reads each file directly, just as we already do for binary images.
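A rough sketch of what that could look like, not a proposed final implementation. The function names (`list_series_files`, `read_records`, `load_series`), the `*.bin` extension, and the fixed-length record layout are all assumptions made for illustration. It uses only the standard library so it runs without Spark; the `mapper` argument stands in for whatever distributes the per-file work.

```python
import glob
import os
from array import array


def list_series_files(path, ext="bin"):
    """Return the binary files under `path` in sorted order.

    Sorting the names ourselves makes record order deterministic,
    independent of the file system or cluster setup.
    """
    return sorted(glob.glob(os.path.join(path, "*." + ext)))


def read_records(filename, record_length, typecode="d"):
    """Read one file directly and split it into fixed-length records.

    `record_length` is the number of values per record; `typecode` is an
    `array` module type code ("d" is a native double, "h" a native short).
    Byte order is assumed to be the machine's native order.
    """
    values = array(typecode)
    with open(filename, "rb") as f:
        values.frombytes(f.read())
    if len(values) % record_length != 0:
        raise ValueError("%s does not hold a whole number of records" % filename)
    return [values[i:i + record_length].tolist()
            for i in range(0, len(values), record_length)]


def load_series(path, record_length, typecode="d", mapper=map):
    """Map over the sorted file list and concatenate the records per file.

    `mapper` defaults to the builtin `map` (hypothetical hook). With Spark,
    the same per-file function could instead be applied with something like
    sc.parallelize(files, len(files)).flatMap(...), which avoids
    sc.binaryRecords and its Hadoop dependency.
    """
    files = list_series_files(path)
    records = []
    for chunk in mapper(lambda f: read_records(f, record_length, typecode), files):
        records.extend(chunk)
    return records
```

Because the file list is sorted before it is mapped, and each file is read sequentially, records come back in file-name order and then in on-disk order within each file.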

cc @jwittenbach