qubole / rubix

Cache File System optimized for columnar formats and object stores
Apache License 2.0
183 stars 74 forks

how to use rubix in spark, jvm apps etc. #240

Closed manishmalhotrawork closed 5 years ago

manishmalhotrawork commented 5 years ago

Example code, test cases, documentation, etc. are missing, which makes it difficult to adopt and test RubiX.

abhishekdas99 commented 5 years ago

You can use RubiX in your Spark applications by configuring the filesystems properly. The following configs need to be set:

spark.hadoop.fs.s3.impl com.qubole.rubix.hadoop2.CachingS3AFileSystem
spark.hadoop.fs.s3n.impl com.qubole.rubix.hadoop2.CachingS3AFileSystem
spark.hadoop.fs.s3a.impl com.qubole.rubix.hadoop2.CachingS3AFileSystem

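
For anyone who wants to pass the settings above on the command line or reuse them in a plain JVM app, here is a minimal sketch. It uses only the Java standard library. The class `RubixSparkConf` and its methods are hypothetical helpers written for this illustration; they are not part of RubiX or Spark. The sketch relies on two things: `spark-submit` accepts `--conf key=value`, and Spark copies any `spark.hadoop.*` property into the Hadoop configuration with the prefix removed. It assumes the RubiX jars are already on the classpath and does not cover any other RubiX setup.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of RubiX or Spark.
class RubixSparkConf {
    // Filesystem class name taken from the configs listed in this thread.
    static final String CACHING_FS = "com.qubole.rubix.hadoop2.CachingS3AFileSystem";
    static final String SPARK_HADOOP_PREFIX = "spark.hadoop.";

    // The three Spark properties from the comment above, in a stable order.
    static Map<String, String> rubixSettings() {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String scheme : new String[] {"s3", "s3n", "s3a"}) {
            settings.put(SPARK_HADOOP_PREFIX + "fs." + scheme + ".impl", CACHING_FS);
        }
        return settings;
    }

    // Render the properties as spark-submit arguments: --conf key=value.
    static List<String> toSparkSubmitArgs(Map<String, String> settings) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : settings.entrySet()) {
            args.add("--conf");
            args.add(e.getKey() + "=" + e.getValue());
        }
        return args;
    }

    // Strip the "spark.hadoop." prefix to get the plain Hadoop keys
    // (fs.s3.impl, fs.s3n.impl, fs.s3a.impl) that a non-Spark JVM app
    // would set on its own Hadoop configuration.
    static Map<String, String> toHadoopSettings(Map<String, String> settings) {
        Map<String, String> hadoop = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : settings.entrySet()) {
            String key = e.getKey();
            if (key.startsWith(SPARK_HADOOP_PREFIX)) {
                hadoop.put(key.substring(SPARK_HADOOP_PREFIX.length()), e.getValue());
            }
        }
        return hadoop;
    }

    public static void main(String[] argv) {
        // <your-app.jar> is a placeholder for your own application jar.
        System.out.println("spark-submit "
                + String.join(" ", toSparkSubmitArgs(rubixSettings()))
                + " <your-app.jar>");
        toHadoopSettings(rubixSettings())
                .forEach((k, v) -> System.out.println(k + " = " + v));
    }
}
```

In a JVM app that uses the Hadoop FileSystem API directly, the keys returned by `toHadoopSettings` are the ones you would set on your Hadoop `Configuration` object before opening an `s3://`, `s3n://`, or `s3a://` path.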