How can I convert my own data into a DataFrame? #232

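Hello @kceil,

The conversion is described in the EC2 getting-started guide: https://github.com/yahoo/CaffeOnSpark/wiki/GetStarted_EC2

It uses the LMDB2DataFrame tool to read an LMDB dataset and write it out as a DataFrame in Parquet format. For MNIST, the commands are: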

pushd ${CAFFE_ON_SPARK}/data

hadoop fs -rm -r -f ${CAFFE_ON_SPARK}/data/mnist_train_dataframe
spark-submit --master ${MASTER_URL} \
    --conf spark.cores.max=${TOTAL_CORES} \
    --conf spark.driver.extraLibraryPath="${LD_LIBRARY_PATH}" \
    --conf spark.executorEnv.LD_LIBRARY_PATH="${LD_LIBRARY_PATH}" \
    --class com.yahoo.ml.caffe.tools.LMDB2DataFrame \
    ${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar \
    -imageRoot file:${CAFFE_ON_SPARK}/data/mnist_train_lmdb \
    -lmdb_partitions ${TOTAL_CORES} \
    -outputFormat parquet \
    -output ${CAFFE_ON_SPARK}/data/mnist_train_dataframe

hadoop fs -rm -r -f ${CAFFE_ON_SPARK}/data/mnist_test_dataframe
spark-submit --master ${MASTER_URL} \
    --conf spark.cores.max=${TOTAL_CORES} \
    --conf spark.driver.extraLibraryPath="${LD_LIBRARY_PATH}" \
    --conf spark.executorEnv.LD_LIBRARY_PATH="${LD_LIBRARY_PATH}" \
    --class com.yahoo.ml.caffe.tools.LMDB2DataFrame \
    ${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar \
    -imageRoot file:${CAFFE_ON_SPARK}/data/mnist_test_lmdb \
    -lmdb_partitions ${TOTAL_CORES} \
    -outputFormat parquet \
    -output ${CAFFE_ON_SPARK}/data/mnist_test_dataframe

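To convert another dataset such as CIFAR10, replace the MNIST paths: point -imageRoot at your own LMDB directory and -output at the location where the DataFrame should be written.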
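As a rough sketch, the substitution can be wrapped in a small shell function. The names convert_lmdb and DRY_RUN are hypothetical and not part of CaffeOnSpark. The sketch assumes your data is already in LMDB format under ${CAFFE_ON_SPARK}/data/<name>_lmdb, and that CAFFE_ON_SPARK, MASTER_URL, TOTAL_CORES and LD_LIBRARY_PATH are set as in the guide.

```shell
# Hypothetical helper (not shipped with CaffeOnSpark): converts
# ${CAFFE_ON_SPARK}/data/<name>_lmdb into <name>_dataframe (Parquet).
# Assumes CAFFE_ON_SPARK, MASTER_URL, TOTAL_CORES and LD_LIBRARY_PATH are set.
convert_lmdb() {
    name="$1"                                  # dataset prefix, e.g. cifar10_train
    data_dir="${CAFFE_ON_SPARK}/data"

    # Same spark-submit invocation as for MNIST, with the dataset name substituted.
    set -- spark-submit --master "${MASTER_URL}" \
        --conf spark.cores.max="${TOTAL_CORES}" \
        --conf spark.driver.extraLibraryPath="${LD_LIBRARY_PATH}" \
        --conf spark.executorEnv.LD_LIBRARY_PATH="${LD_LIBRARY_PATH}" \
        --class com.yahoo.ml.caffe.tools.LMDB2DataFrame \
        "${CAFFE_ON_SPARK}/caffe-grid/target/caffe-grid-0.1-SNAPSHOT-jar-with-dependencies.jar" \
        -imageRoot "file:${data_dir}/${name}_lmdb" \
        -lmdb_partitions "${TOTAL_CORES}" \
        -outputFormat parquet \
        -output "${data_dir}/${name}_dataframe"

    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"                              # only print the command
    else
        # Remove any previous output, then run the conversion.
        hadoop fs -rm -r -f "${data_dir}/${name}_dataframe"
        "$@"
    fi
}
```

Running `DRY_RUN=1 convert_lmdb cifar10_train` prints the command so you can check the paths first; without DRY_RUN it deletes any earlier output and runs the conversion.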

Thanks, Arun
