yahoo / CaffeOnSpark

Distributed deep learning on Hadoop and Spark clusters.
Apache License 2.0
1.27k stars, 358 forks

DataLayer use data_param instead of memory_data_param #294

Open ooyanglinoo opened 6 years ago

ooyanglinoo commented 6 years ago

Can I use data_param instead of memory_data_param to define the data source in a DataLayer? If not, how can I use images of different sizes as the training dataset?

junshi15 commented 6 years ago

You can only use memory_data_param or cos_data_param to specify the input layers.

The data layers have a simple resize function, but I don't think it handles encoded images (say, JPEG) with variable sizes.

I would preprocess the images (resize, rotate, etc.) before feeding them to Caffe.
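
As a rough illustration of that kind of preprocessing, here is a minimal sketch that resizes every image to one fixed size with nearest-neighbour sampling. It is not CaffeOnSpark code: the function name `resize_nearest` and the list-of-rows image representation are assumptions made to keep the example self-contained, and in practice an image library would normally do this step.

```python
def resize_nearest(pixels, new_width, new_height):
    """Resize a row-major image (a list of pixel rows) to a fixed size.

    Hypothetical helper for illustration only; uses nearest-neighbour sampling.
    """
    old_height = len(pixels)
    old_width = len(pixels[0])
    resized = []
    for y in range(new_height):
        # Map the output row back to the closest source row.
        src_y = min(old_height - 1, y * old_height // new_height)
        row = []
        for x in range(new_width):
            # Map the output column back to the closest source column.
            src_x = min(old_width - 1, x * old_width // new_width)
            row.append(pixels[src_y][src_x])
        resized.append(row)
    return resized


# A 2x2 image scaled up to 4x4: each source pixel becomes a 2x2 block.
image = [[1, 2],
         [3, 4]]
for row in resize_nearest(image, 4, 4):
    print(row)
```

Running this once over the whole dataset before training would give every image the same dimensions, which is what the fixed-size input layers expect.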

ooyanglinoo commented 6 years ago

I want to train an object detection model with CaffeOnSpark. The training images vary in size and may not be easy to resize, because the annotations on the images are based on the original image size.
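
One possible way to handle annotations tied to the original size, which is not discussed in this thread, is to rescale them by the same factors as the image. The sketch below assumes boxes are stored as `(xmin, ymin, xmax, ymax)` in pixels; that format and the helper name `scale_boxes` are assumptions for illustration, not part of CaffeOnSpark.

```python
def scale_boxes(boxes, orig_size, new_size):
    """Rescale bounding boxes to match a resized image.

    Hypothetical helper: boxes are (xmin, ymin, xmax, ymax) tuples in pixels,
    and orig_size / new_size are (width, height) tuples.
    """
    orig_width, orig_height = orig_size
    new_width, new_height = new_size
    scale_x = new_width / orig_width    # horizontal scale factor
    scale_y = new_height / orig_height  # vertical scale factor
    return [
        (xmin * scale_x, ymin * scale_y, xmax * scale_x, ymax * scale_y)
        for (xmin, ymin, xmax, ymax) in boxes
    ]


# A 640x480 image resized to 320x240: every coordinate is halved.
boxes = [(100, 50, 300, 200)]
print(scale_boxes(boxes, (640, 480), (320, 240)))
# prints [(50.0, 25.0, 150.0, 100.0)]
```

If the resize does not preserve the aspect ratio, the two factors differ and the boxes are stretched along with the image, so the labels stay aligned with the objects.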