jocicmarko / kaggle-dsb2-keras

Keras tutorial for Kaggle 2nd Annual Data Science Bowl

Preprocessing speed is fast #2

Open tengpeng opened 8 years ago

tengpeng commented 8 years ago

I am not sure whether or not it is suitable to open a new issue to discuss the speed of Preprocessing.

I noticed that your preprocessing is very fast. It runs around 10x faster than my other preprocessing code on the same data. That's impressive. I am curious how you achieve that. Is there anything you avoid doing in your functions?

By the preprocessing stage I mean what happens in data.py and train.py.

jocicmarko commented 8 years ago

Since you didn't mention which other pre-processing techniques you currently use, I can't really say why my code outperforms yours :). Basically I use scipy and scikit-image libraries, and I guess it can perform even faster with OpenCV, but I can't offer any official benchmark for that claim.

tengpeng commented 8 years ago

https://github.com/dmlc/mxnet/blob/master/example/kaggle-ndsb2/Preprocessing.py

I guess the function get_data might be what causes the performance issue, but I am not sure about that. Maybe I need to add a timer to each function to test it. : )
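For what it's worth, a per-function timer could look something like the sketch below. This is only a rough illustration: `load_images` and `resize_images` are hypothetical stand-ins, not functions from either repository, and you would put the `@timed` decorator on your own functions (e.g. `get_data`) instead.

```python
import time
from collections import defaultdict
from functools import wraps

# Accumulated wall-clock seconds and call counts, keyed by function name.
timings = defaultdict(float)
calls = defaultdict(int)


def timed(func):
    """Decorator that records how long each call to func takes."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so failed calls are still counted.
            timings[func.__name__] += time.perf_counter() - start
            calls[func.__name__] += 1
    return wrapper


# Hypothetical stand-ins for real preprocessing steps (assumptions, not
# code from data.py or Preprocessing.py).
@timed
def load_images(n):
    return [[float(i)] * 64 for i in range(n)]


@timed
def resize_images(images):
    # Keep every second value, as a placeholder for a real resize.
    return [row[::2] for row in images]


def report():
    """Print the slowest functions first."""
    for name, total in sorted(timings.items(), key=lambda kv: -kv[1]):
        print(f"{name}: {total:.4f}s over {calls[name]} call(s)")


if __name__ == "__main__":
    resize_images(load_images(1000))
    report()
```

If decorating every function is too much work, running the whole script under Python's built-in profiler (`python -m cProfile -s cumtime your_script.py`) should give a similar per-function breakdown without code changes.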