amplab / keystone

Simplifying robust end-to-end machine learning on Apache Spark.
http://keystone-ml.org/
Apache License 2.0

Image Scaling #27

Open etrain opened 9 years ago

shivaram commented 9 years ago

So there are a couple of options here: do we want to use Bruckner's code, or do we want to try to integrate with JMagick?

etrain commented 9 years ago

IIRC there was a reason you went with JMagick over shell scripts when running experiments. Wasn't it non-trivial to integrate with for some reason?

shivaram commented 9 years ago

Well, it's another JNI library like OpenCV, so it has its own .so, .dylib files etc. that we need to carry around. The shell scripts worked fine for experiments, but it's not a very clean thing to do. I wish there was a Maven for JNI libraries where you could look up your arch and OS and get the right shared library.
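To make the "look up your arch and OS" idea concrete, here is a minimal sketch of that kind of lookup. Everything in it is an assumption for illustration: the `native_library_path` helper, the `native/<os>-<arch>/` directory layout, and the library name `imagelib` are hypothetical and do not come from JMagick, OpenCV, or KeystoneML.

```python
import platform
from pathlib import PurePosixPath

# Shared-library file naming conventions per operating system.
_LIB_PATTERNS = {
    "linux": "lib{name}.so",
    "darwin": "lib{name}.dylib",
    "windows": "{name}.dll",
}

# Fold common aliases into a single architecture name.
_ARCH_ALIASES = {"amd64": "x86_64", "x64": "x86_64", "arm64": "aarch64"}


def native_library_path(name, os_name=None, arch=None, root="native"):
    """Return the relative path of a bundled shared library.

    The layout root/<os>-<arch>/<file> is an assumption made for this sketch.
    When os_name or arch is omitted, the current platform is used.
    """
    os_key = (os_name or platform.system()).lower()
    arch_key = (arch or platform.machine()).lower()
    arch_key = _ARCH_ALIASES.get(arch_key, arch_key)
    if os_key not in _LIB_PATTERNS:
        raise ValueError(f"unsupported operating system: {os_key}")
    filename = _LIB_PATTERNS[os_key].format(name=name)
    return str(PurePosixPath(root) / f"{os_key}-{arch_key}" / filename)
```

Calling `native_library_path("imagelib", "Darwin", "arm64")` would resolve to `native/darwin-aarch64/libimagelib.dylib` under this assumed layout. The library files themselves would still have to be built and shipped for each platform, which is the part that makes this messy.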

etrain commented 9 years ago

Hmm... it looks like you can maybe pull this off with per-architecture dependencies (https://nhachicha.wordpress.com/2014/05/27/android-gradle-add-native-so-dependencies/) - but this is kind of messy.

Do we have an idea how much the pure Scala code hurts classification error on ImageNet 2012?

shivaram commented 9 years ago

I don't have the latest numbers, but it was 1-2% in the runs from 2-3 weeks back. My problem is not about the benchmark per se (as we have pre-scaled images saved out for that), but that if we put out an ImageScaler node, it should be of reasonable quality if somebody tries to use it.

Also, I just noticed that JMagick is LGPL, so it may be tricky to depend on it anyway.

etrain commented 9 years ago

Sure - I think the Scala code is of reasonable quality. We may want to do a spherical blur instead of a box one, and we definitely want to correctly handle the cases where the image is small (which we currently don't).

I doubt that JMagick is doing any magic (so to speak), and unless integrating with it provides a huge win, I guess I'd rather not?
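As a rough illustration of the box-filter approach and the small-image case mentioned above, here is a minimal sketch. It is not the project's Scala code: the `box_downscale` function and its list-of-rows grayscale image representation are assumptions made only for this example.

```python
def box_downscale(image, factor):
    """Downscale a grayscale image (list of equal-length rows) by an integer factor.

    Each output pixel is the mean of the source pixels in its factor x factor
    block. Blocks at the right and bottom edges are clipped to the image, so an
    image smaller than the factor still yields a 1x1 result instead of failing.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    height = len(image)
    width = len(image[0]) if height else 0
    if height == 0 or width == 0:
        return []
    # Ceiling division so partial edge blocks are kept rather than dropped.
    out_h = -(-height // factor)
    out_w = -(-width // factor)
    result = []
    for oy in range(out_h):
        y0, y1 = oy * factor, min((oy + 1) * factor, height)
        row = []
        for ox in range(out_w):
            x0, x1 = ox * factor, min((ox + 1) * factor, width)
            total = sum(image[y][x] for y in range(y0, y1) for x in range(x0, x1))
            # Divide by the actual block area, which shrinks at the edges.
            row.append(total / ((y1 - y0) * (x1 - x0)))
        result.append(row)
    return result
```

Dividing by the clipped block area rather than `factor * factor` is the design choice that keeps edge pixels and tiny images from being darkened or dropped.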

shivaram commented 9 years ago

So I just saw our earlier experiment results, and running LCS with 1 scale and SIFT with 4 scales (using Gaussian blur) only reduces accuracy by 1% or so. So I'm thinking of removing this from the 0.1 release and investigating more options later.
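For reference, a Gaussian blur of the kind mentioned above can be sketched as a normalized 1-D kernel applied along each row (and then each column, since the filter is separable). This is only an illustrative sketch: the function names, the 3-sigma default radius, and the clamp-to-edge border handling are assumptions, not the settings used in the experiments.

```python
import math


def gaussian_kernel(sigma, radius=None):
    """Return normalized 1-D Gaussian weights of length 2 * radius + 1."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if radius is None:
        # Assumed default: cover roughly three standard deviations.
        radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_row(row, kernel):
    """Convolve one row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    last = len(row) - 1
    out = []
    for x in range(len(row)):
        acc = 0.0
        for k, w in enumerate(kernel):
            # Clamp to the nearest valid index so short rows still work.
            src = min(max(x + k - radius, 0), last)
            acc += w * row[src]
        out.append(acc)
    return out
```

Blurring with this kernel before subsampling gives a smoother low-pass filter than the box average, at the cost of a wider support per output pixel.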

etrain commented 9 years ago

Sounds good to me!

On May 15, 2015, at 12:39 AM, Shivaram Venkataraman <notifications@github.com> wrote:

So I just saw our earlier expt results and running LCS with 1 scale and SIFT with 4 scales (using gaussian blur) only reduces accuracy by 1% or so. So I'm thinking of removing this from the 0.1 release and investigate more options later
