nextml / NEXT

NEXT is a machine learning system that runs in the cloud and makes it easy to develop, evaluate, and apply active learning in the real world. Ask better questions. Get better results. Faster. Automated.
http://nextml.org
Apache License 2.0

Running NEXT locally doesn't show images #143

Closed stsievert closed 8 years ago

stsievert commented 8 years ago

When I run NEXT locally and run the commands in the documentation, it all works (which is great!). However, it doesn't display the images when it's running.

I think it'd be useful to refactor launch_experiment.py so that it can run all the examples in the examples/ directory as well.

Example:

[Screenshot: screen shot 2016-09-19 at 4 38 45 pm]

dconathan commented 8 years ago

A workaround I implemented at amfam was to encode/serialize the images as a string and then have the querypage decode it. That was a bit hacky though, and we should probably have the local implementation run as similarly to the AWS one as possible...

stsievert commented 8 years ago

Yeah, NEXT should be agnostic to how you launch the experiment. @dconathan do you want this issue of refactoring the launch_experiment scripts?

lalitkumarj commented 8 years ago

Just to quickly chime in here....you have to host the image here. Just saying "NEXT should be agnostic" is not a particularly good comment. This should be discussed, it's kind of a major decision.

For speed reasons, the images should most certainly be hosted on the cloud. For demo purposes, if people are running locally, I like @dconathan's solution, as long as we remind people that this could grow their containers (the mongodb one) quickly if they have lots of images. We might even want to explicitly enforce that images have to be under a certain size.

How long does deserializing an image take? We don't really want this to be a bottleneck.

Lalit
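A minimal sketch of the size check suggested above, assuming a hypothetical 1 MiB cap (`MAX_IMAGE_BYTES`) and hypothetical helper names; none of this is NEXT code. It also shows why stored strings outgrow the files: base64 emits 4 characters for every 3 input bytes.

```python
import base64
import os

# Hypothetical cap for illustration only; not an actual NEXT setting.
MAX_IMAGE_BYTES = 1 * 1024 * 1024


def encoded_size(raw_size):
    """Length of the base64 text for raw_size bytes (4 chars per 3 bytes, padded)."""
    return 4 * ((raw_size + 2) // 3)


def encode_if_small(path, limit=MAX_IMAGE_BYTES):
    """Return the file at path as a base64 string, or raise if it exceeds limit."""
    size = os.path.getsize(path)
    if size > limit:
        raise ValueError("%s is %d bytes; limit is %d" % (path, size, limit))
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")
```

With a check like this, an oversized image fails at launch time rather than silently inflating the database.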


daniel3735928559 commented 8 years ago

Part of the problem is that "running locally" here really just means doing anything other than running on EC2 using our EC2 setup scripts.

The use-case we were thinking about that led to this discussion (which is one that I understand we have decided, for better or for worse, to attempt to support) is someone trying out NEXT by running it on their laptop to get an idea of how it works, or maybe even testing their own application code.

Possibly @dconathan can speak to any problems with the following: For that user I don't see an issue with having launch_experiment.py have NEXT_BACKEND_GLOBAL_HOST default to localhost and, in that case, instead of having it upload targets to S3, just having it run a SimpleHTTPServer (on the host machine, not in another container, and maybe with a big yellow "WARNING: You are running using the default file server; this should never be done in production").

Of course, for the situation where "localhost" is really a massive cluster and not someone's MacBook, this will be bad. We could take the view that "launch_experiment.py is advisory only and you should really write your own stuff to launch your own experiments on your own cluster", or we could add flags to toggle this behaviour.
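A rough sketch of the default-file-server idea, assuming Python 3.7+, where `http.server` replaces the Python 2 `SimpleHTTPServer` module. The `serve_directory` helper, its defaults, and the warning text are hypothetical and not part of launch_experiment.py.

```python
import functools
import http.server
import sys
import threading


def serve_directory(directory, port=8000, host="127.0.0.1"):
    """Serve the files under directory over HTTP from a background thread.

    Returns the server object; call shutdown() on it to stop serving.
    """
    # SimpleHTTPRequestHandler accepts a directory argument since Python 3.7.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer((host, port), handler)
    sys.stderr.write(
        "WARNING: You are running using the default file server; "
        "this should never be done in production\n"
    )
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

A target's URL would then look like `http://127.0.0.1:8000/i0022.png` rather than an S3 address. Passing `port=0` lets the OS pick a free port, which can be read back from `server.server_address`.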

stsievert commented 8 years ago

We don't have to upload the images; they already live on localhost. We can just change the target's primary_url to be some relative path (or copy the image to a place where we can use a relative path).

This works for demo purposes, which is Rob's main focus for local/. We can add a note in the docs about speed issues and hosting the images (and then use an optional parameter to switch between the two behaviors).

daniel3735928559 commented 8 years ago

If you mean that they live in the container that is running the server and we can just access them by using NEXT_BACKEND_GLOBAL_HOST:8000/path/to/image.jpg, then since the server is actually a Flask server, this would require that we add a special static route to the NEXT api blueprint for the case where it's being run locally (which seems like something we should not do), or else copy the images to some folder that is currently being used as a static route for some other reason, e.g. /next/query_page/static. This is possible, but it would have to be done in the container rather than on the host (to avoid cluttering up their repo), at which point there has to be code in /next (as opposed to just in launch_experiment.py) to do this copying, which I think I would also not prefer.

If you mean that since the images live on the host computer, we can just do <img src="file:///path/to/image.jpg" />, then it turns out (I was unsure, so I tested just now) that at least Chromium and FF do not actually allow this (Chromium, for example, whinges about "Not allowed to load local resource").

stsievert commented 8 years ago

In the example that's in the repo right now, I was thinking of having <img src="relative/path/to/image.png">. The relative path from query_page.html will probably have to be used, probably ../../../local/data/strangefruit30/i0022.png.

I tried to test on my machine but ran into an issue with the .manifest file defining targets.

dconathan commented 8 years ago

Hmmm, the uploading/hosting thing sounds messy. In my experience, I've never noticed an extra delay in decoding the strings as images. I have some static IPython notebooks where the matplotlib plots are stored this way... a whole page of 10+ plots loads pretty much instantly.

The only thing that'd be kind of annoying is the query_page template, because it would have to know whether you're grabbing images that are hosted or just embedding encoded images... would we have a "running local"/"decode strings as images" flag, or just have the query_page autodetect if the string is in the right format...?

@lalitkumarj not sure if this is what you mean, but encoding/decoding in Python is pretty fast at least.

In [36]: import base64
In [37]: def encode_decode(image_file):  
    ...:     with open(image_file, 'rb') as f:
    ...:         encoded = base64.b64encode(f.read())
    ...:     decoded = base64.decodebytes(encoded)
    ...:     return decoded
    ...: 
In [38]: %timeit encode_decode('cat.jpg')
1000 loops, best of 3: 979 µs per loop
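To make the embed-and-autodetect idea concrete, here is a small sketch assuming the encoded image is stored as a `data:` URI, which browsers accept directly in `<img src="...">`. The helper names `to_data_uri` and `is_embedded` are hypothetical, and the thread does not say which string format the amfam workaround actually used.

```python
import base64
import mimetypes


def to_data_uri(image_file):
    """Encode the file as a data: URI usable as the src of an <img> tag."""
    mime, _ = mimetypes.guess_type(image_file)
    if mime is None:
        # Fallback for unknown extensions.
        mime = "application/octet-stream"
    with open(image_file, "rb") as f:
        payload = base64.b64encode(f.read()).decode("ascii")
    return "data:%s;base64,%s" % (mime, payload)


def is_embedded(primary_url):
    """Return True if the target string is an embedded image, not a hosted URL."""
    return primary_url.startswith("data:")
```

Because a data URI and a hosted URL can both go straight into the `src` attribute, the query page template might not need a separate "running local" flag at all; `is_embedded` would only matter for logging or size checks.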