Yelp / MOE

A global, black box optimization engine for real world metric optimization.

Persistent Storage of Experiments #435

Open cancan101 opened 9 years ago

cancan101 commented 9 years ago

If I am using the Dockerized version of MOE, will the results of experiments be persisted between container runs, or is the data kept entirely in memory?

suntzu86 commented 9 years ago

Hi @cancan101, MOE doesn't really persist anything for you. The REST interface is stateless, so if you're using that, you have to save results yourself.

If you've connected to the Docker instance and are interacting with MOE directly, you can save state. The library's state lives in Python objects, so it sticks around for as long as you hold onto those objects. You could save them if you want (pickle, JSON, etc.). Or if you're working in something like IPython (or a notebook), your work will persist if you pause/unpause the Docker container.

But most likely the answer to your question is "no persistence."
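
As a rough illustration of the "save those if you want" option above, here is a minimal sketch that writes sampled points to a JSON file and reads them back. The `SamplePoint` tuple below is a local stand-in rather than MOE's own class, and its field names, the helper names `save_points` and `load_points`, and the file layout are all assumptions made for this example.

```python
import json
import os
import tempfile
from collections import namedtuple

# Local stand-in for a sampled point; the field names are an assumption
# for this sketch and are not imported from MOE.
SamplePoint = namedtuple("SamplePoint", ["point", "value", "noise_variance"])


def save_points(path, points):
    """Write sampled points to a JSON file, replacing any previous contents."""
    records = [
        {"point": list(p.point), "value": p.value, "noise_variance": p.noise_variance}
        for p in points
    ]
    # Write to a temporary file first, then swap it into place, so an
    # interrupted write cannot leave a half-written history file behind.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(records, handle)
    os.replace(tmp_path, path)


def load_points(path):
    """Read sampled points back; return an empty list if nothing was saved yet."""
    if not os.path.exists(path):
        return []
    with open(path) as handle:
        records = json.load(handle)
    return [SamplePoint(r["point"], r["value"], r["noise_variance"]) for r in records]
```

Points reloaded this way could then be fed back into a fresh experiment object at startup, so the history survives a container or process restart.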

cancan101 commented 9 years ago

To use the python tutorial as an example, let's say I have code that looks like:

exp = Experiment([[0, 2], [0, 4]])

and then I have a number of separate processes (perhaps running on other machines) that all want to run:

next_point_to_sample = gp_next_points(exp)[0]
value_of_next_point = function_to_minimize(next_point_to_sample)
exp.historical_data.append_sample_points([SamplePoint(next_point_to_sample, value_of_next_point, 0.01)]) 

How would I do this?

suntzu86 commented 9 years ago

So in that example, gp_next_points is part of our example "simple_endpoint" library. simple_endpoint just produces some wrappers around the REST interface (so you don't have to construct the queries yourself).

If you wanted to use this example, you'd need to pass in the appropriate "rest_host" and "rest_port" (whatever Docker tells you), and the outputs come back to your Python process. Those outputs don't live in Docker, and you can save/use them however you want.

cancan101 commented 9 years ago

Is the entirety of the state kept on the client side here: https://github.com/Yelp/MOE/blob/c802816b180e60ae732239d10f3e7f99ffb078cf/moe/easy_interface/experiment.py#L30 ?

suntzu86 commented 9 years ago

In easy_interface, all the experiment data is grouped inside the "Experiment" object. If you're relying on that to track all of your data, then you need to save it somewhere.

But this is completely orthogonal to Docker.

edit: MOE also has hyperparameters and other components that you can pass in via the REST interface. easy_interface is really just meant as an example.

cancan101 commented 9 years ago

Sure, I agree totally; orthogonal to Docker. I did not realize that all of the state was kept client side.

suntzu86 commented 9 years ago

Cool. Feel free to update our docs if we did a bad job explaining that :/

But yeah, at Yelp, all the experiment data existed in other data stores. We had another layer for doing ETL (shared with other experiment stuff) and calling MOE, so we didn't bake any experiment state persistence into MOE.
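
For the multi-process case asked about earlier in the thread, one way to picture "experiment data lives in another data store" is the sketch below. It uses SQLite purely as a stand-in for that store. The table layout and the names `open_store`, `load_history`, `record_sample`, and `run_one_step` are hypothetical, and `suggest_next_point` is a placeholder callable: in a real setup it would rebuild the experiment from the history and ask MOE for the next point.

```python
import json
import sqlite3


def open_store(path):
    """Open (and if needed create) the shared sample store."""
    conn = sqlite3.connect(path, timeout=30)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "point TEXT NOT NULL, "
        "value REAL NOT NULL, "
        "noise_variance REAL NOT NULL)"
    )
    conn.commit()
    return conn


def load_history(conn):
    """Return every recorded sample as (point, value, noise_variance), oldest first."""
    rows = conn.execute(
        "SELECT point, value, noise_variance FROM samples ORDER BY id"
    )
    return [(json.loads(point), value, noise) for point, value, noise in rows]


def record_sample(conn, point, value, noise_variance):
    """Append one sample; the 'with' block commits the insert as one transaction."""
    with conn:
        conn.execute(
            "INSERT INTO samples (point, value, noise_variance) VALUES (?, ?, ?)",
            (json.dumps(list(point)), value, noise_variance),
        )


def run_one_step(conn, suggest_next_point, objective, noise_variance=0.01):
    """One worker iteration: read shared history, pick a point, evaluate, record."""
    history = load_history(conn)
    point = suggest_next_point(history)  # hypothetical hook; would call MOE here
    value = objective(point)
    record_sample(conn, point, value, noise_variance)
    return point, value
```

Each worker would call `run_one_step` in a loop against the same store. Note that in this sketch nothing stops two workers that read the same history from being handed the same point; avoiding that would need extra coordination that is left out here.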