amplab / velox-modelserver

http://amplab.github.io/velox-modelserver/
Apache License 2.0
110 stars 26 forks source link

Allows saving & loading of observations to a path writable by Spark #70

Closed tomerk closed 9 years ago

tomerk commented 9 years ago

Will close issue #52 Still need to test this, but @dcrankshaw feel free to start reviewing after the keystone integration pull request

dcrankshaw commented 9 years ago

@tomerk Did you have a chance to test this yet?

tomerk commented 9 years ago

Ah not yet I needed the keystone integration to be merged because I was having a weird local etcd issue I couldn't figure out. Will test in a bit

tomerk commented 9 years ago

Okay @dcrankshaw , this now works. Just a note, when https://github.com/amplab/keystone/pull/119#issuecomment-102917144 gets merged velox won't build correctly until I update the newsgroups code, so we may have to be careful on that front (depending on how you do the demo). Or, we could checkout the keystone repo to a specific commit

dcrankshaw commented 9 years ago

Great thanks. Evan did some shuffling around of the package structure in Keystone that broke some stuff, so I figured there might be some issues. I have version of Velox and Keystone that work together right now, and I'll just use that for the demo.

I'll do a code review tomorrow morning.

tomerk commented 9 years ago

Just be careful because if that pull request I linked gets merged before the demo and you're not checking out a specific version of Keystone before building it'll break.

On Sun, May 17, 2015 at 10:32 PM, Dan Crankshaw notifications@github.com wrote:

Great thanks. Evan did some shuffling around of the package structure in Keystone that broke some stuff, so I figured there might be some issues. I have version of Velox and Keystone that work together right now, and I'll just use that for the demo.

I'll do a code review tomorrow morning.

— Reply to this email directly or view it on GitHub https://github.com/amplab/velox-modelserver/pull/70#issuecomment-102927027 .

Tomer Kaftan

etrain commented 9 years ago

+1 for picking a commit hash that works for you!

On Sun, May 17, 2015 at 10:34 PM, Tomer Kaftan notifications@github.com wrote:

Just be careful because if that pull request I linked gets merged before the demo and you're not checking out a specific version of Keystone before building it'll break.

On Sun, May 17, 2015 at 10:32 PM, Dan Crankshaw notifications@github.com wrote:

Great thanks. Evan did some shuffling around of the package structure in Keystone that broke some stuff, so I figured there might be some issues. I have version of Velox and Keystone that work together right now, and I'll just use that for the demo.

I'll do a code review tomorrow morning.

— Reply to this email directly or view it on GitHub < https://github.com/amplab/velox-modelserver/pull/70#issuecomment-102927027

.

Tomer Kaftan

— Reply to this email directly or view it on GitHub https://github.com/amplab/velox-modelserver/pull/70#issuecomment-102927124 .

tomerk commented 9 years ago

Okay @dcrankshaw that pull request got merged into keystone so you should make sure you're still building against a good keystone commit. In the meantime I've updated this pull request to velox to work w/ the newest keystone.

dcrankshaw commented 9 years ago

Can you add support for bulk loading from a CSV file? Once you've done that, can you add a note about loading and saving observations in the Deployment Guide?

Other than that, this looks good.

tomerk commented 9 years ago

Reading and writing a CSV file is a little iffy because we're dealing with templatized types. I'd have to either start using implicits or introduce a "contextToString" or "contextFromString" into the models. Or I may be able to work something out w/ spark SQL data frames but no guarantees

dcrankshaw commented 9 years ago

Okay fair enough. What if it's a JSON list of observations? I just want to be able to bulk load observations that Velox has never seen before.

tomerk commented 9 years ago

I previously tried that too but ran into issues because the json reader isn't serializable and I wasn't sure how to make a new one

— Best Wishes, Tomer Kaftan

On Mon, May 18, 2015 at 3:29 PM, Dan Crankshaw notifications@github.com wrote:

Okay fair enough. What if it's a JSON list of observations? I just want to be able to bulk load observations that Velox has never seen before.

Reply to this email directly or view it on GitHub: https://github.com/amplab/velox-modelserver/pull/70#issuecomment-103234746

dcrankshaw commented 9 years ago

Crap. Okay. I guess for now it's fine to have the client do it on their side and just issue an observe request for each user. Let's leave #52 open though.

Can you add an example to the docs of using load and save?