ucrcsedept / galah

An automated grading system geared towards processing computer programming assignments.
Apache License 2.0

Create bare-bones, testing vmfactory. #402

Closed itsjohncs closed 10 years ago

itsjohncs commented 10 years ago

This is a smaller stepping stone towards completing #398. The goal is to create a simple vmfactory that only queries Redis for dirty VMs and creates new VMs. Basically I want to implement the pseudo-code in the spec and the core functions it needs.

itsjohncs commented 10 years ago

When I resume this I should get the core module to some half-working state so that the vmfactory can hit it appropriately. I'm imagining core.Connection holding instances of the various backend-specific classes and proxying commands to them. I don't yet have a good idea of how to make it capable of swapping out different backends, and it's probably not worth thinking about too hard. Just make sure the interface won't break too badly if we later add the ability to swap backends.

I should then figure out how to test the code in a meaningful way. My guess is a simple testing daemon that shoves stuff into Redis and sees if the vmfactory reacts appropriately, possibly with kill -9 commands involved.

itsjohncs commented 10 years ago

Redis should be structured thusly:

itsjohncs commented 10 years ago

Today wasn't as fruitful as I'd like as I was busy with other duties for part of the day, but I was able to make more progress on the core and I also added a patch to MangoEngine to add a feature I wanted to use. Should be able to do some meaningful testing tomorrow which is fun :D.

When I come in tomorrow I should resume work on the redisconnection.py module (which should probably be renamed, because the name suggests a module that handles retrying after disconnections or something like that). I should be able to start poking at it and testing it against a Redis server on my vagrant box pretty soon after starting. Then I just need to start testing the actual vmfactory executable.

itsjohncs commented 10 years ago

Made the first Redis transactions before I went off to teach a class a couple hours ago. Pretty exciting :grinning:. I'm looking briefly for a tool right now that will let me see the entire state of a Redis server so I can visualize what I'm doing more easily.

itsjohncs commented 10 years ago

Using Redis's MONITOR command I can get a very good idea of what's happening on the Redis server which is plenty fine for now.

itsjohncs commented 10 years ago

Pseudo-code for the grab core function:

# poll every few seconds (set by hint with sensible default) to see if
# the number of clean vms is too low or there's any dirty vms

# pop a vm off (machine_id)_dirty_vms
# set vmfactory_nodes[id].currently_destroying to id of popped vm
# return vm id

# increment (machine_id)_num_clean_vms
# create new vm in (machine_id)_vms marked as clean but being created
# set vmfactory_nodes[id].currently_creating to the new VM id
# return vm id
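
A minimal Python sketch of the branching the pseudo-code above seems to describe, using a plain in-memory class as a stand-in for Redis. The key names follow the pseudo-code. `FakeStore`, `MIN_CLEAN_VMS`, the `vm-N` ID format, the return shape, and the choice to handle dirty VMs before creating new ones are all assumptions made for illustration. A real version would also need each branch to run atomically in Redis.

```python
import itertools

MIN_CLEAN_VMS = 2  # hypothetical threshold; the issue does not give a value


class FakeStore:
    """In-memory stand-in for the Redis keys named in the pseudo-code."""

    def __init__(self):
        self.dirty_vms = []        # (machine_id)_dirty_vms
        self.num_clean_vms = 0     # (machine_id)_num_clean_vms
        self.vms = {}              # (machine_id)_vms
        self.vmfactory_nodes = {}  # vmfactory_nodes[id]
        self._ids = itertools.count(1)

    def new_vm_id(self):
        # Hypothetical ID scheme, only for this sketch.
        return "vm-%d" % next(self._ids)


def vmfactory_grab(store, node_id):
    """Make one polling pass; return (action, vm_id) or None if no work."""
    node = store.vmfactory_nodes.setdefault(
        node_id, {"currently_destroying": None, "currently_creating": None})

    # Assumed priority: dirty VMs are handled before new ones are created.
    if store.dirty_vms:
        vm_id = store.dirty_vms.pop(0)
        node["currently_destroying"] = vm_id
        return ("destroy", vm_id)

    # Otherwise top up the pool of clean VMs if it is too low.
    if store.num_clean_vms < MIN_CLEAN_VMS:
        store.num_clean_vms += 1
        vm_id = store.new_vm_id()
        store.vms[vm_id] = {"state": "clean", "being_created": True}
        node["currently_creating"] = vm_id
        return ("create", vm_id)

    return None
```

The caller would invoke this every few seconds and sleep whenever it returns `None`.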

itsjohncs commented 10 years ago

The core is coming together well. I should be able to finish the second function that the vmfactory uses and begin creating unit tests for the core tomorrow.

itsjohncs commented 10 years ago

I didn't update this last week. I did some good work on Friday and I was in the middle of implementing the full functionality of the unregister function, which now also functions as a repair function. The idea here was that making the processes have a sticky local ID was going to be very tricky and require coordination from the supervisord process.

So instead I'll make the repair happen when the node is unregistered. This will let me put an unregister in a finally clause so that only extremely fatal errors (kill -9 for example) will prevent the node from fixing itself, and a warden component as described in #403 would handle such cases well by unregistering dead nodes.
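
A minimal sketch of the finally-clause idea, assuming hypothetical `register`, `unregister`, and `work` callables that are passed in (the real signatures in the core module are not shown in this issue):

```python
def run_node(register, unregister, work):
    """Register a node, run its work, and always attempt to unregister."""
    node_id = register()
    try:
        work(node_id)
    finally:
        # Runs on normal exit and on ordinary exceptions. It cannot run
        # if the process is killed outright (kill -9), which is the case
        # a warden component would need to cover.
        unregister(node_id)
```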

itsjohncs commented 10 years ago

Did more work on the unregister function today but a large majority of my workday was spent on SI and a phone interview. UCR also wanted a patch to v0.2stable that I'm actually pushing in a moment. I'll also push my bits of work on the unregister function since I don't like to keep very much work on solely my local copy of the repo (for obvious reasons).

Tomorrow I should continue with the unregister function and move on to the other function the vmfactory needs. Hopefully, finally, I can start writing test cases tomorrow. This will also give me time to revisit the vagrant stuff and update the documentation based on my usage of it.

itsjohncs commented 10 years ago

I'm starting on the test cases now.

I'll definitely want to use py.test. So the first hurdle will be getting a reliable connection to Redis. I think the testing code should just take in the Redis server information from the user starting the tests, and it can be up to the user to ensure that the Redis server is properly configured (which will hopefully be done at some point by a bootstrapping script leveraging vagrant, started by an automated testing server). This will allow for maximum portability and prevent huge complexity in the Redis fixture.
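
One way the test code might take in the server information is through environment variables. This is only a sketch: the variable names `GALAH_TEST_REDIS_HOST`, `GALAH_TEST_REDIS_PORT`, and `GALAH_TEST_REDIS_DB` and the helper name are hypothetical, and the issue does not say which mechanism was actually used. The only fixed fact here is Redis's default port, 6379.

```python
import os


def redis_settings(environ=os.environ):
    """Read Redis connection details supplied by whoever starts the tests."""
    host = environ.get("GALAH_TEST_REDIS_HOST")
    if not host:
        # Fail loudly instead of guessing at a server to connect to.
        raise RuntimeError("Set GALAH_TEST_REDIS_HOST to run the Redis tests.")
    return {
        "host": host,
        "port": int(environ.get("GALAH_TEST_REDIS_PORT", "6379")),
        "db": int(environ.get("GALAH_TEST_REDIS_DB", "0")),
    }
```

A py.test fixture could call this helper and hand the resulting dictionary to whatever creates the connection, which keeps the fixture itself small.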

itsjohncs commented 10 years ago

Got the first unit test up and running and it's a solid one. Once the testing stuff was finished it was a lot easier to debug problems with the register functions. The RedisMonitor is insanely useful, so much better than having telnet open in the background.

In order to properly test the vmfactory_grab function now, I'll have to implement some methods for adding VMs into the system. Should be straightforward.

Happy happy day :grinning:! Not sure why making unit tests always makes me feel so good but it really does.

itsjohncs commented 10 years ago

Very close now, writing the function responsible for marking the work as finished. Then I'll just make a few more test cases and I can finally create the vmfactory component for real.

I'm not sure of the best way to do high-level testing of the component itself yet. I'd like it to be tied into py.test but I'm not sure how doable that'll be. It might not be entirely feasible to do testing at the component level, though I'd really like it to be. I have the weekend to think about it at least.

itsjohncs commented 10 years ago

Those Lua scripts are getting less and less maintainable, and there's not that many yet. I'm going to try using WATCH and EXEC and such instead where possible.

itsjohncs commented 10 years ago

Finished up a little of what I was doing yesterday. I started implementing the changes hinted at in my last comment and transactions are now used in the unregister function rather than a Lua script. It's not very pretty, but I like it a lot more than the Lua script. I might play around with it a bit more to try and find a pattern that looks good (right now I'm doing some exception trickery). Hopefully I'll be able to work on it a bit more tomorrow.

itsjohncs commented 10 years ago

I've thought a lot about whether I can improve the "beauty" of the pattern I implemented in the unregister function, and I don't think I'll be able to do anything all that much better. The design of transactions is just kind of a hacky thing (get a failure and retry) so there is at least going to be a bit of nesting. The usage of exceptions to force a return in the calling function won't be used all the time, but as far as Python is concerned it seems like a fairly graceful solution. Exceptions aren't really exceptional in Python, they're just another tool (an example I often use when making this argument is the StopIteration exception).
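
A rough sketch of the pattern described here: retry on a failed transaction, and use an exception to force a return in the calling function. It is not the actual Galah code. `FakeStore` is an in-memory stand-in that imitates WATCH/EXEC with per-key version counters, and `EarlyReturn`, the `node:` key prefix, and the `transaction()` signature are assumptions for illustration.

```python
class WatchError(Exception):
    """Raised when a watched key changed before the transaction committed."""


class EarlyReturn(Exception):
    """Raised inside a transaction body to make the caller return a value."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


class FakeStore:
    """In-memory stand-in for Redis with per-key version counters."""

    def __init__(self):
        self.data = {}
        self.versions = {}

    def set(self, key, value):
        self.data[key] = value
        self.versions[key] = self.versions.get(key, 0) + 1

    def commit(self, key, seen_version, writes):
        # Imitates EXEC: apply nothing if the watched key changed.
        if self.versions.get(key, 0) != seen_version:
            raise WatchError(key)
        for k, v in writes.items():
            self.set(k, v)


def transaction(store, key, body, max_attempts=10):
    """Run body(store, writes), retrying while the watched key changes."""
    for _ in range(max_attempts):
        seen = store.versions.get(key, 0)    # imitates WATCH
        writes = {}
        try:
            body(store, writes)              # read, then queue writes
            store.commit(key, seen, writes)  # imitates EXEC
            return True
        except WatchError:
            continue                         # someone else won; try again
    raise RuntimeError("gave up after %d attempts" % max_attempts)


def unregister(store, node_id):
    key = "node:" + node_id  # hypothetical key layout

    def body(store, writes):
        if key not in store.data:
            raise EarlyReturn(False)  # nothing to do; bail out of the caller
        writes[key] = "unregistered"

    try:
        return transaction(store, key, body)
    except EarlyReturn as e:
        return e.value
```

`EarlyReturn` is not caught inside `transaction()`, so it passes straight through the retry loop and is handled only in `unregister()`. That is the "force a return in the calling function" part; the nesting of `body` inside `unregister` is the unavoidable bit.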

itsjohncs commented 10 years ago

I think it might be nice to add a decorator to the function that denotes the key to watch. It seems awkward having it be a part of the call to transaction(). I'll make this change if I feel like it later since it's trivial and I'm not sure how much better it will be.
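
A small sketch of what such a decorator might look like. The names `watches` and `watched_key` are hypothetical, and `transaction()` here is a stub that only shows where the key would come from rather than a working transaction.

```python
def watches(key):
    """Attach the key to watch to a transaction body function."""
    def decorate(func):
        func.watched_key = key
        return func
    return decorate


def transaction(body):
    """Stub: reads the watched key from the decorated body."""
    key = getattr(body, "watched_key", None)
    if key is None:
        raise TypeError("body must be decorated with @watches(key)")
    # A real implementation would WATCH the key, run body, then EXEC.
    return key, body()


@watches("vmfactory_nodes")
def remove_node():
    return "removed"
```

With this shape the call site shrinks to `transaction(remove_node)`, and the key lives next to the code that depends on it.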

itsjohncs commented 10 years ago

I did some work on Mango Engine yesterday in the hopes of improving the configuration logic. The issues 14 to 16 in the Mango Engine issue tracker should be completed to facilitate this. But as would be expected, the changes are not all that trivial because they involve some design decisions in Mango Engine. I'll try to tackle these this weekend, but for now I'll just use the old configuration code.

itsjohncs commented 10 years ago

I've made a few decisions regarding the design of the vmfactory.

  1. The bootstrapper will be installed onto the VM by the vmfactory and it will sit there listening on a socket until a testrunner connects to it. The testrunner will send over all of the files needed through the socket, along with any metadata regarding the test request. This will remove any knowledge of the underlying virtualization provider from the testrunners and make things a little simpler.
  2. The virtualization suites will be called providers instead since that aligns more closely with the vocab used in other projects such as vagrant and packer.

itsjohncs commented 10 years ago

Apparently CTIDs from 1 to 100 are supposed to be reserved for the future per the OpenVZ docs. Good to know...

itsjohncs commented 10 years ago

The vmfactory can now create and destroy containers with the functions implemented in the OpenVZProvider. Now I need to make it capable of preparing the virtual machines. This will involve installing the "bootstrapper" into the vm.

itsjohncs commented 10 years ago

Yesterday was unfortunately packed with other duties, but I was able to get started on the bootstrapper code. I've made some sketches and it looks like the bootstrapper will be a sort of extended state machine. I think the best way to implement it will be to create a number of classes to represent each state, each with an init, run, handle_command, and teardown function. The return values would be other instantiated state objects. I should work on this more tomorrow.
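
A minimal sketch of that state-per-class idea. The method names come from the comment above; the concrete states (`Waiting`, `Receiving`, `Finished`), the command strings, and the `drive` helper are invented for illustration, and returning `None` to mean "stay in this state" is an assumption. Only `handle_command` is exercised here to keep the example short.

```python
class State:
    """Base class: each state has init, run, handle_command, and teardown."""

    def init(self):
        pass

    def run(self):
        """Return the next state object, or None to stay in this state."""
        return None

    def handle_command(self, command):
        """Return the next state object, or None to stay in this state."""
        return None

    def teardown(self):
        pass


class Waiting(State):
    def handle_command(self, command):
        if command == "connect":
            return Receiving()
        return None


class Receiving(State):
    def init(self):
        self.files = []

    def handle_command(self, command):
        if command == "done":
            return Finished(self.files)
        self.files.append(command)  # treat anything else as a file name
        return None


class Finished(State):
    def __init__(self, files):
        self.files = files


def drive(state, commands):
    """Feed commands to the machine, switching states as they are returned."""
    state.init()
    for command in commands:
        next_state = state.handle_command(command)
        if next_state is not None:
            state.teardown()
            state = next_state
            state.init()
    return state
```

Because transitions are just return values, each state class only needs to know about the states it can hand off to.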

itsjohncs commented 10 years ago

I have written entirely too many network protocols in my life. Why doesn't Python ship with a simple way to frame messages?
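
For context, one common way to frame messages by hand is a fixed-size length prefix. The sketch below uses only the standard library's `struct` module. It is not necessarily the framing the bootstrapper uses, and the 4-byte big-endian header is an assumption.

```python
import struct

HEADER = struct.Struct("!I")  # 4-byte big-endian unsigned length


def frame(payload):
    """Prefix a bytes payload with its length."""
    return HEADER.pack(len(payload)) + payload


def unframe(buffer):
    """Split complete messages off the front of buffer.

    Returns (messages, remainder), where remainder holds any partial
    message that needs more bytes before it can be decoded.
    """
    messages = []
    while len(buffer) >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer)
        end = HEADER.size + length
        if len(buffer) < end:
            break  # body not fully received yet
        messages.append(buffer[HEADER.size:end])
        buffer = buffer[end:]
    return messages, buffer
```

A receiver would append each chunk read from the socket to the remainder and call `unframe` again.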

itsjohncs commented 10 years ago

The bootstrapper is done as far as I can tell. Now to make the vmfactory install the bootstrapper onto the VMs.

itsjohncs commented 10 years ago

So what I'm noticing is that I haven't yet thought about how to store metadata for the VMs which I would really like to be able to do. We could embed the metadata into the VM's ID but I don't think that'll end up working very well. I'm going to spend some quality time with the spec sheets (including the bootstrapper one) for the rest of today I think.

itsjohncs commented 10 years ago

The metadata I want to store will depend greatly on the type of virtualization which is being used. For the OpenVZ machines, I want to store just two pieces of information:

The CTID is of course very specific to OpenVZ, but I think most virtualization solutions we'd use would expose the VM through an interface we could hit through the IP address. This makes me think to store the IP address specially, then have kind of a "misc metadata" thing.

In reality though, just storing a blob of JSON metadata might work better. I could have a function that returns/sets a dictionary containing all of the metadata for a VM, and that could store the IP address along with any other data. The test runner will be able to ignore any data it doesn't want to use.

itsjohncs commented 10 years ago

I've decided to simply store a blob of JSON metadata. Not positive on the interface to it yet.
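
Since the interface is still undecided, here is only a sketch of the get/set pair floated earlier. A plain dict stands in for Redis, and the function names, the `vm_metadata:` key prefix, and the `ip`/`ctid` field names are hypothetical.

```python
import json


def set_vm_metadata(store, vm_id, metadata):
    """Serialize the whole metadata dictionary as one JSON blob."""
    store["vm_metadata:" + vm_id] = json.dumps(metadata)


def get_vm_metadata(store, vm_id):
    """Return the metadata dictionary, or an empty one if none is stored."""
    raw = store.get("vm_metadata:" + vm_id)
    return json.loads(raw) if raw is not None else {}


store = {}  # stand-in for a Redis connection
set_vm_metadata(store, "vm-1", {"ip": "10.0.0.5", "ctid": 101})
```

A consumer such as the test runner can read just the fields it cares about, for example `get_vm_metadata(store, "vm-1").get("ip")`, and ignore the rest.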

itsjohncs commented 10 years ago

Storing blobs of metadata has worked out excellently. I've finished the VZProvider as far as I can tell and it's time for the final layer of all this, the code in the VMFactory to bring it all together :).