AThilenius opened this issue 8 years ago
Hi Alec,
Thanks for the thoughts. Here's my take on each:
The first (and a lot easier than you think) is to add real-time streaming of the child process stdout and stderr. This is really easy to do with something like socket.io. Just pipe stdout/stderr into an unguessable UUID channel, e.g. socket.write(run_id, stdout); you can just use the run UUID for this. Just like you do now, return that UUID to the user when they click run.
This would first require us to change the way the COG core server executes grading scripts. Currently COG doesn't even save the grading script output until the script has completed. Thus, there's no data for the UI to even display until the grading script has finished running. Changing this isn't impossible, but it will require thinking a bit about the grading script execution process and interface. Once that supports tracking of streaming grading script output, we can look at having the web interface display that data. See https://github.com/asayler/COG/blob/master/cogs/testrun.py, https://github.com/asayler/COG/blob/master/cogs/tester_script.py, and https://github.com/asayler/COG/blob/master/cogs/env_local.py for some of the files and abstractions currently involved in that process.
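To make the backend side concrete, here's a minimal sketch of streaming a child's output as it is produced, assuming Python 3. The stream_grading_script function and the publish callback are hypothetical, not COG's current interface; in the socket.io version, publish would emit on a channel named after the run UUID:

```python
import subprocess
import uuid

def stream_grading_script(cmd, publish):
    """Run a grading script, forwarding output as it is produced.

    `publish(run_id, line)` is a hypothetical callback (e.g. a
    socket.io emit on a channel named after the run UUID).
    """
    run_id = str(uuid.uuid4())
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # interleave stderr with stdout
        text=True,
        bufsize=1,                 # line-buffered
    )
    # Push each line out as the child emits it, rather than saving
    # output only after the script completes.
    for line in proc.stdout:
        publish(run_id, line)
    return run_id, proc.wait()
```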
Add a JS-based parser for output to pull the important stuff out. It makes my eyes bleed looking through thousands of lines of console vomit to find the failing test. Also easier to do than it might seem. This is actually a pretty big one. The C++ testing framework I wrote focuses very heavily on creating super compact and easy-to-read output.
A lot of this is more related to the way each course crafts its grading scripts than to COG itself. E.g. 3155's use of SBT for its test framework can make reading the output more difficult than a hand-crafted grading script might. Having COG help extract info from the grading script output is non-trivial since COG must support an arbitrary array of grading scripts, each of which may interact with a variety of languages, testing suites, programs, etc. E.g. there is no standard COG output format since COG is a general-purpose grading system. Thus, processing grading script output data in a structured manner is challenging. Some of this has been brought up before in #11, but we've never really settled on an approach or solution. Suggestions welcome.
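For what it's worth, here's a deliberately naive sketch of the kind of per-course extraction hook this would require. Since there is no standard output format, the failure pattern below is purely illustrative and would have to be supplied by each course:

```python
import re

# Illustrative only: with no standard COG output format, a pattern like
# this would have to be supplied per course / grading script.
FAILURE_RE = re.compile(r"(FAILED|FAILURE|\berror\b)", re.IGNORECASE)

def extract_failures(raw_output):
    """Return just the lines that look like test failures, so the UI
    can surface them instead of the full console dump."""
    return [line for line in raw_output.splitlines()
            if FAILURE_RE.search(line)]
```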
Keep a run history. If I had to guess, you store all runs forever already, keyed by run UUID and user UUID. It would have been very helpful in my case to have access to those old UUIDs, and thus the ability to pull their results.
This is actually in the works. See #12. Also, the COG CLI already has support for this (although you currently need COG admin rights to really make use of it).
Add color escape code parsing and run the child processes in TTY mode to preserve color escapes. This will allow for colored terminal output (again, making it easier to read).
This has come up before (again, see #11). We could consider trying to pass shell color coding through to the COG Web UI, but, as with your first point, this would likely need to start with backend modifications, not just UI changes. We use the python subprocess library for executing grading scripts, so that's your interface for capturing and passing color codes. That said, I don't think I'm willing to start executing grading scripts in a TTY/shell context, since that opens up a whole slew of security issues I'd rather avoid (shell injection, etc). COG treats all grading scripts as untrusted, and any changes need to reflect that fact. We might be able to relax some of these requirements a bit once COG adds support for running grading scripts via Docker containers or other isolated environments, but the current local sandboxing isn't a good match for executing untrusted code in a shell context running as the COG user.
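As an aside, a TTY often isn't strictly required for color: most tools disable ANSI codes only because stdout fails isatty(), and many can be forced to emit them anyway. A rough sketch, assuming Python 3; CLICOLOR_FORCE and FORCE_COLOR are common env-var conventions, but which knob works is entirely tool-specific:

```python
import os
import subprocess

def run_with_color(cmd):
    """Capture grading-script output with ANSI color codes intact,
    without allocating a TTY."""
    # Many tools only disable color because stdout isn't a terminal;
    # most can be forced back on (via env vars like these, or flags
    # like --color=always, depending on the tool).
    env = dict(os.environ, CLICOLOR_FORCE="1", FORCE_COLOR="1")
    result = subprocess.run(cmd, env=env, capture_output=True, text=True)
    # result.stdout still contains raw ESC[...m sequences; the web UI
    # would translate those to HTML (e.g. with a library like ansi2html)
    # rather than rendering them verbatim.
    return result.stdout, result.stderr
```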
My immediate priorities are tying up a few loose ends in the current feature set and getting COG 1.0 out the door. After that, I have some fairly significant backend refactoring I'd like to do (including a move to Python3), and maybe some of this could be addressed then.
As always, however, the fastest way to get your feature requests added is to send me a pull request!
Thanks again, Andy
P.S. It's a bit nicer to break a list of requests like this into a separate issue for each request. That makes it easier to track efforts on one vs the others and to close out specific requests as they're completed. No worries this time, but something to consider for your next feature request list.
It's more of a note than a feature request. I won't be around to see any of this added.
The backend can be simplified from a security standpoint. Spool up a Docker container for each run, and register a route-to-IP mapping in a reverse proxy. For example, the route /cog/zEh6nr2ulLHwsROM maps to the IP 172.xx.xx.xx on a protected subnet. This gives that user direct access to their own Linux instance. Now the socket.io server runs inside the Docker container in a completely isolated environment. The daemon that hosts socket.io can bind to what it thinks is port 80 and run whatever the user wants to send it. This implicitly means it runs with the same permissions as the user. All the security risks mentioned above no longer exist in this context. TTY can be freely used; in fact, your GUI can run any arbitrary shell command it likes. In the case of what I did with ScorchForge, it exposes a TTY xterm directly to the user.

Doing it that way has a huge list of other benefits as well, not the least of which is security. The Docker Swarm backend will let you run the untrusted code on a different physical (or virtual) machine, or cluster of machines, for an added layer of security. This also scratches horizontal scaling (sharding of user code) off the list. Results are reported back over a public (authorized-access) REST API. I have yet to find a bulletproof way to 'guarantee' they don't cheat the tests; their code has access to the running process's memory space, after all. I never had a problem with it, though.
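For concreteness, here is roughly what the container-per-run piece might look like with the Docker SDK for Python. The image name, the cog_runs network, and the proxy-registration step are all made-up placeholders, not anything COG ships today:

```python
import secrets
import docker  # the `docker` package (Docker SDK for Python)

client = docker.from_env()

def start_run_container(image="cog-grader:latest"):
    """Start one isolated container per run and return the route token
    a reverse proxy could map to the container's IP.

    The image name and the `cog_runs` network are placeholders, and the
    actual proxy registration (nginx, Traefik, ...) is omitted.
    """
    token = secrets.token_urlsafe(12)   # e.g. /cog/zEh6nr2ulLHwsROM
    container = client.containers.run(image, detach=True, network="cog_runs")
    container.reload()                  # populate NetworkSettings
    ip = container.attrs["NetworkSettings"]["Networks"]["cog_runs"]["IPAddress"]
    # Here: register /cog/<token> -> http://<ip>:80 with the proxy.
    return token, ip, container
```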
I don't have any time to devote to anything but ScorchForge.com at this point, I'm sorry. School has already taken a back seat to it :p Best of luck though!
After watching what happened in PL, I would also like to add that Docker containers have the ability to limit memory usage (with optional per-container page files), disk usage and IO, and network IO, among a dozen other things that can be configured. They also have very sophisticated CPU time-sharing (both usage and time-slice duration) with optional core affinity. You can also keep a user's container around and reboot it when they run code on the same assignment again, meaning you can do progressive builds. And like I said before, Docker Swarm lets you scale horizontally to infinity, with new swarm nodes registering themselves as load surpasses the current hardware limits.
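Those knobs map directly onto parameters of the same containers.run() call in the Docker SDK for Python; the values below are placeholders, not recommendations:

```python
import docker

client = docker.from_env()

# Placeholder values; all of these are real parameters of
# containers.run() in the Docker SDK for Python.
container = client.containers.run(
    "cog-grader:latest",   # hypothetical grader image
    detach=True,
    mem_limit="256m",      # hard memory cap
    memswap_limit="512m",  # memory + swap (the per-container "page file")
    cpu_period=100000,     # scheduling period in microseconds
    cpu_quota=50000,       # at most half a core per period
    cpuset_cpus="0,1",     # core affinity
    pids_limit=128,        # caps process count (fork-bomb protection)
)
```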
You don't need to sell me on Docker - I have plans to move to it at some point. But given my current time constraints, that's unlikely to happen before this summer. As always, pull requests are welcome.
And COG does already limit memory usage, CPU time, etc. using rlimits. It's just coarser-grained than what could be done in Docker.
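For readers following along, rlimit-based limits in Python look roughly like this; a generic sketch using the standard resource module, not COG's actual code (see the files linked above for the real implementation):

```python
import resource
import subprocess

def limit_child():
    # Runs in the child between fork() and exec(); limit values here
    # are arbitrary examples.
    resource.setrlimit(resource.RLIMIT_CPU, (10, 10))                   # CPU seconds
    resource.setrlimit(resource.RLIMIT_AS, (256 * 2**20, 256 * 2**20))  # address space
    resource.setrlimit(resource.RLIMIT_NPROC, (32, 32))                 # process count

proc = subprocess.Popen(["./grade.sh"], preexec_fn=limit_child)  # hypothetical script
```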
Hey Andy, a few things pop into my head (low-hanging fruit, if you will) that would help COG Web a lot.
There is a lot more, but that's a good start. Please let me know if you have questions; I have a ton of experience doing this stuff from making https://ScorchForge.com. It's not too disparate from your idea, just at a much higher level: it adds a web IDE to the mix, gives each user their own unix shell, and is focused on real-time sharing of code like Google Docs. It's still in very early alpha, so there are bugs-a-plenty.