autolab / Autolab

Course management service that enables auto-graded programming assignments.
http://www.autolabproject.com/
Apache License 2.0
752 stars 214 forks source link

Issue with parsing the scoreboard hash #209

Closed robsimmons closed 9 years ago

robsimmons commented 9 years ago

The scoreboard hash returned for 122's imagelab is huge, it includes an encoded PNG file. Right now this is failing, seemingly because Autolab isn't correctly parsing the escaped \"

https://autolab.cs.cmu.edu/courses/14/assessments/219/viewFeedback?feedback=484&submission=21657

robsimmons commented 9 years ago

Oh wow: and while the hash is not being parsed correctly for grading, it IS being parsed correctly for the scoreboard... but the scoreboard is sanitizing it

https://autolab.cs.cmu.edu/courses/14/assessments/219/scoreboard :cry: :cry: :cry:

Is anything to be done? Is this the end of imagelab? :crying_cat_face:

dlbucci commented 9 years ago

So for comment 1, it looks like "data" is a key in your JSON, but its value doesn't have a starting quote/double quote. Could that be the issue?

For question 2, I think sanitization is the sane behavior, but we might be able to build in an image feature, because that is a cool use of the scoreboard (even though you're killing our system!).

robsimmons commented 9 years ago

"data" is not a key in my JSON: it's a part of the ridiculous image tag, which is itself the second item in the JSON array. That whole line is being built by json.dumps in Python, so it's hopefully well-formatted.

Let me know if there are any ideas on resolving this temporarily or permanently. Sanitization isn't unreasonable, certainly not in the long term, but there are also probably extremely well-debugged libraries for sanitation in any language people are using to write autograders. I guess except maybe for Standard ML/150/210.

dlbucci commented 9 years ago

Right, I just realized the data thing. The thing is that I don't think we want to rely on instructors/lab authors to be sanitizing autograder output. Without Rails handling the escaping, if one lab wasn't sanitizing output, we would be vulnerable to XSS attacks. I'm not sure I see this being safe, unless rails injects the base64 string into the image tag. I don't think there's a difference between a sanitized and unsanitized base64 string, so this would work for legal images and not insert illegal strings.

robsimmons commented 9 years ago

It's easy enough for me to put just the base64 string in the hash; do you have a sense of what needs to happen on your end to make the rest happen?

dlbucci commented 9 years ago

Yeah, I can add a scoreboard setting that will insert a column's value into the src of an img tag and this should work, right?

robsimmons commented 9 years ago

That'll work, though it might be worth poking around as to whether there are possible attacks there, avoiding the possibility of where {0} ends up being

"/> drop tables etc etc...

A thing that would concievably be easier to validate on your end is to accept the raw png:base64 encoded input, so that your code would look like

or whatever.

On Wed, Jan 28, 2015 at 8:40 PM, Daniel Bucci notifications@github.com wrote:

Yeah, I can add a scoreboard setting that will insert a column's value into the src of an img tag and this should work, right?

— Reply to this email directly or view it on GitHub https://github.com/autolab/Autolab/issues/209#issuecomment-71953838.

Robert J. Simmons simrob.com rjsimmon@cs.cmu.edu robsimmons@gmail.com Cell: 404-273-6890

dlbucci commented 9 years ago

Yeah, I wasn't sure about the initial characters specifying file types and whatnot, but I know that base64 characters don't include any HTML escapees. So if Rails escapes the return value and there's an attack in it, it'll just be bad data for the image tag, and shouldn't be parsed as HTML.

dlbucci commented 9 years ago

So let's do this: pass back just the base64 string, then in your scoreboard JSON, add an "img":true to the column object. I'll update the scoreboards to take this setting being true as meaning render the column like

Sound good?

robsimmons commented 9 years ago

It'll have to be tomorrow, but if you set up from your end I'll make it happen early tomorrow. On Jan 28, 2015 8:57 PM, "Daniel Bucci" notifications@github.com wrote:

So let's do this: pass back just the base64 string, then in your scoreboard JSON, add an "img":true to the column object. I'll update the scoreboards to take this setting being true as meaning render the column like

Sound good?

— Reply to this email directly or view it on GitHub https://github.com/autolab/Autolab/issues/209#issuecomment-71955522.

robsimmons commented 9 years ago

The images assignment for Pittsburgh now dumps only a base64-encoded PNG, nothing else, into the hash. Can't do the thing you asked me to because I get

Colspec unknown key('img') in scoreboard[1]

dlbucci commented 9 years ago

I just pushed a fix to allow that, and updated your scoreboard schema, and it appears to work!

robsimmons commented 9 years ago

Beautiful! I suppose there's still a parsing issue and it doesn't make sense to close the issue, but that helps us a lot.

dlbucci commented 9 years ago

The feedback is still getting a parsing error? It looks okay when I regrade.

robsimmons commented 9 years ago

Oh, our autograder is fine now. But back in the beginning we were giving valid output from the autograder (I am pretty sure) and autolab wasn't handling it correctly.

dlbucci commented 9 years ago

I'm gonna table this issue under the assumption that if it's really an issue (and we haven't heard about anything like this recently, so I think it's not), we'll hear about it again.