Snap! should NOT silently reject bad project XML files

cycomachead commented 9 years ago

Whenever Snap! tries to import a XML file it can't parse -- it does nothing, just drops the file w/ no warning.

I think there should be a warning for any type of file that can't be imported. However, the issue of project XMLs is particularly troublesome because often-enough students will submit Word XML files as their project file.

(That is, they copy and past the XML text from the browser into word, and then save that as an XML.)

Personally, I'd also advocate for writing (err, rather finding one...) a basic parser for Word XML files, because it's easy to extract the data from them. (The same is also true of Safari's .webarchive files which students also submit.)

jmoenig commented 9 years ago

Hmm... there are error messages in the XML parser, and you should actually get them in Snap. But, this aside, can you explain the problem? Are you editing XML files by hand?!!

You know, I'm trying very hard to keep Snap from turning into bloatware, and this is the kind of stuff that leaves me puzzled. What's the use case here, is there a problem I can reproduce?

Gubolin commented 9 years ago

they copy and past the XML text from the browser into word, and then save that as an XML

True.

Personally, I'd also advocate for writing (err, rather finding one...) a basic parser for Word XML files

Don't add a Word parser to Snap!. Instead, let the browser download the file directly. This is where a teacher must explain file formats, Snap! should not need to guess what the user wants.

jmoenig commented 9 years ago

Okay, @cycomachead , if students submit Word files that should'nt be Snap's problem at all, same as submitting a picture of a project shouldn't be Snap's problem. Come on! The right way to submit a project is to publish it on the cloud.

cycomachead commented 9 years ago

This is where a teacher must explain file formats, Snap! should not need to guess what the user wants.

We have written dozens of short guides over the semesters and explained the difference a bunch of times in labs and sections how do export files but it is complicated.

Hmm... there are error messages in the XML parser, and you should actually get them in Snap. But, this aside, can you explain the problem? Are you editing XML files by hand?!!

Yes, I manually nearly every CS10 student submission that has a file type error. Most of them I can recover by hand or fix in easy ways.

However, in the case of Word files, I agree an entire parser/extractor is more work than it's probably worth**. The problem is Word files exported as .XML carry no outside difference to the user as to the format, and Snap! silently drops them. If someone imports a project w/ valid, non- Snap! XML, then I think it would be appropriate for Snap! to say "I'm sorry, the project you are importing is not a valid Snap! file."

The right way to submit a project is to publish it on the cloud.

This additionally has it's own set of caveats and complications, but for grading, verifying a student hasn't modified the file is hard. Plus students still have a hard time getting the shared URLs for projects. (#579)

**On the note about what applications should do -- it's purely personal preference at some point, but my personal feeling is that if you know what a user is trying to do, you should do it even if they're wrong.

bb010g commented 9 years ago

What about an error message with a comment about how they need to make sure that their XML files are just that, and not saved from Word or downloaded using their browser's page save function?

jmoenig commented 9 years ago

So, I'm considering ditching XML altogether (just thinking about it for now, don't get all excited just yet) and switching (back) to a combination of JSON and others. But I just now realize that there might be unanticipated problems with this as you're depending on "files" to be exported and "turned in". Thoughts?

cycomachead commented 9 years ago

So, I'm considering ditching XML altogether (just thinking about it for now, don't get all excited just yet) and switching (back) to a combination of JSON and others. But I just now realize that there might be unanticipated problems with this as you're depending on "files" to be exported and "turned in". Thoughts?

YES!, at least to JSON. What would the "others" be? As far as JSON vs XML I think either format would behave the same in the current way we are dealing with files. It would be conceivable to use JSON as a serialized format and still be exported in the same manner, and it'd have a few advantages over XML (imo).

I don't think you want to know how many times I've created a branch called json and abandoned it. JSON I feel like would be easier to do exciting things for grading projects or more easily working on something like autosave or cross-browser WebRTC communication.

Files, I think we still need because of offline access and because browser storage isn't as reliable, mostly due to size limits. Personally, I've been thinking about, and hacking on at least a tiny bit -- the idea of Zip files, at least for what could be exported. (My only concern is speed). The other benefit of zip files would be that media wouldn't need to be base64 encoded, which would hopefully speed things and keep file sizes more manageable. (The Cloud backend would need to act a bit differently if zip files were sent there.)

Anyway, that's mostly me just playing around with an idea.

As far as turning in files -- what we really need for grading are static submissions that can't be edited once "submitted". This is why files work well, since we're given a copy. Sometimes students can duplicate or "Save As" for a cloud project, but this is also a bit messy because it's very easy for students to accidentally edit the wrong file.

I may get around this by two ways though:

I'm considering implementing OAuth for Snap! that connects to our submission system for Berkeley. The downside is that this process is a bit a slow, and might be difficult to secure.
If we have the new cloud w/ Teacher-Student relationships then a cloud submission to a teacher could optionally duplicate the project on the backend so a teacher could view the state at the time of submission. (This might also happen this semester, at least as a test. I'm taking a class and our group project might be Brian's new Snap! website.)

jmoenig commented 9 years ago

"others" would be a new textual format for Snap scripts that would look and feel like a "regular" programming language, with - maybe - LOGO syntax (or Scheme, or even curly braces, I don't really care). This would also support textual editing of Snap scripts for visually impaired persons, and allow the use of screen readers, text-to-speech utilities and such. But again, at this time all of this is just brainstorming...

bb010g commented 9 years ago

:+1: for JSON. We could start out with a JSON->other converter at first; testing would probably be easier and it would encourage more documentation of Snap!'s export structure.

jmoenig commented 9 years ago

which part of snap's export structure is unclear?

brianharvey commented 9 years ago

Since we're lexically scoped, Scheme makes more sense than Logo. And it'd be easier to parse for a reader for the blind, because a Scheme program is already its own abstract syntax tree.

jmoenig commented 9 years ago

I agree, Brian. Just to be clear for the others: Switching to JSON doesn't have anything to do with making exports more readable by other applications, because for 4.1 the structures are going to become way more intertwined, nested and mishy-mashy.

cycomachead commented 9 years ago

Switching to JSON doesn't have anything to do with making exports more readable by other applications, because for 4.1 the structures are going to become way more intertwined, nested and mishy-mashy.

That may not be the reason, but parsing JSON is a lot easier. As far as the current implementation, I wouldn't say the current serializer is very complex, but because this is native JS, a JSON serializer would only need to implement Prototype.toJSON() for each object type, and return a non-cyclic JS object.

What about 4.1 will make things so much more complex?

brianharvey commented 9 years ago

Objects contain variables and blocks, which may contain lists, which may contain objects, etc. You have arbitrary graphs rather than (mostly) tree structure as in a non-OOP world.

bromagosa commented 9 years ago

+1 for Scheme!

cycomachead commented 9 years ago

You have arbitrary graphs rather than (mostly) tree structure as in a non-OOP world.

But I mean not we can nest sprites in any way, and access properties of sprits, the only thing that really seems missing right now compared to BYOB is the ability to directly interact with a Sprite object. (Which is really just a pointer...) Or is the plan to allow anonymous Sprites?

brianharvey commented 9 years ago

I don't think they'll actually be anonymous, but in other respects they'll be first class. Most importantly, they can be properties of other sprites (or items of lists that are). So it'll be common to have circularities in the graph of things to be serialized.

jmoenig / Snap

Snap! should NOT silently reject bad project XML files #718