google / codeworld

Educational computer programming environment using Haskell
http://code.world
Apache License 2.0
1.24k stars 192 forks

Lost project recovery #407

Open cdsmith opened 7 years ago

cdsmith commented 7 years ago

Students often lose their projects, e.g. by saving over them with the wrong project. Would be great if we had a system for letting them try to recover their work. I have done this manually in the past, by searching the source code, which is stored by hash on the server, and sending them links to the project. But it would be better to have a system that works for everyone, so students could use a recovery tool to provide information about their code and get it back.

The risk here is that students might get access to other students' code.

Ideas:

  1. Require that they give enough unique words to isolate the program sufficiently; if there are too many programs, insist on more words.
  2. Ask follow-up questions that have students identify more words used in the programs, with enough redundancy that guessing won't work.
  3. Keep IP addresses for code and prefer or restrict to results that geolocate to the same area. (Probably a bad idea)
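
Idea 1 could look roughly like the following. This is only a sketch under assumptions: the `programs` dictionary (program ID to source text), the function names, and the `max_matches` threshold are all hypothetical and do not reflect how CodeWorld stores or looks up code. It returns results only when the given words narrow the candidates down far enough, and otherwise asks for more words without revealing anything.

```python
import re


def tokens(source):
    """Return the set of identifier-like words found in a program's source."""
    return set(re.findall(r"[A-Za-z_][A-Za-z0-9_']*", source))


def recover(programs, words, max_matches=1):
    """Look up programs containing every given word.

    programs: dict mapping a program ID to its source text (hypothetical layout).
    Returns (status, ids). IDs are revealed only when the words isolate
    at most max_matches programs.
    """
    wanted = set(words)
    if not wanted:
        return ("need_more_words", [])
    matches = sorted(pid for pid, src in programs.items() if wanted <= tokens(src))
    if not matches:
        return ("not_found", [])
    if len(matches) > max_matches:
        # Too many candidates: reveal nothing and insist on more words.
        return ("need_more_words", [])
    return ("found", matches)
```

Returning an empty list in the `need_more_words` case is deliberate: the number of matching programs is itself information that could help someone fish for other students' code.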

nixorn commented 5 years ago

Does recovery mean taking deployed code from data/<buildMode>/user? As far as I can see, data/<buildMode>/user contains all projects, with no separation by user. The only way I see to prevent recovering other users' projects is to insert a user ID into each deployed source. But:

  1. Searching for a particular user's projects in a grep-like way seems expensive.
  2. It would most likely require migrations.

This issue, together with #880, #13, #441, and #452, makes me think about a centralized projects database, in which we could store projects with user IDs, links to deployed sources per version, save dates (to auto-clean old files), gallery options, etc.

cdsmith commented 5 years ago

No, I don't want to change anything so dramatically.

What I've done at times in the past is, if a student can tell me when they last worked on a program and some distinctive variable names that they used, search through the list of all programs and look for source code that's a match for what they said. This can be done with grep and find. The proposal is just to automate this process so that students can try to recover a lost project on their own.
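
For illustration, the manual find + grep search might be automated along these lines. This is a rough sketch, not the actual procedure: it assumes source files sit somewhere under a single directory tree and that a file's modification time approximates when the student last worked on it. The function name and parameters are hypothetical.

```python
import os


def search_sources(root, words, since, until):
    """Return paths under root whose modification time lies in [since, until]
    (Unix timestamps) and whose text contains every word in words.

    Matching is by plain substring, like grep without -w.
    """
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            mtime = os.path.getmtime(path)
            if not (since <= mtime <= until):
                continue  # the "find" step: outside the time window
            with open(path, encoding="utf-8", errors="replace") as f:
                text = f.read()
            if all(word in text for word in words):  # the "grep" step
                hits.append(path)
    return sorted(hits)
```

A call such as `search_sources("/path/to/sources", ["wobblyCat", "spinSpeed"], start, end)` would list candidate files, where the path and the variable names are placeholders. Checking the time window first avoids reading files that cannot match.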

I'm not sure this is a good idea. It's definitely an abuse risk, and could be used by students to fish for code that is not their own. It's also an expensive operation, so it would need to be rate-limited. Still, I hate to have something that's been valuable for me unavailable to others teaching on the platform. Hence this issue.
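
One possible shape for the rate limiting is a sliding-window limit per client. The sketch below is an assumption about how that could be done, not a description of anything in CodeWorld; the class name, the client identifier, and the limits are all hypothetical.

```python
import time
from collections import defaultdict, deque


class RateLimiter:
    """Allow at most max_requests per client within any window_seconds span."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested
        self.history = defaultdict(deque)  # client ID -> recent request times

    def allow(self, client_id):
        """Record a request and return True if it is within the limit."""
        now = self.clock()
        recent = self.history[client_id]
        # Drop requests that have aged out of the window.
        while recent and now - recent[0] >= self.window:
            recent.popleft()
        if len(recent) >= self.max_requests:
            return False
        recent.append(now)
        return True
```

A tight limit, for example `RateLimiter(max_requests=3, window_seconds=3600)`, would address both concerns at once: it caps the cost of the search and slows down anyone guessing at words to find code that is not theirs.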

nixorn commented 5 years ago

So should it be a handler that filters the input against a hardcoded blacklist (to remove words like program, main, max, drawingOf, and others that are likely to appear in every program), runs find + grep, and returns the results?
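
The filtering step of such a handler might be sketched as follows. The blacklist here contains only the four words named above; a real list would need more entries, and the names `COMMON_WORDS` and `distinctive_words` are hypothetical.

```python
# Words assumed to appear in nearly every program, so they identify nothing.
COMMON_WORDS = frozenset({"program", "main", "max", "drawingOf"})


def distinctive_words(words, blacklist=COMMON_WORDS):
    """Drop blacklisted, empty, and duplicate words, preserving input order."""
    seen = set()
    kept = []
    for word in words:
        word = word.strip()
        if word and word not in blacklist and word not in seen:
            seen.add(word)
            kept.append(word)
    return kept
```

Only the words that survive this filter would be passed on to the search, and a request with too few surviving words could be rejected outright.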

cdsmith commented 5 years ago

I don't know exactly how it should work. It's pretty low priority, I think. Let's focus on other things first.