Open rkdarst opened 5 years ago
Hmm, I am pretty sure it was the case that in the notebook when you copy a cell it does not copy the metadata. Are your students by chance either using JupyterLab (which I think may not implement this feature), or a very old version of the notebook? It's also possible there's been a regression.
In any case, I agree it would be nice if nbgrader handled this more gracefully. Certainly the error message could be better at least. I'll look into options for how to handle this better and report back.
In our provided hub, courses use Notebook by default (5.7.2 it seems we are using now). But Lab is available, and the way to switch between them is documented but not. One previous course also instructed their students to download notebooks and move them to another service for actually doing the assignments - who knows what errors could be inserted in this process!
I'm not an instructor and don't interact with the students, so don't know that much about what students are actually doing and can't ask them questions.
Random thought: are the old values of the cells re-inserted during the fetch
stage or the autograde
stage? Because in both of the cases where I saw this, a cell was duplicated but both of the sources were the same. I wonder if this is a coincidence (they dupilacated cells and didn't edit them) or by design (same cell IDs are updated with the same contents, meaning it really doesn't matter if a duplicate is dropped).
Hmm, I would expect your version of the notebook is new enough (it's a really old version that doesn't support it). Maybe they're using JupyterLab or maybe there's been a regression 😕
Regarding your last question, the answer depends on the cell type I think... for read-only cells (including cells with tests in them) this is the intended behavior in case students manage to edit them somehow, and so I wouldn't be surprised if it just worked with the duplicated cells too. But for other types of cells that have nbgrader ids (e.g. duplicate answer cells) it shouldn't happen that they have the same source, unless the student copied the source.
I experienced this issue as well in the class I recently TA’d (nbgrader 0.5.4 on jupyterhub 0.9.4). My solution was a script that merges cells when it detects duplicate grade_ids by concatenating the cell’s source to the first then removing the duplicate.
I uploaded the script as part of a small collection I wrote for that course in case it’s of use to anyone https://github.com/elesiuta/jupyter-nbgrader-helper (run with --fix AssignName NbName.ipynb). As an aside, if there’s any feature in there you think might be useful to implement properly into nbgrader I’d be happy to open a pull request and work on it.
This also happened when I used the Jupyter notebook split cell editing (CTRL + SHIFT + -
). The new cell created as a result of the split has the same cell-ID (as the original cell). Having the same cell-ID for two cells prevents me from saving the notebook. Annoying!
Likely something to do with this - #758?
I'm having the same problem. You can accidentally copy a read-only cell and then you can't delete it. This causes nbgrader validate
to crash down the road.
jupyPython version 3.7.4 (default, Aug 13 2019, 20:35:49) [GCC 7.3.0] nbgrader version 0.6.0
I can confirm this is an issue, and it sounds like it is a regression in the notebook as this was not a problem previously.
This also causes problems with the feedback system, since Instructor needs to manually fix the notebook, i.e. it's hash changes which is used to distribute the feedback form.
It sounds like this is potentially when the regression happened, at least in the context of splitting cells: https://github.com/jupyter/notebook/pull/2349
I am not really sure what the best way to deal with this is. It sounds like there are other extensions that rely on the metadata being copied, but we need it not to be copied because otherwise we can't tell which cell is the "right" one, and there's no way for us to modify information like the cell id when it is copied. I will have to think about what the right way to deal with this is...
Just adding a +1 to this -- I ran in to this exact issue in my course just now!
Same here; our students can work on their assignments in two setups, so I can't guarantee the exact software versions they used, but it should be pretty similar to:
jupyter --version
jupyter core : 4.7.1
jupyter-notebook : 6.3.0
qtconsole : 5.0.3
ipython : 7.21.0
ipykernel : 5.5.0
jupyter client : 6.1.12
jupyter lab : not installed
nbconvert : 5.6.1
ipywidgets : 7.6.3
nbformat : 5.1.2
traitlets : 4.3.3
Luckily our setup let me access the student's assignments, so I could remove the duplicated cell.
Just as a data point: the issue is still there with JupyterLab 3.6.0 and nbgrader 0.8.1.
I have encountered this issue before and see it again in early 2024. Students use an installation based on the latest miniconda
As in the original post, it arises when students duplicate a solution cell for experimentation and leave it in the submitted version. I would favor a solution where duplicate cells are ignored and the first is used for auto-grading. That would reduce a lot of manual work.
Regards Olav
After being itched by this problem for a long time, I believe that the proper solution would be to guarantee that, when a cell with nbgrader metadata is copied by the student, the metadata of that cell is not copied with it. Achieving this would require just a small Jupyter extension that:
copy_metadata
with default truecopy_metadata
is false, only the cell content gets copied, not the metadata.And a small change in nbgrader to set the attribute copy_metadata on all cells with cell id's.
What do you think?
Are you thinking that the flag is a global toggle, or settable at notebook-level [eg, in the notebook metadata, for example]
As a general idea, yes.... in fact - what does get copied? Are the outputs copied too?
... but I think it turned out that most types of duplications didn't matter that much and could basically be ignored.
My experience is that there are occasionally times where students copy grader cells, and those cells do break the autograder.
It happens sufficiently often that I wrote some notes on fixing it: https://noteable.edina.ac.uk/documentation/notebook-recovery/
I think what @rkdarst may have meant is that beyond beyond breaking the autograder (which just tracks all the encountered grade_id
s and stops if it sees one again), duplicate cells don't cause much harm: if you could skip such cells/notebooks, most things would be fine (at least in our use of nbgrader
). So that is what #1753 does: alert the user if something seems to be duplicated but otherwise skip it and move on with the rest of the autograding.
About setting copy_metadata
at cell-level versus notebook-level: a priori I had in mind to follow the path of the existing 'deletable' and 'editable` flags: e.g. cell-level only. But practice may later suggest to also enable global metadata.
About #1753 : I am really glad someone worked on that issue, and I have been meaning to test it for a long time. That being said, even if it works well in practice, it feels like a workaround for fixing a posteriori what should have never happened in the first place: that nbgrader cells are duplicated with there metadata when the end users really just meant to copy their content.
I totally agree on @nthiery 's suggestion. Remove all metadata for copied cells. That would solve everything.
The thing with duplicating cells: how do we know which should have the original content? first or second? what if content was split among two cells? What if duplicated and the "original" was later deleted, maybe because it was used as a test or something? Or original was modified to be come the "copy"
... but actually, according to my theory, it shouldn't matter! If you can ignore duplicate metadata, hopefully you can ignore these problems too. I would really be interested in an analysis of when it actually matters.
Maybe docs should be updated to warn to add metadata in the least number of possible cases, and try to make it obvious to not duplicate them - even if we do other things?
Remove all metadata for copied cells. That would solve everything.
You may not want to remove all metadata e.g. when creating an assignment, then you would have to set all attributes like the nbgrader's cell type and points from scratch. It is safe to only remove metadata that are causing problems, i.e. the IDs.
Thinking about this more: you can actually create and release an assignment that has duplicate IDs from the start.
It would be nice if nbgrader showed warnings on certain actions like releasing an assignment or submitting a solution, and asked the user to either confirm or abort the action. Along with suggesting steps to manually resolve the issue (e.g. create an empty cell next to the bad cell, copy-pasting the content, and deleting the bad cell), this would help a lot without the need to invent a convoluted solution to prevent/workaround the issue automatically.
You may not want to remove all metadata e.g. when creating an assignment, then you would have to set all attributes like the nbgrader's cell type and points from scratch. It is safe to only remove metadata that are causing problems, i.e. the IDs.
Yes, the idea is not to change how cells behave when editing the assignment sources. Like the currently used read_only
flag the remove_cell_metadata
flag would be set by nbgrader on the student version.
I don't know if it belongs here, but a related issue is when you split a cell then they will also get the same metadata. At least one gets a warning immediately
Duplicate nbgrader cell ID
The nbgrader ID "cell-6ba63f0e4a4f7961" has been used for more than one cell. Please make sure all grade cells have unique ids.
@vahtras: yeah, that's an annoyance too; presumably harder to treat as for copied cell. Luckily, most of our students don't know how to split cells, so that's only an annoyance for instructors when editing assignments.
Hey. This issue with students copying grading cells is the most common fail for auto grading I have.
I think that making these cells not copyable would be the right solution to encourage the correct behaviour.
If they want new cell they should insert them.
This is similar to #981, as in that may solve the problem, but I'll document core problem here and a workaround.
Students have submitted notebooks in which they have copied cells. This duplicates the metadata: of importance here is
"nbgrader": {"grade_id": ...}
. When this happens, the following error message is given:The first warning is the key to the problem. However, the error message shows that nbgrader metadata should be updated, which not the real problem here (though I haven't tested it if actually works). At least, the error message could be improved or the failure could be noticed earlier.
It isn't a perfect solution, but it would be nice if nbgrader could also continue on in this case. Clearly the student has done something weird (and notebook/jupyterlab could solve it by not duplicating metadata). But could nbgrader also survive this and make some choice one way or the other itself, considering that duplicating metadata is probably a type of problem that will appear again? That way the common case won't cause a failure stopping all autograding... and we leave it to instructors to tell of the problem later.
Who knows what the right nbgrader-only solution would be - simple would be to pick the last cell with the metadata. But many corner cases could break this.
Workaround for anyone else who gets this problem: My tests showed that you have to delete the whole
nbgrader
metadata dict from the notebook cell - changing thegrade_id
is not enough because it can't find that grade_id in the database, nor is removing just thegrade_id
cell.Operating system
Linux
nbgrader --version
Modified version, based on upstream d56f75b (close to current master).