Open afshin opened 2 years ago
I believe this shouldn't be done because it has no effect on the safety of the output
I'm not sure this is true, because I'm not sure one should, in general, treat only the output as unsafe in the trust model.
While trust is used this way in JupyterLab and Jupyter Server, that does not mean someone else isn't using nbformat in a different way.
For example, trust could be used to decide whether to run a notebook from a cron job as root. If the kernel can then be changed, why not have it point to an attacker-controlled executable? Or have code that means different things in two different kernels?
Loosely related previous discussion: `compute_signature` should skip all transient properties not only signature #234
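To make the idea in that linked issue concrete, here is a minimal sketch of a signature that ignores transient metadata. It is not nbformat's implementation: the function name `signature_without_transients`, the key list `TRANSIENT_METADATA_KEYS`, and the placeholder entry `example_transient_key` are all assumptions for illustration, and it uses only the standard library.

```python
import copy
import hashlib
import hmac
import json

# Hypothetical list of transient top-level metadata keys. Only "signature" is
# named in the linked issue title; the second entry is a placeholder.
TRANSIENT_METADATA_KEYS = ("signature", "example_transient_key")


def signature_without_transients(nb, key):
    """HMAC-SHA256 of a notebook dict after dropping transient metadata keys."""
    stripped = copy.deepcopy(nb)  # never mutate the caller's notebook
    for name in TRANSIENT_METADATA_KEYS:
        stripped.get("metadata", {}).pop(name, None)
    # Sorted keys give a stable byte encoding for the same logical content.
    data = json.dumps(stripped, sort_keys=True).encode("utf-8")
    return hmac.new(key, data, hashlib.sha256).hexdigest()


secret = b"demo-secret"  # hypothetical key for the example
base = {"metadata": {}, "cells": [{"cell_type": "code", "source": "1 + 1"}]}

with_transient = copy.deepcopy(base)
with_transient["metadata"]["example_transient_key"] = {"last_opened": "today"}

edited = copy.deepcopy(base)
edited["cells"][0]["source"] = "1 + 2"

same_sig = signature_without_transients(base, secret) == signature_without_transients(with_transient, secret)
diff_sig = signature_without_transients(base, secret) != signature_without_transients(edited, secret)
```

Under these assumptions, adding or changing a transient key leaves the signature unchanged, while editing a cell's source does not.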
I think that there are multiple things that the current trust mechanism attempts to be:
Should we split those into multiple signatures?
I would also argue that the ecosystem is showing signs of being annoyed with the current trust implementation. Any security mechanism that is sufficiently annoying will be circumvented by users (the common example being complexity requirements on passwords). Here are some quick examples of this happening in the wild with the notebook trust:
I think that we could introduce a new granular trust system covering the three use cases as described above in a backward-compatible way, so the roll-out would be gradual and would not be too costly.
Edit clarifying note: for (1) I think that kernel info should be included because different kernels could lead to different results, so a change here would indicate (depending on use-case) some form of tampering or write/read problem; (2) should not include kernel info, as it is irrelevant and annoying; (3) should include kernel info for the reasons outlined above.
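A rough sketch of what per-scope signatures could look like, under stated assumptions: the scope names `integrity`, `outputs`, and `execution` are a hypothetical reading of (1), (2), and (3) in the note above, `granular_signatures` is not an nbformat API, and the HMAC key is a placeholder. It only illustrates which signatures would react to a kernel change.

```python
import copy
import hashlib
import hmac
import json

SECRET = b"demo-secret"  # hypothetical key; a real deployment manages its own secret


def _digest(payload):
    """HMAC-SHA256 over a canonical JSON encoding of the payload."""
    data = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(SECRET, data, hashlib.sha256).hexdigest()


def granular_signatures(nb):
    """Return one signature per hypothetical scope."""
    kernelspec = nb.get("metadata", {}).get("kernelspec", {})
    cells = nb.get("cells", [])
    sources = [c.get("source", "") for c in cells]
    outputs = [c.get("outputs", []) for c in cells]
    return {
        # (1) whole-document integrity: kernel info included
        "integrity": _digest({"kernelspec": kernelspec, "cells": cells}),
        # (2) output display: kernel info deliberately excluded
        "outputs": _digest({"outputs": outputs}),
        # (3) execution: kernel info included alongside the code
        "execution": _digest({"kernelspec": kernelspec, "sources": sources}),
    }


notebook = {
    "metadata": {"kernelspec": {"name": "python3", "display_name": "Python 3"}},
    "cells": [
        {
            "cell_type": "code",
            "source": "print(1 / 2)",  # evaluates differently under Python 2 and 3
            "outputs": [{"output_type": "stream", "name": "stdout", "text": "0.5\n"}],
        }
    ],
}

# Simulate a kernel swap with no change to code or outputs.
swapped = copy.deepcopy(notebook)
swapped["metadata"]["kernelspec"]["name"] = "python2"

before = granular_signatures(notebook)
after = granular_signatures(swapped)
changed = {scope: before[scope] != after[scope] for scope in before}
print(changed)  # {'integrity': True, 'outputs': False, 'execution': True}
```

With this split, a kernel swap invalidates the integrity and execution signatures but leaves the output signature alone, so displaying already-rendered outputs would not prompt the user again.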
Also linking to a previous discussion on trust in the Real Time Collaboration scenario, where the per-output/widget trust issue was discussed: https://github.com/jupyterlab/jupyterlab/pull/11494.
For me, most of these are either bugs or a conflation of "save" with "export". Currently the ipynb is becoming both a persistent store for application state and an exchange format.
IMHO the Jupyter server should store its state somewhere in whatever binary format it likes, even one that is incompatible between versions, and offer an option to "export", or potentially auto-export versions of the files as .md, .ipynb, or whatever you like.
This is how most applications with complex structure work today. Especially with RTC, which needs complex information, there is no reason to try to shove things into the ipynb.
@divyansshhh opened an issue in JupyterLab that is more appropriate for jupyter/nbformat, so this issue is meant to replace the original in JupyterLab.