jupyter / nbformat

Reference implementation of the Jupyter Notebook format
http://nbformat.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
265 stars 152 forks source link

Start separating normalisation from validation logic #282

Closed Carreau closed 2 years ago

Carreau commented 2 years ago

Re-issue of #236. #244 made things even worse with respect to some problem with notebook trust, where a notebook is trusted, saved, reopened but is not trusted.

The reason being that the signature is computed before validate, and validate mutate the notebook, leading to the signature not matching.

really validate() should not mutate the notebook ever. Validate is in part used for security. It should not return a value of give results on a modified object.

Just try to imagine if a password comparison function, said compare('HUNTER@', 'hunter2') == True because obviously the user had caps lock pressed on their keyboard ? This is the same.

We obviously can't change brutally, but we really need to stop having validation and normalisation be the same step.

Carreau commented 2 years ago

@ewjoachim, you expressed interest in Jupyter and security.

Carreau commented 2 years ago

Hey :) Thanks for the invitation to review :)

Thanks, I was not expecting an in-depth review, but as you pointed out your interest in security and this is problematic because of the signature, validate but mutate, save with mutated signature flow, I thought that would be of interest.

ewjoachim commented 2 years ago

I thought that would be of interest.

It is :) As discussed, I agree with you regarding this aspect of "security boundaries", that we expect some functions to do checks (especially security-related checks, including hashes of content to ensure non-mutability), others to make changes, and it can quickly become problematic when we try to do both at once.

(And I enjoyed every minute of doing this review, don't worry :) )

Carreau commented 2 years ago

Todo: https://github.com/jupyter/nbformat/pull/282#discussion_r903686719 need to be refactor later.

The xfail test, and isvalid() should be updated to pass, and isvalid() should not try to fix anything.

Carreau commented 2 years ago

Ok, let's get that in and see what we break.