swansonk14 / typed-argument-parser

Typed argument parser for Python
MIT License
507 stars 40 forks source link

Add documentation warning about pickled data #73

Closed Cnoor0171 closed 2 years ago

Cnoor0171 commented 2 years ago

TAP offers this great feature about saving and loading all arguments, along with reproducibility info into a json file. Complex python types are encoded and automatically decoded using the pickle module. From the docs,

Note: More complex types will be encoded in JSON as a pickle string.

However, the unpickling of untrusted data is a big security risk, since it is possible to construct malicious pickle data which will execute arbitrary code during unpickling. Novice python programmers might not know about the implications of using pickle, or people might assume only json parsing is taking place, and simply not notice the note about pickle.

I think it would be beneficial to include a prominent warning for users not to run args.load('args.json') on untrusted json files. The warning could either share a short explanation of why its a security risk, or just link to the pickle documentation.

Cnoor0171 commented 2 years ago

Alternatively, the load method could refuse to unpickle data by default unless a keyword argument data_is_trusted is set to true. This would make the warning a lot easier to notice. But it would be backwards incompatible, so perhaps that's overkill.

martinjm97 commented 2 years ago

Hi @Cnoor0171,

Thank you for identifying this security risk. Would you be satisfied by a change to the readme? If so, we're happy to look over a PR and we'll try to get to it soon.

Thanks again, Kyle and Jesse

Cnoor0171 commented 2 years ago

@martinjm97 I think a warning in the readme would suffice. Just something eye catching since the note about pickling kind of easy to miss and requires the user to already know the security risk about unpickling data. Something like this at the top of the readme section for load:

:exclamation: :warning: Never call args.load('args.json') on untrusted files. Argument loading uses the pickle module to decode complex types automatically. Unpickling of untrusted data is a security risk and can lead to arbitrary code execution. See the warning in the pickle docs :exclamation: :warning:

Feel free to merge https://github.com/swansonk14/typed-argument-parser/pull/82 if that sounds fine.

martinjm97 commented 2 years ago

Thank you for keeping a watchful eye out wrt security. Much appreciated!

--Jesse