Closed shaycrk closed 3 years ago
@shaycrk What is the cost of switching from full_load to yaml.load? What you said about representers and constructors makes sense to do to mark it as 'trusted' YAML, but I'm curious what state this PR is in. Does everything work? If so, what makes it worse than full_load?
Thanks @thcrock -- Everything should work here so long as we're fine with moving from full_load
to load
. As I understand it, load
is basically an alias for unsafe_load
and the main cost in making the move is just about security (specifically, that the format can be used to run arbitrary python code, but note that's also the case with full_load
in 5.3, which is why they're recommending the update). On the one hand, the risk seems very low with our use-case since we don't, say, deploy triage as a web-facing app that runs arbitrary user input, though on the other hand it's probably good practice to plug the potential hole eventually in case someone takes triage and uses it in ways we don't expect down the road.
The cost of keeping full_load
is somewhat more acute both in terms of writing the representers/constructors for pyYAML to recognize our custom objects and, probably more importantly, losing the ability to read older/existing matrices. I think given that (and what seems to be low risk), my inclination is towards switching to load
here, but definitely wanted to gut-check that before merging!
Sounds good -- I'll go ahead and merge and create an issue to consider coming back to the load
vs full_load
question later.
Updates PyYAML and mkdocs-material to satisfy safety-ci
However, note that for loading matrix metadata I'm switching from
yaml.full_load(fd)
toyaml.load(fd, Loader=yaml.Loader)
which isn't ideal. Based on the errors I was getting, it seems that in order to stick withfull_load
, we would need to define representers and constructors specifically for theAsOfTimeList
andFeatureNameList
(see the example here). Note that doing so would lose the ability to load matrix metadata from previous versions of triage (though a utility to upgrade old metadata files probably wouldn't be too difficult to write).@thcrock -- would appreciate your thoughts here.