Packj will sometimes generate report data structures that cannot be serialised by the Python standard library's default JSON encoder. More specifically, these data structures will sometimes contain:
- Python `set` instances.
- GitPython `Commit` instances.
In the case of set, there is no equivalent JSON data structure. I have opted for a common workaround: convert the set to a Python list, which the underlying JSON encoder will serialise as a JSON array. The difference between these two data structures is that JSON arrays are ordered whereas Python sets are not. I have opted to sort the intermediary Python list prior to it being serialised. This should at least help provide more deterministic ordering.
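For reviewers who want a concrete picture, here is a minimal sketch of the set-to-sorted-list workaround. The class name `SetEncoder` and the sample `report` dict are made up for illustration and are not taken from this PR's diff. The `repr`-based fallback for sets whose elements cannot be compared is also my own assumption, not something this PR necessarily does.

```python
import json


class SetEncoder(json.JSONEncoder):
    """Hypothetical encoder: serialises sets as sorted JSON arrays."""

    def default(self, o):
        if isinstance(o, (set, frozenset)):
            try:
                # Sorting gives a stable order across runs.
                return sorted(o)
            except TypeError:
                # Assumption: elements of mixed types are not orderable,
                # so fall back to ordering by their repr() strings.
                return sorted(o, key=repr)
        # Anything else is left to the default encoder, which raises TypeError.
        return super().default(o)


# Illustrative report fragment containing a set.
report = {"permissions": {"network", "filesystem", "process"}}
encoded = json.dumps(report, cls=SetEncoder)
print(encoded)  # {"permissions": ["filesystem", "network", "process"]}
```

Without the sort, the array order would follow set iteration order, which for strings can change between interpreter runs because of hash randomisation.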
In the case of GitPython Commit instances, I just threw together a Python dict-based serialisation containing a collection of important-looking attributes. This serialisation isn't actually being used anywhere, but it could be updated to contain other things in the future.
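As a rough idea of what a dict-based `Commit` serialisation could look like, here is a sketch. The helper name `commit_to_dict` and the choice of fields are assumptions for illustration; the attributes actually picked in this PR may differ. `hexsha`, `author.name`, `author.email`, `authored_datetime` and `summary` are attributes that GitPython `Commit` objects expose, but the example uses a `SimpleNamespace` stand-in so it runs without GitPython or a repository.

```python
import json
from datetime import datetime, timezone
from types import SimpleNamespace


def commit_to_dict(commit):
    """Hypothetical helper: pick a few JSON-friendly fields from a commit."""
    return {
        "hexsha": commit.hexsha,
        "author": {
            "name": commit.author.name,
            "email": commit.author.email,
        },
        # datetime objects are not JSON-serialisable, so use ISO 8601 text.
        "authored_datetime": commit.authored_datetime.isoformat(),
        "summary": commit.summary,
    }


# Stand-in object with the same attribute names as a GitPython Commit.
fake_commit = SimpleNamespace(
    hexsha="0" * 40,
    author=SimpleNamespace(name="Example Author", email="author@example.com"),
    authored_datetime=datetime(2023, 1, 1, tzinfo=timezone.utc),
    summary="Example commit",
)

encoded = json.dumps(commit_to_dict(fake_commit), sort_keys=True)
print(encoded)
```

In a custom encoder, a function like this would be called from `default()` when the object is a `Commit`, in the same way as the set case.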
I've also removed the conditional importing of the Python json module, as json has been a part of Python's standard library for quite some time now. This allowed me to define the custom JSON encoder at load-time instead of runtime, since the custom encoder subclasses the json module's default one.
Fixes #93