trailofbits / fickling

A Python pickling decompiler and static analyzer
GNU Lesser General Public License v3.0

Polyglot module improvements #93

Open suhacker1 opened 7 months ago

suhacker1 commented 7 months ago

To better account for different parser implementations and to make identification more robust, we could use PolyFile and call it from Fickling. In addition, we should make sure the module corresponds directly to the Netron consensus, since the current file format descriptions could be more granular. Specifically, Netron generates the version numbers in its file format names partially dynamically, whereas Fickling’s naming convention picks the minimum possible file format version (for instance, a file that Netron reports as TorchScript v1.6 may be reported as TorchScript v1.4 by Fickling). Any issues with the PyTorch file format versioning system should also be taken into account.

Current list:

PyTorch v0.1.1: Tar file with sys_info, pickle, storages, and tensors
PyTorch v0.1.10: Stacked pickle files
TorchScript v1.0: ZIP file with model.json
TorchScript v1.1: ZIP file with model.json and attributes.pkl (1 pickle file)
TorchScript v1.3: ZIP file with data.pkl and constants.pkl (2 pickle files)
TorchScript v1.4: ZIP file with data.pkl, constants.pkl, and version set at 2 or higher (2 pickle files)
PyTorch v1.3: ZIP file containing data.pkl (1 pickle file)
PyTorch model archive format [ZIP]: ZIP file that includes Python code files and pickle files
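
For discussion, here is a minimal sketch of how the container-level checks in the list above could be expressed. It is not Fickling's actual implementation: the function names `classify_zip_members` and `identify` are hypothetical, the version number is assumed to live in a ZIP member whose base name is `version`, and the "Python code files" check is approximated by looking for `.py` members. Stacked pickle files (PyTorch v0.1.10) are left out because detecting them requires parsing the pickle stream itself.

```python
import io
import tarfile
import zipfile


def classify_zip_members(names, version=None):
    """Map ZIP member names (hypothetical helper) to the labels in the list above."""
    base = {n.rsplit("/", 1)[-1] for n in names}
    if "model.json" in base:
        return "TorchScript v1.1" if "attributes.pkl" in base else "TorchScript v1.0"
    if "data.pkl" in base:
        if "constants.pkl" in base:
            # Only the threshold "2 or higher" is known, so this reports the
            # minimum possible version rather than the exact one.
            if version is not None and version >= 2:
                return "TorchScript v1.4"
            return "TorchScript v1.3"
        return "PyTorch v1.3"
    has_code = any(n.endswith(".py") for n in base)
    has_pickle = any(n.endswith((".pkl", ".pickle")) for n in base)
    if has_code and has_pickle:
        return "PyTorch model archive format [ZIP]"
    return None


def identify(data):
    """Return a format label for raw file bytes, or None if nothing matches."""
    buf = io.BytesIO(data)
    if zipfile.is_zipfile(buf):
        buf.seek(0)
        with zipfile.ZipFile(buf) as zf:
            names = zf.namelist()
            version = None
            for name in names:
                # Assumption: the version is a small integer stored as text
                # in a member called "version".
                if name.rsplit("/", 1)[-1] == "version":
                    try:
                        version = int(zf.read(name).decode("ascii", "ignore").strip())
                    except ValueError:
                        version = None
            return classify_zip_members(names, version)
    buf.seek(0)
    try:
        with tarfile.open(fileobj=buf) as tf:
            base = {n.rsplit("/", 1)[-1] for n in tf.getnames()}
    except tarfile.TarError:
        return None
    if {"sys_info", "pickle", "storages", "tensors"} <= base:
        return "PyTorch v0.1.1"
    return None
```

Because the checks only see member names and a version threshold, a TorchScript file that Netron would label v1.6 still comes back as "TorchScript v1.4" here, which is the granularity gap described above.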