protectai / modelscan

Protection against Model Serialization Attacks
http://modelscan.ai
Apache License 2.0

Issues handling zip files in ModelScan._iterate_models() #215

Open mmaitre314 opened 2 weeks ago

mmaitre314 commented 2 weeks ago

Describe the bug

I hit a couple of issues in ModelScan._iterate_models() while testing with zip files:

In the HF model https://huggingface.co/hugginglearners/fastai-style-transfer/tree/main, model.pkl is a zip file despite its .pkl extension. The code seems to intend to skip that file, since "supported_zip_extensions": [".zip", ".npz"] does not include .pkl, but the if statement joins the two checks with "and" instead of "or". A file is therefore skipped only when it is neither a zip by content nor has a supported extension, so model.pkl, with its unsupported extension, still gets scanned. Maybe this is a feature rather than a typo, in which case supported_zip_extensions should be removed? model.pkl is a zip file with an actual Pickle file inside it, so it is worth scanning.

        if (
            not _is_zipfile(file, model.get_stream())
            and Path(file).suffix
            not in self._settings["supported_zip_extensions"]
        ):
            continue
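
To make the difference concrete, here is a minimal standalone sketch of the two variants of that condition. It does not use ModelScan itself: skip_with_and, skip_with_or and make_zip_bytes are hypothetical helpers written for this illustration, and the standard library's zipfile.is_zipfile stands in for the project's _is_zipfile, whose exact behaviour is assumed here to be a content-based check.

```python
import io
import zipfile
from pathlib import Path

# Value quoted from the settings mentioned in this issue.
SUPPORTED_ZIP_EXTENSIONS = [".zip", ".npz"]


def skip_with_and(is_zip: bool, file: str) -> bool:
    """Condition as written today: skip only if BOTH checks fail."""
    return (not is_zip) and Path(file).suffix not in SUPPORTED_ZIP_EXTENSIONS


def skip_with_or(is_zip: bool, file: str) -> bool:
    """Hypothetical stricter variant: skip if EITHER check fails."""
    return (not is_zip) or Path(file).suffix not in SUPPORTED_ZIP_EXTENSIONS


def make_zip_bytes() -> bytes:
    """Build a tiny in-memory zip, standing in for a model.pkl that is really a zip."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("data.pkl", b"placeholder")
    return buf.getvalue()


# Detection is based on content, so the file name plays no role here.
is_zip = zipfile.is_zipfile(io.BytesIO(make_zip_bytes()))

for name in ("model.pkl", "weights.npz"):
    print(name, "and-skip:", skip_with_and(is_zip, name), "or-skip:", skip_with_or(is_zip, name))
```

With "and", a zip named model.pkl is not skipped and so reaches the scanner; with "or" it would be skipped because .pkl is not in the list. Which behaviour is wanted is exactly the open question above.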

Happy to send a PR if helpful.

To Reproduce Steps to reproduce the behavior:

Expected behavior

Scans complete.


seanpmorgan commented 1 week ago

Thanks for the find, @mmaitre314! If you could submit a PR, we'd be happy to review it!