meteofrance / py4cast

Weather forecasting with Deep Learning

Fix `torch.load` and `torch.save` CWE-502 #85

Open A669015 opened 1 month ago

A669015 commented 1 month ago

Bandit reports CWE-502 (deserialization of untrusted data) findings for the torch.load and torch.save calls.

For example:

>> Issue: [B614:pytorch_load_save] Use of unsafe PyTorch load or save
   Severity: Medium   Confidence: High
   CWE: CWE-502 (https://cwe.mitre.org/data/definitions/502.html)
   More Info: https://bandit.readthedocs.io/en/1.7.10/plugins/b614_pytorch_load_save.html
   Location: ./py4cast/datasets/dummy.py:297:8
296             buffer = BytesIO()
297             torch.save(d_stats, buffer)
298             buffer.seek(0)

--------------------------------------------------
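To illustrate why a pickle-based loader is classed as CWE-502, here is a minimal stdlib-only sketch. torch.save serializes objects with pickle, and unpickling can call whatever callable the file's author chose. The `Payload` class is hypothetical, and the harmless built-in `sorted` stands in for what could be any function.

```python
import pickle


class Payload:
    """Hypothetical object that controls what happens when it is unpickled."""

    def __reduce__(self):
        # pickle stores "call sorted([3, 1, 2])" instead of the object's state.
        # A malicious file could name any importable callable here.
        return (sorted, ([3, 1, 2],))


blob = pickle.dumps(Payload())

# Loading runs the stored call; the result is not a Payload at all.
result = pickle.loads(blob)
print(type(result).__name__, result)  # list [1, 2, 3]
```

The person loading the file never calls `sorted`; the call is encoded in the bytes and runs as a side effect of loading.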

The workaround described at https://bandit.readthedocs.io/en/1.7.10/plugins/b614_pytorch_load_save.html is to replace every unsafe use of torch.load and torch.save with safetensors.torch.load_file and safetensors.torch.save_file from Hugging Face (https://huggingface.co/docs/safetensors/en/api/torch).

Because safetensors.torch.load can only read data previously written by safetensors.torch.save, this would require regenerating every .pt file that py4cast loads.

For now, the Bandit error has been disabled in lint.sh by adding B614 to the list of exceptions.

A669015 commented 1 month ago

Note that the weights_only=True option of torch.load does not seem to be a good workaround, since Bandit still reports the CWE:

>> Issue: [B614:pytorch_load_save] Use of unsafe PyTorch load or save
   Severity: Medium   Confidence: High
   CWE: CWE-502 (https://cwe.mitre.org/data/definitions/502.html)
   More Info: https://bandit.readthedocs.io/en/1.7.10/plugins/b614_pytorch_load_save.html
   Location: ./py4cast/datasets/base.py:574:21
573         def __post_init__(self):
574             self.stats = torch.load(self.fname, "cpu", weights_only=True)
575

--------------------------------------------------
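For context, weights_only=True is meant to restrict unpickling to an allowlist of types, while the report above shows that Bandit 1.7.10 flags the call regardless of its arguments. The stdlib sketch below illustrates the allowlist idea only. It is an analogue, not PyTorch's implementation, and `AllowlistUnpickler`, `ALLOWED`, and `restricted_loads` are hypothetical names.

```python
import io
import pickle
from collections import OrderedDict

# Only these (module, name) pairs may be resolved during loading.
ALLOWED = {("collections", "OrderedDict")}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called whenever the pickle stream references a global.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def restricted_loads(data):
    return AllowlistUnpickler(io.BytesIO(data)).load()


safe_blob = pickle.dumps(OrderedDict(mean=0.0, std=1.0))
unsafe_blob = pickle.dumps(sorted)  # a reference to builtins.sorted

stats = restricted_loads(safe_blob)  # loads normally

try:
    restricted_loads(unsafe_blob)
    blocked = False
except pickle.UnpicklingError:
    blocked = True  # the non-allowlisted global was refused
```

A restricted loader narrows the attack surface but still parses a pickle stream, which is one reason a linter may keep flagging the call by name.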