securefederatedai / openfl

An Open Framework for Federated Learning.
https://openfl.readthedocs.io/en/latest/index.html
Apache License 2.0
734 stars · 207 forks

fix for coverity issue 656283: Unsafe deserialization. #1132

Closed · tanwarsh closed this 1 week ago

tanwarsh commented 2 weeks ago

While saving and loading models for different collaborators across rounds, the model was read from file without any integrity check after being saved in the previous round.

Changed the code to save the model with a Hash-Based Message Authentication Code (HMAC) and to verify its integrity when loading the model in the next round.
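A minimal sketch of this approach (function names, file layout, and key handling are illustrative, not the PR's actual code; in practice the key would need to be stored or derived securely per collaborator):

```python
import hashlib
import hmac
import pickle

# Hypothetical shared secret for illustration only.
KEY = b"shared-secret-key"

def save_model(model_obj, path):
    """Serialize the object and prepend an HMAC-SHA256 tag over the payload."""
    payload = pickle.dumps(model_obj)
    tag = hmac.new(KEY, payload, hashlib.sha256).digest()
    with open(path, "wb") as f:
        f.write(tag + payload)

def load_model(path):
    """Verify the HMAC tag before unpickling; refuse tampered files."""
    with open(path, "rb") as f:
        data = f.read()
    tag, payload = data[:32], data[32:]
    expected = hmac.new(KEY, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("Model file failed integrity check; refusing to deserialize.")
    return pickle.loads(payload)
```

`hmac.compare_digest` is used instead of `==` to avoid timing side channels; any modification of the serialized payload after saving causes `load_model` to raise before `pickle.loads` ever runs.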

https://coverity.devtools.intel.com/prod4/#/project-view/25300/11305?selectedIssue=656283

Steps to run the Privacy Meter example (https://openfl.readthedocs.io/en/latest/about/features_index/privacy_meter.html):

  1. Create a virtual env: python -m venv venv
  2. cd venv
  3. source bin/activate
  4. Install requirements from openfl/openfl-tutorials/experimental/workflow_interface_requirements.txt
  5. Install requirements from openfl/openfl-tutorials/experimental/Privacy_Meter/requirements_privacy_meter.txt
  6. Start the experiment: python cifar10_PM.py --audit_dataset_ratio 0.2 --test_dataset_ratio 0.4 --train_dataset_ratio 0.4 --signals loss logits gradient_norm --fpr_tolerance 0.1 0.2 0.3 --log_dir test_sgd --comm_round 30 --optimizer_type SGD --is_feature True --layer_number 10 --flow_internal_loop_test True
[Screenshot attached: experiment output, 2024-11-08 4:48:50 PM]
tanwarsh commented 2 weeks ago

Quoting @MasterSkepticista's review comment:

> @tanwarsh I am not in favor of solving this coverity issue given the context. This is tutorial code meant to guide users on how to use privacy meter in the context of OpenFL. Integrity checks and safe pickling are framework practices that should be abstracted away from the user.
>
> Coverity raises an issue because pure pickle/unpickle is not safe. It is not a goal of [this] tutorial. At best, a comment should be added in load and save definitions to say that simple deserialization is for demonstration purposes only, and in practice it is recommended to do integrity checks on any deserialized code.
>
> That said, if there are APIs provided by OpenFL which do not do integrity checks on deserialized code, that will be a good place to start.
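The comment-only alternative suggested here (keep plain pickle in the tutorial but warn readers) might look like this; the function names are hypothetical, not the tutorial's actual code:

```python
import pickle

def save_model(model_obj, path):
    # NOTE: plain pickle is used here for demonstration purposes only.
    with open(path, "wb") as f:
        pickle.dump(model_obj, f)

def load_model(path):
    # NOTE: demonstration only. pickle.load can execute arbitrary code, so
    # production code should verify integrity (e.g. with an HMAC) before
    # deserializing, and never unpickle untrusted files.
    with open(path, "rb") as f:
        return pickle.load(f)
```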

I agree with you @MasterSkepticista, but if this tutorial is not moved to open contrib and remains part of OpenFL, we will have to address it, even if it is just a tutorial.

MasterSkepticista commented 2 weeks ago

Can we not exclude openfl-workspace and openfl-tutorials from Coverity?

Alternatively, there could be ways to disable/skip coverity scans on blocks of code. I don't see value in getting a scan to pass at the expense of making a tutorial more sophisticated than it needs to be.

rahulga1 commented 2 weeks ago

@MasterSkepticista let me check this with our PSE.

teoparvanov commented 2 weeks ago

I second @MasterSkepticista's suggestion. At the least, I believe the priority should be on scanning the core OpenFL components and clearing them of Coverity issues. Workspaces could either be skipped or treated as P2 (non-production code). Let's see what approach the PSE suggests.

Until there's a decision, I suggest we freeze this (and similar) PRs addressing Coverity findings in example code and sample FL workspaces.

MasterSkepticista commented 1 week ago

Closing this PR until we have a decision.