securefederatedai / openfl

An open framework for Federated Learning.
https://openfl.readthedocs.io/en/latest/index.html
Apache License 2.0

Windows support in OpenFL tutorials #549

Open Maxime-Perret opened 1 year ago

Maxime-Perret commented 1 year ago

Hello,

I originally posted the text below as a discussion, but since I've just updated my post, I figured it now fits better as an issue about Windows support.

Discussed in https://github.com/intel/openfl/discussions/546

Originally posted by **Maxime-Perret** October 26, 2022

Hello,

I've been experimenting with OpenFL, trying to run different tutorials. Since Windows support was mentioned in #525: I am personally using Windows 11 Pro and ran into the following problems in the tutorials, which I believe are OS-related.

- For `DataLoader`, setting `num_workers` to anything larger than `0` causes an issue [as such](https://stackoverflow.com/questions/71713719/runtimeerror-dataloader-worker-pids-15876-2756-exited-unexpectedly). Thus, Windows users need `num_workers = 0`. As far as I know, within the [interactive api folder](https://github.com/intel/openfl/tree/develop/openfl-tutorials/interactive_api), this concerns the tutorials [Dogs vs Cats](https://github.com/intel/openfl/tree/develop/openfl-tutorials/interactive_api/PyTorch_DogsCats_ViT), [Histology](https://github.com/intel/openfl/tree/develop/openfl-tutorials/interactive_api/PyTorch_Histology), [Histology FedCurv](https://github.com/intel/openfl/tree/develop/openfl-tutorials/interactive_api/PyTorch_Histology_FedCurv), [Kvasir UNet](https://github.com/intel/openfl/tree/develop/openfl-tutorials/interactive_api/PyTorch_Kvasir_UNet), [Lightning MNIST GAN](https://github.com/intel/openfl/tree/develop/openfl-tutorials/interactive_api/PyTorch_Lightning_MNIST_GAN), [MVTec PatchSVDD](https://github.com/intel/openfl/tree/develop/openfl-tutorials/interactive_api/PyTorch_MVTec_PatchSVDD), [Market Re-ID](https://github.com/intel/openfl/tree/develop/openfl-tutorials/interactive_api/PyTorch_Market_Re-ID), [MedMNIST 2D](https://github.com/intel/openfl/tree/develop/openfl-tutorials/interactive_api/PyTorch_MedMNIST_2D), [MedMNIST 3D](https://github.com/intel/openfl/tree/develop/openfl-tutorials/interactive_api/PyTorch_MedMNIST_3D).
- Some of the arguments in the tutorial plans (in `workspace/plan/plan.yaml`) contain `!!` tags, for example `criterion: !!python/object:torch.nn.modules.loss.CrossEntropyLoss`. The plan is parsed in [`federated/plan/plan.py`](https://github.com/intel/openfl/blob/develop/openfl/federated/plan/plan.py), where line 42 uses `safe_load`. Combined with such a tag, this fails with [`could not determine a constructor for the tag`](https://death.andgravity.com/yaml-unknown-tag). Using the fix suggested [here](https://github.com/yaml/pyyaml/issues/266), I got it to work by replacing `safe_load(yaml_path.read_text())` with `load(yaml_path.read_text(), Loader=yaml.Loader)`.
- The `fx pki install` command downloads the binaries for `step` and `step_ca`. The download code in `openfl/component/ca/ca.py`, l.74, filters the available assets on GitHub by requiring `content_type` to be `application/gzip`. However, while the Linux assets are `.tar.gz`, the Windows assets are `.zip`, so their `content_type` is `application/zip`. This restriction makes it impossible to find assets for Windows. I tried forcing it to accept the `.zip` asset, but it still fails later on regardless.

Best

Edit: Updated my post after fixing some of my problems with a clean reinstall of Python+Anaconda

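
For readers hitting the `DataLoader` problem described above, a minimal sketch of one possible workaround follows. It is not code from the OpenFL tutorials: the helper name `dataloader_workers` is hypothetical, and it simply assumes, as reported in this post, that worker processes fail on Windows and that loading in the main process (`num_workers=0`) is acceptable there.

```python
import platform
from typing import Optional


def dataloader_workers(requested: int, system: Optional[str] = None) -> int:
    """Return a worker count for DataLoader that avoids worker processes on Windows.

    `system` defaults to the current OS name as reported by platform.system();
    it is a parameter only so the behaviour can be checked on any machine.
    """
    system = system or platform.system()
    # Assumption taken from the report above: num_workers > 0 fails on Windows,
    # so fall back to loading data in the main process there.
    return 0 if system == "Windows" else requested


# Intended use in a tutorial (requires PyTorch, so shown as a comment only):
#   DataLoader(dataset, batch_size=32, num_workers=dataloader_workers(8))
```

This keeps the faster multi-process loading on Linux while letting the same notebook run unchanged on Windows.
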
itrushkin commented 1 year ago

Hi @Maxime-Perret, thank you for providing feedback regarding the experience using OpenFL.

This will be fixed in #708.

I have run the `PyTorch_MedMNIST_2D` tutorial, which passes the criterion through the additional keyword arguments of the train task. There is no `criterion` section in `plan.yaml`. Could you please elaborate on the use case? Our project requires reading YAML files with `safe_load` because of the risk involved in loading a document from untrusted input.
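
One way to keep `safe_load` and still name a class in a plan is to store the class as a plain dotted-path string and resolve it against an allowlist at runtime. The sketch below is only an illustration of that idea, not OpenFL's plan format or API: `ALLOWED_CLASSES`, `resolve_class`, and the `criterion` key are hypothetical, and a standard-library class stands in for `torch.nn.modules.loss.CrossEntropyLoss` so that the example runs without PyTorch.

```python
import importlib

# Hypothetical allowlist; a real project would list the classes it trusts,
# e.g. "torch.nn.modules.loss.CrossEntropyLoss".
ALLOWED_CLASSES = {"fractions.Fraction", "collections.OrderedDict"}


def resolve_class(dotted_path: str):
    """Import and return the class named by dotted_path if it is allowlisted."""
    if dotted_path not in ALLOWED_CLASSES:
        raise ValueError(f"class not allowed: {dotted_path}")
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Stand-in for what safe_load would return for a plain-string plan entry such as
#   criterion: fractions.Fraction
plan = {"criterion": "fractions.Fraction"}

# The string is turned into an object only after passing the allowlist check.
criterion = resolve_class(plan["criterion"])(1, 2)
```

Because the YAML contains only strings, no `!!python/object` tag is needed, and the set of importable classes stays under the project's control.
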

> - The `fx pki install` command downloads the binaries for `step` and `step_ca`. The download code in `openfl/component/ca/ca.py`, l.74, filters the available assets on GitHub by requiring `content_type` to be `application/gzip`. However, while the Linux assets are `.tar.gz`, the Windows assets are `.zip`, so their `content_type` is `application/zip`. This restriction makes it impossible to find assets for Windows. I tried forcing it to accept the `.zip` asset, but it still fails later on regardless.

The step-ca download has been changed. I have personally tested it on both Linux and Windows systems. See the code.
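
For illustration only, the kind of OS-aware asset filtering discussed in this item could be sketched as follows. This is not the changed OpenFL code: `ARCHIVE_TYPES`, `pick_asset`, and the sample asset names are made up, and the dictionaries only mimic the `name` and `content_type` fields of GitHub release assets.

```python
import platform
from typing import Optional

# Content types accepted per OS. "application/zip" for Windows follows the
# report in this issue; other systems are assumed to use gzip archives.
ARCHIVE_TYPES = {
    "Windows": {"application/zip"},
    "Linux": {"application/gzip"},
}


def pick_asset(assets: list, name_part: str, system: Optional[str] = None) -> Optional[dict]:
    """Return the first asset whose name contains name_part and whose
    content type matches the archive format expected for the OS, else None."""
    system = system or platform.system()
    accepted = ARCHIVE_TYPES.get(system, {"application/gzip"})
    for asset in assets:
        if name_part in asset["name"] and asset["content_type"] in accepted:
            return asset
    return None


# Made-up sample data shaped like GitHub release assets.
assets = [
    {"name": "step_linux_amd64.tar.gz", "content_type": "application/gzip"},
    {"name": "step_windows_amd64.zip", "content_type": "application/zip"},
]
```

With this data, `pick_asset(assets, "windows", "Windows")` selects the `.zip` asset, whereas a filter that accepts only `application/gzip` would find nothing for Windows.
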

itrushkin commented 1 year ago

Since there has been no response in 2 weeks, I am treating this issue as stale. The related PR will be merged soon. If there are any additions or comments, please feel free to reopen.