Open Gresliebear opened 2 years ago
(venv) PS C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1> python .\FeTS_Challenge.py Creating Workspace Directories Creating Workspace Templates Requirement already satisfied: torchvision in c:\coderepos\moanisandbox\fets-ai\challenge\task_1\venv\lib\site-packages (from -r C:\Users\15702\.local\workspace/requirements.txt (line 1)) (0.9.2+cu111) Requirement already satisfied: torch in c:\coderepos\moanisandbox\fets-ai\challenge\task_1\venv\lib\site-packages (from -r C:\Users\15702\.local\workspace/requirements.txt (line 2)) (1.8.2+cu111) Requirement already satisfied: numpy in c:\coderepos\moanisandbox\fets-ai\challenge\task_1\venv\lib\site-packages (from torchvision->-r C:\Users\15702\.local\workspace/requirements.txt (line 1)) (1.21.0) Requirement already satisfied: pillow>=4.1.1 in c:\coderepos\moanisandbox\fets-ai\challenge\task_1\venv\lib\site-packages (from torchvision->-r C:\Users\15702\.local\workspace/requirements.txt (line 1)) (9.1.1) Requirement already satisfied: typing-extensions in c:\coderepos\moanisandbox\fets-ai\challenge\task_1\venv\lib\site-packages (from torch->-r C:\Users\15702\.local\workspace/requirements.txt (line 2)) (4.2.0) Successfully installed packages from C:\Users\15702\.local\workspace/requirements.txt. 
New workspace directory structure: workspace ├── .workspace ├── agg_to_col_one_signed_cert.zip ├── agg_to_col_two_signed_cert.zip ├── cert ├── checkpoint ├── data ├── gandlf_paths.csv ├── logs ├── output_validation │ └── 0 ├── partitioning_1.csv ├── partitioning_2.csv ├── plan │ ├── cols.yaml │ ├── data.yaml │ ├── defaults │ └── plan.yaml ├── raid │ └── datasets │ └── FeTS22 ├── requirements.txt ├── save │ └── fets_seg_test_init.pbuf ├── seg_test_train.csv ├── seg_test_val.csv ├── small_split.csv ├── src │ ├── challenge_assigner.py │ ├── fets_challenge_model.py │ ├── __init__.py │ └── __pycache__ │ ├── challenge_assigner.cpython-37.pyc │ ├── fets_challenge_model.cpython-37.pyc │ └── __init__.cpython-37.pyc └── validation.csv 13 directories, 22 files Setting Up Certificate Authority... 1. Create Root CA 1.1 Create Directories 1.2 Create Database 1.3 Create CA Request and Certificate 2. Create Signing Certificate 2.1 Create Directories 2.2 Create Database 2.3 Create Signing Certificate CSR 2.4 Sign Signing Certificate CSR 3 Create Certificate Chain Done. 
Creating AGGREGATOR certificate key pair with following settings: CN=openvessel.ptd.net, SAN=DNS:openvessel.ptd.net Writing AGGREGATOR certificate key pair to: C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\cert/server The CSR Hash for file server/agg_openvessel.ptd.net.csr = f713b37863866bd5a82473efd30b8e494ef0243b4470fae2ae40e7d75f5415475f38c91986391d95436bce024df14bf1 Signing AGGREGATOR certificate Creating COLLABORATOR certificate key pair with following settings: CN=one, SAN=DNS:one Moving COLLABORATOR certificate to: C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\cert/col_one The CSR Hash for file col_one.csr = 58fdc5a503366177f1556335d22295b6d598078341ad3b40ad7301c2cf3dac5252d8feea1f03bb7fa6077b2541562860 Signing COLLABORATOR certificate Registering odeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\cert\client\col_one in C:\Users\15702\.local\workspace\plan\cols.yaml Creating COLLABORATOR certificate key pair with following settings: CN=two, SAN=DNS:two Moving COLLABORATOR certificate to: C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\cert/col_two The CSR Hash for file col_two.csr = 374efb23a8b7af15d53eb824db7136e5996b418c38e9b65a12384788aff27fb0c5d59de2418784030bc3196d4342cf27 Signing COLLABORATOR certificate Registering odeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\cert\client\col_two in C:\Users\15702\.local\workspace\plan\cols.yaml C:\Users\15702\.local\workspace\gandlf_paths.csv No 'TrainOrVal' column found in split_subdirs csv, so performing automated split using percent_train of 0.8 [] [] [] [20:09:15] INFO Updating aggregator.settings.rounds_to_train to 5... native.py:102 INFO Updating aggregator.settings.db_store_rounds to 5... native.py:102 WARNING Did not find tasks.train.aggregation_type in config. Make sure it should exist. Creating... native.py:105 INFO Updating task_runner.settings.device to cpu... native.py:102 WARNING Did not find task_runner.settings.fets_config_dict.data_preprocessing in config. Make sure it should exist. 
Creating... native.py:105 WARNING Did not find task_runner.settings.fets_config_dict.ignore_label_validation in config. Make sure it should exist. Creating... native.py:105 INFO FL-Plan hash is 601cd0b67629af4d8ea0527f65b8a6613cc7d60f28d1a035e5167db87264c20e2fc1f2844d0df0c45d72ae1b29dcff48 plan.py:234 { "aggregator.settings.best_state_path": "save/fets_seg_test_best.pbuf", "aggregator.settings.db_store_rounds": 2, "aggregator.settings.init_state_path": "save/fets_seg_test_init.pbuf", "aggregator.settings.last_state_path": "save/fets_seg_test_last.pbuf", "aggregator.settings.rounds_to_train": 3, "aggregator.settings.write_logs": true, "aggregator.template": "openfl.component.Aggregator", "assigner.settings.training_tasks.0": "aggregated_model_validation", "assigner.settings.training_tasks.1": "train", "assigner.settings.training_tasks.2": "locally_tuned_model_validation", "assigner.settings.validation_tasks.0": "aggregated_model_validation", "assigner.template": "src.challenge_assigner.FeTSChallengeAssigner", "collaborator.settings.db_store_rounds": 1, "collaborator.settings.delta_updates": false, "collaborator.settings.opt_treatment": "RESET", "collaborator.template": "openfl.component.Collaborator", "compression_pipeline.settings": {}, "compression_pipeline.template": "openfl.pipelines.NoCompressionPipeline", "data_loader.settings.feature_shape.0": 32, "data_loader.settings.feature_shape.1": 32, "data_loader.settings.feature_shape.2": 32, "data_loader.template": "openfl.federated.data.loader_fets_challenge.FeTSChallengeDataLoaderWrapper", "network.settings.agg_addr": "openvessel.ptd.net", "network.settings.agg_port": 54937, "network.settings.cert_folder": "cert", "network.settings.client_reconnect_interval": 5, "network.settings.disable_client_auth": false, "network.settings.hash_salt": "auto", "network.settings.tls": true, "network.template": "openfl.federation.Network", "task_runner.settings.device": "cpu", "task_runner.settings.fets_config_dict.batch_size": 1, 
"task_runner.settings.fets_config_dict.data_augmentation": {}, "task_runner.settings.fets_config_dict.data_postprocessing": {}, "task_runner.settings.fets_config_dict.enable_padding": false, "task_runner.settings.fets_config_dict.in_memory": true, "task_runner.settings.fets_config_dict.inference_mechanism.grid_aggregator_overlap": "crop", "task_runner.settings.fets_config_dict.inference_mechanism.patch_overlap": 0, "task_runner.settings.fets_config_dict.learning_rate": 0.001, "task_runner.settings.fets_config_dict.loss_function": "dc", "task_runner.settings.fets_config_dict.medcam_enabled": false, "task_runner.settings.fets_config_dict.metrics.0": "dice", "task_runner.settings.fets_config_dict.metrics.1": "dice_per_label", "task_runner.settings.fets_config_dict.metrics.2": "hd95_per_label", "task_runner.settings.fets_config_dict.model.amp": true, "task_runner.settings.fets_config_dict.model.architecture": "resunet", "task_runner.settings.fets_config_dict.model.base_filters": 32, "task_runner.settings.fets_config_dict.model.class_list.0": 0, "task_runner.settings.fets_config_dict.model.class_list.1": 1, "task_runner.settings.fets_config_dict.model.class_list.2": 2, "task_runner.settings.fets_config_dict.model.class_list.3": 4, "task_runner.settings.fets_config_dict.model.dimension": 3, "task_runner.settings.fets_config_dict.model.final_layer": "softmax", "task_runner.settings.fets_config_dict.model.norm_type": "instance", "task_runner.settings.fets_config_dict.nested_training.testing": 1, "task_runner.settings.fets_config_dict.nested_training.validation": -5, "task_runner.settings.fets_config_dict.num_epochs": 1, "task_runner.settings.fets_config_dict.optimizer.type": "sgd", "task_runner.settings.fets_config_dict.output_dir": ".", "task_runner.settings.fets_config_dict.parallel_compute_command": "", "task_runner.settings.fets_config_dict.patch_sampler": "label", "task_runner.settings.fets_config_dict.patch_size.0": 64, 
"task_runner.settings.fets_config_dict.patch_size.1": 64, "task_runner.settings.fets_config_dict.patch_size.2": 64, "task_runner.settings.fets_config_dict.patience": 100, "task_runner.settings.fets_config_dict.pin_memory_dataloader": false, "task_runner.settings.fets_config_dict.print_rgb_label_warning": true, "task_runner.settings.fets_config_dict.q_max_length": 100, "task_runner.settings.fets_config_dict.q_num_workers": 0, "task_runner.settings.fets_config_dict.q_samples_per_volume": 40, "task_runner.settings.fets_config_dict.q_verbose": false, "task_runner.settings.fets_config_dict.save_output": false, "task_runner.settings.fets_config_dict.save_training": false, "task_runner.settings.fets_config_dict.scaling_factor": 1, "task_runner.settings.fets_config_dict.scheduler.type": "triangle_modified", "task_runner.settings.fets_config_dict.track_memory_usage": false, "task_runner.settings.fets_config_dict.verbose": false, "task_runner.settings.fets_config_dict.version.maximum": "0.0.14", "task_runner.settings.fets_config_dict.version.minimum": "0.0.14", "task_runner.settings.fets_config_dict.weighted_loss": true, "task_runner.settings.train_csv": "seg_test_train.csv", "task_runner.settings.val_csv": "seg_test_val.csv", "task_runner.template": "src.fets_challenge_model.FeTSChallengeModel", "tasks.aggregated_model_validation.function": "validate", "tasks.aggregated_model_validation.kwargs.apply": "global", "tasks.aggregated_model_validation.kwargs.metrics.0": "valid_loss", "tasks.aggregated_model_validation.kwargs.metrics.1": "valid_dice", "tasks.locally_tuned_model_validation.function": "validate", "tasks.locally_tuned_model_validation.kwargs.apply": "local", "tasks.locally_tuned_model_validation.kwargs.metrics.0": "valid_loss", "tasks.locally_tuned_model_validation.kwargs.metrics.1": "valid_dice", "tasks.settings": {}, "tasks.train.function": "train", "tasks.train.kwargs.epochs": 1, "tasks.train.kwargs.metrics.0": "loss", "tasks.train.kwargs.metrics.1": "train_dice" 
} INFO Building 🡆 Object FeTSChallengeDataLoaderWrapper from openfl.federated.data.loader_fets_challenge Module. plan.py:173 INFO Building 🡆 Object FeTSChallengeDataLoaderWrapper from openfl.federated.data.loader_fets_challenge Module. plan.py:173 INFO Building 🡆 Object FeTSChallengeDataLoaderWrapper from openfl.federated.data.loader_fets_challenge Module. plan.py:173 INFO Building 🡆 Object FeTSChallengeModel from src.fets_challenge_model Module. plan.py:173 Constructing queue for train data: 100%|████████████████████████████████████████████████████████████████████████████████| 3/3 [00:01<00:00, 2.10it/s] Calculating weights Constructing queue for penalty data: 100%|██████████████████████████████████████████████████████████████████████████████| 3/3 [00:01<00:00, 2.26it/s] Looping over training data for penalty calculation: 100%|███████████████████████████████████████████████████████████████| 3/3 [00:01<00:00, 1.77it/s] Constructing queue for validation data: 100%|███████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 2.38it/s] All Keys : ['subject_id', '2', 'spacing', '3', '4', '5', 'label', 'path_to_metadata'] Since Device is CPU, Mixed Precision Training is set to False [20:09:22] INFO Building 🡆 Object FeTSChallengeModel from src.fets_challenge_model Module. 
plan.py:173 Constructing queue for train data: 100%|████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.71it/s] Calculating weights Constructing queue for penalty data: 100%|██████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.66it/s] Looping over training data for penalty calculation: 100%|███████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.01it/s] Constructing queue for validation data: 100%|███████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 2.51it/s] All Keys : ['subject_id', '2', 'spacing', '3', '4', '5', 'label', 'path_to_metadata'] Since Device is CPU, Mixed Precision Training is set to False [20:09:25] INFO Building 🡆 Object FeTSChallengeModel from src.fets_challenge_model Module. plan.py:173 Constructing queue for train data: 100%|████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.65it/s] Calculating weights Constructing queue for penalty data: 100%|██████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.86it/s] Looping over training data for penalty calculation: 100%|███████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 2.10it/s] Constructing queue for validation data: 100%|███████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 2.80it/s] All Keys : ['subject_id', '2', 'spacing', '3', '4', '5', 'label', 'path_to_metadata'] Since Device is CPU, Mixed Precision Training is set to False Loading pretrained model... [20:09:28] INFO Building 🡆 Object NoCompressionPipeline from openfl.pipelines Module. plan.py:173 [20:09:29] INFO Creating aggregator... experiment.py:323 INFO Building 🡆 Object FeTSChallengeAssigner from src.challenge_assigner Module. plan.py:173 INFO Building 🡆 Object Aggregator from openfl.component Module. 
plan.py:173
INFO Creating collaborators... experiment.py:330
INFO Building 🡆 Object Collaborator from openfl.component Module. plan.py:173
INFO Building 🡆 Object Collaborator from openfl.component Module. plan.py:173
INFO Building 🡆 Object Collaborator from openfl.component Module. plan.py:173
INFO Starting experiment experiment.py:338
INFO experiment.py:366 Created experiment folder experiment_1...
INFO Collaborators chosen to train for round 0: experiment.py:403 ['1', '2', '3']
INFO Hyper-parameters for round 0: experiment.py:425
learning rate: 5e-05
epochs_per_round: 1
INFO Waiting for tasks... collaborator.py:178
INFO Sending tasks to collaborator 3 for round 0 aggregator.py:312
INFO Received the following tasks: ['aggregated_model_validation', 'train', 'locally_tuned_model_validation'] collaborator.py:168
[20:09:30] INFO Using TaskRunner subclassing API collaborator.py:253
********************
Starting validation :
********************
Looping over validation data: 0%| | 0/1 [00:02<?, ?it/s]
Traceback (most recent call last):
  File ".\FeTS_Challenge.py", line 584, in <module>
    restore_from_checkpoint_folder = restore_from_checkpoint_folder)
  File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\fets_challenge\experiment.py", line 468, in run_challenge_experiment
    collaborators[col].run_simulation()
  File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\openfl\component\collaborator\collaborator.py", line 170, in run_simulation
    self.do_task(task, round_number)
  File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\openfl\component\collaborator\collaborator.py", line 259, in do_task
    **kwargs)
  File "C:\Users\15702\.local\workspace\src\fets_challenge_model.py", line 48, in validate
    mode="validation")
  File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\GANDLF\compute\forward_pass.py", line 276, in validate_network
    result = step(model, image, label, params, train=True)
  File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\GANDLF\compute\step.py", line 88, in step
    loss, metric_output = get_loss_and_metrics(image, label, output, params)
  File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\GANDLF\compute\loss_and_metric.py", line 141, in get_loss_and_metrics
    metric_function, predicted, ground_truth, params
  File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\GANDLF\compute\loss_and_metric.py", line 13, in get_metric_output
    metric_output = metric_function(predicted, ground_truth, params).detach().cpu()
  File "C:\CodeRepos\MoaniSandbox\FETS-AI\Challenge\Task_1\venv\lib\site-packages\GANDLF\metrics\segmentation.py", line 42, in multi_class_dice
    if i != params["model"]["ignore_label_validation"]:
KeyError: 'ignore_label_validation'
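Solution: override the plan.yaml settings in the overrides dictionary, with model.ignore_label_validation set to False. The traceback shows GaNDLF reading the key from params["model"], so it has to sit under fets_config_dict.model rather than directly under fets_config_dict.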
overrides = {
    'aggregator.settings.rounds_to_train': rounds_to_train,
    'aggregator.settings.db_store_rounds': db_store_rounds,
    'tasks.train.aggregation_type': aggregation_wrapper,
    'task_runner.settings.device': device,
    'task_runner.settings.fets_config_dict.data_preprocessing': {},
    'task_runner.settings.fets_config_dict.model.ignore_label_validation': False
}
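To see why the position of the key matters, here is a minimal sketch. It assumes the dotted override keys are expanded into nested dictionaries before GaNDLF reads them. `expand_dotted` and `lookup_like_gandlf` are hypothetical helpers written for illustration and are not part of OpenFL or GaNDLF. Only the final lookup, `params["model"]["ignore_label_validation"]`, is taken from the traceback. The "misplaced" key mirrors the warning in the log, which mentions `fets_config_dict.ignore_label_validation` without the `model` level.

```python
from typing import Any, Dict


def expand_dotted(overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Expand 'a.b.c': value entries into nested dicts (hypothetical helper)."""
    nested: Dict[str, Any] = {}
    for dotted_key, value in overrides.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


def lookup_like_gandlf(overrides: Dict[str, Any]) -> Any:
    """Repeat the lookup from the traceback on the expanded config."""
    params = expand_dotted(overrides)["task_runner"]["settings"]["fets_config_dict"]
    return params["model"]["ignore_label_validation"]


# Stand-in for a model setting that already exists in the plan.
BASE = {"task_runner.settings.fets_config_dict.model.architecture": "resunet"}

# Key placed directly under fets_config_dict: the lookup under "model" fails.
misplaced = {**BASE, "task_runner.settings.fets_config_dict.ignore_label_validation": False}

# Key placed under fets_config_dict.model, as in the fix above.
fixed = {**BASE, "task_runner.settings.fets_config_dict.model.ignore_label_validation": False}

if __name__ == "__main__":
    try:
        lookup_like_gandlf(misplaced)
    except KeyError as err:
        print("misplaced key -> KeyError:", err)
    print("fixed key ->", lookup_like_gandlf(fixed))
```

With the misplaced key the sketch raises the same KeyError as the traceback, and with the key under model the lookup returns False.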