Project-MONAI / model-zoo

MONAI Model Zoo that hosts models in the MONAI Bundle format.
Apache License 2.0

Unable to train hovernet on windows machine #438

Closed rajeshtims closed 1 year ago

rajeshtims commented 1 year ago

I am trying to train hovernet_nuclei in a conda environment on Windows 10 with Python 3.9. I can train other models fine, but not hovernet.

I tried turning off the multigpu checkbox, but it didn't help. Even with the multigpu option turned off, the INFO message still shows 'force_multi_gpu': True: [2023-05-25 21:58:35,016] [26476] [MainThread] [INFO] (monailabel.tasks.train.bundle:210) - Train Request: {'model': 'hovernet_nuclei', 'multi_gpu': False, 'gpus': 'all', 'bundle_path': 'C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification', 'local_rank': 0, 'force_multi_gpu': True}
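The mismatch in that request (multi_gpu is False while force_multi_gpu is True) can be checked mechanically on later runs. The following is a minimal sketch, not part of MONAI Label: the helper name multi_gpu_flags and the regular expression are made up for illustration, and it assumes the request is logged as a Python dict literal after the text "Train Request:".

```python
import ast
import re

# Hypothetical helper, not a MONAI Label API: find the dict that follows
# "Train Request:" in a log line and return the multi-GPU related flags.
REQUEST_RE = re.compile(r"Train Request: (\{.*\})")


def multi_gpu_flags(log_line):
    match = REQUEST_RE.search(log_line)
    if match is None:
        return None
    # Windows paths in the log contain single backslashes (for example "\U"),
    # which are not valid Python string escapes, so double them before parsing.
    text = match.group(1).replace("\\", "\\\\")
    request = ast.literal_eval(text)
    return {key: request.get(key) for key in ("multi_gpu", "gpus", "force_multi_gpu")}


line = (
    "(monailabel.tasks.train.bundle:210) - Train Request: "
    "{'model': 'hovernet_nuclei', 'multi_gpu': False, 'gpus': 'all', "
    "'local_rank': 0, 'force_multi_gpu': True}"
)
print(multi_gpu_flags(line))
```

If force_multi_gpu comes back True while multi_gpu is False, the checkbox in the client is probably not the setting that decides how training is launched.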

Fetching annotations works, and saving them works too. Patches are extracted when I start training, but as soon as extraction finishes I get a "failed to create process" message. The process returns code 0 and the log just says training finished.
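Since the message appears right after patch extraction, one thing worth ruling out is whether a child process can be started from this environment at all. The sketch below is a generic check built on the Python standard library; probe_launch is a hypothetical helper, and the idea that the trainer starts a separate launcher at this point is an assumption, not something the log confirms.

```python
import shutil
import subprocess
import sys


def probe_launch(command):
    """Try to start `command` and report what happened (hypothetical helper)."""
    # Where the first element of the command resolves on PATH, or None.
    resolved = shutil.which(command[0])
    try:
        result = subprocess.run(command, capture_output=True, text=True)
    except OSError as exc:
        # Raised when the operating system cannot create the process at all.
        return {"resolved": resolved, "returncode": None, "output": str(exc)}
    return {
        "resolved": resolved,
        "returncode": result.returncode,
        "output": result.stdout + result.stderr,
    }


# Baseline: the interpreter of the active environment should be able to spawn itself.
print(probe_launch([sys.executable, "-c", "print('child started')"]))
```

The same function could be pointed at whatever launcher the trainer is suspected of calling, for example probe_launch(["torchrun", "--help"]), if torchrun is indeed what gets invoked here (again an assumption).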

I am using monailabel 0.7.0rc7. I tried recreating the environment and redownloading the sample app. No luck so far.

Here is the entire error message:

(monailabel) C:\Users\raj>monailabel start_server --app C:/Users/raj/apps/pathology --studies C:/Users/raj/datasets/ --conf models hovernet_nuclei --limit_concurrency 10 Using PYTHONPATH=C:\Users\raj\Anaconda3\envs;C:\Users\raj\Anaconda3\envs;C:\Users\raj\Anaconda3 "" [2023-05-25 13:06:54,299] [11928] [MainThread] [INFO] (main:285) - USING:: version = False [2023-05-25 13:06:54,299] [11928] [MainThread] [INFO] (main:285) - USING:: app = C:\Users\raj\apps\pathology [2023-05-25 13:06:54,300] [11928] [MainThread] [INFO] (main:285) - USING:: studies = C:\Users\raj\datasets [2023-05-25 13:06:54,302] [11928] [MainThread] [INFO] (main:285) - USING:: verbose = INFO [2023-05-25 13:06:54,302] [11928] [MainThread] [INFO] (main:285) - USING:: conf = [['models', 'hovernet_nuclei']] [2023-05-25 13:06:54,302] [11928] [MainThread] [INFO] (main:285) - USING:: host = 0.0.0.0 [2023-05-25 13:06:54,303] [11928] [MainThread] [INFO] (main:285) - USING:: port = 8000 [2023-05-25 13:06:54,303] [11928] [MainThread] [INFO] (main:285) - USING:: uvicorn_app = monailabel.app:app [2023-05-25 13:06:54,303] [11928] [MainThread] [INFO] (main:285) - USING:: ssl_keyfile = None [2023-05-25 13:06:54,304] [11928] [MainThread] [INFO] (main:285) - USING:: ssl_certfile = None [2023-05-25 13:06:54,304] [11928] [MainThread] [INFO] (main:285) - USING:: ssl_keyfile_password = None [2023-05-25 13:06:54,304] [11928] [MainThread] [INFO] (main:285) - USING:: ssl_ca_certs = None [2023-05-25 13:06:54,305] [11928] [MainThread] [INFO] (main:285) - USING:: workers = None [2023-05-25 13:06:54,305] [11928] [MainThread] [INFO] (main:285) - USING:: limit_concurrency = 10 [2023-05-25 13:06:54,305] [11928] [MainThread] [INFO] (main:285) - USING:: access_log = False [2023-05-25 13:06:54,306] [11928] [MainThread] [INFO] (main:285) - USING:: root_path = / [2023-05-25 13:06:54,306] [11928] [MainThread] [INFO] (main:285) - USING:: log_level = info [2023-05-25 13:06:54,306] [11928] [MainThread] [INFO] (main:285) - USING:: 
log_config = None [2023-05-25 13:06:54,307] [11928] [MainThread] [INFO] (main:285) - USING:: dryrun = False [2023-05-25 13:06:54,307] [11928] [MainThread] [INFO] (main:285) - USING:: action = start_server [2023-05-25 13:06:54,307] [11928] [MainThread] [INFO] (main:296) - Allow Origins: ['*'] [2023-05-25 13:06:55,237] [11928] [MainThread] [INFO] (uvicorn.error:74) - Started server process [11928] [2023-05-25 13:06:55,238] [11928] [MainThread] [INFO] (uvicorn.error:48) - Waiting for application startup. [2023-05-25 13:06:55,239] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.app:37) - Initializing App from: C:\Users\raj\apps\pathology; studies: C:\Users\raj\datasets; conf: {'models': 'hovernet_nuclei'} [2023-05-25 13:06:55,318] [11928] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for MONAILabelApp Found: <class 'main.MyApp'> [2023-05-25 13:06:55,342] [11928] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.classification_nuclei.ClassificationNuclei'> [2023-05-25 13:06:55,345] [11928] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.hovernet_nuclei.HovernetNuclei'> [2023-05-25 13:06:55,347] [11928] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.nuclick.NuClick'> [2023-05-25 13:06:55,349] [11928] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.segmentation_nuclei.SegmentationNuclei'> [2023-05-25 13:06:55,350] [11928] [MainThread] [INFO] (main:83) - +++ Adding Model: hovernet_nuclei => lib.configs.hovernet_nuclei.HovernetNuclei 2023-05-25 13:06:55,350 - INFO - --- input summary of monai.bundle.scripts.download --- 2023-05-25 13:06:55,351 - INFO - > name: 'pathology_nuclei_segmentation_classification' 2023-05-25 13:06:55,351 - INFO - > version: '0.1.8' 2023-05-25 13:06:55,352 
- INFO - > bundle_dir: 'C:\Users\raj\apps\pathology\model' 2023-05-25 13:06:55,352 - INFO - > source: 'github' 2023-05-25 13:06:55,354 - INFO - > removeprefix: 'monai' 2023-05-25 13:06:55,354 - INFO - > progress: True 2023-05-25 13:06:55,356 - INFO - ---

pathology_nuclei_segmentation_classification_v0.1.8.zip: 267MB [00:04, 57.9MB/s] 2023-05-25 13:07:00,206 - INFO - Downloaded: C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification_v0.1.8.zip 2023-05-25 13:07:00,206 - INFO - Expected md5 is None, skip md5 check for file C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification_v0.1.8.zip. 2023-05-25 13:07:00,207 - INFO - Writing into directory: C:\Users\raj\apps\pathology\model. [2023-05-25 13:07:01,748] [11928] [MainThread] [INFO] (main:87) - +++ Using Models: ['hovernet_nuclei'] [2023-05-25 13:07:01,748] [11928] [MainThread] [INFO] (monailabel.interfaces.app:135) - Init Datastore for: C:\Users\raj\datasets [2023-05-25 13:07:01,750] [11928] [MainThread] [INFO] (monailabel.datastore.local:130) - Auto Reload: True; Extensions: ['.nii.gz', '.nii', '.nrrd', '.jpg', '.png', '.tif', '.svs', '.xml'] [2023-05-25 13:07:01,766] [11928] [MainThread] [INFO] (monailabel.datastore.local:577) - Invalidate count: 0 [2023-05-25 13:07:01,766] [11928] [MainThread] [INFO] (monailabel.datastore.local:151) - Start observing external modifications on datastore (AUTO RELOAD) [2023-05-25 13:07:02,293] [11928] [MainThread] [INFO] (main:129) - +++ Adding Inferer:: hovernet_nuclei => <lib.infers.hovernet_nuclei.HovernetNuclei object at 0x000000003BFC4B50> [2023-05-25 13:07:02,300] [11928] [MainThread] [INFO] (main:153) - +++ Adding Trainer:: hovernet_nuclei => <lib.trainers.hovernet_nuclei.HovernetNuclei object at 0x000000003BFB8730> [2023-05-25 13:07:02,300] [11928] [MainThread] [INFO] (main:174) - Active Learning Strategies:: ['wsi_random'] [2023-05-25 13:07:02,301] [11928] [MainThread] [INFO] (monailabel.utils.sessions:51) - Session Path: C:\Users\raj.cache\monailabel\sessions [2023-05-25 13:07:02,301] [11928] [MainThread] [INFO] (monailabel.utils.sessions:52) - Session Expiry (max): 3600 [2023-05-25 13:07:02,302] [11928] [MainThread] [INFO] (monailabel.interfaces.app:469) - App Init - 
completed [2023-05-25 13:07:02,302] [timeloop] [INFO] Starting Timeloop.. [2023-05-25 13:07:02,302] [11928] [MainThread] [INFO] (timeloop:60) - Starting Timeloop.. [2023-05-25 13:07:02,303] [timeloop] [INFO] Registered job <function MONAILabelApp.on_init_complete..run_scheduler at 0x000000003BF97AF0> [2023-05-25 13:07:02,303] [11928] [MainThread] [INFO] (timeloop:42) - Registered job <function MONAILabelApp.on_init_complete..run_scheduler at 0x000000003BF97AF0> [2023-05-25 13:07:02,305] [timeloop] [INFO] Timeloop now started. Jobs will run based on the interval set [2023-05-25 13:07:02,305] [11928] [MainThread] [INFO] (timeloop:63) - Timeloop now started. Jobs will run based on the interval set [2023-05-25 13:07:02,307] [11928] [MainThread] [INFO] (uvicorn.error:62) - Application startup complete. [2023-05-25 13:07:02,308] [11928] [MainThread] [INFO] (uvicorn.error:217) - Uvicorn running on http://0.0.0.0:8000/ (Press CTRL+C to quit) [2023-05-25 13:13:21,763] [11928] [MainThread] [INFO] (monailabel.endpoints.activelearning:44) - Active Learning Request: {'strategy': 'wsi_random', 'image': '', 'patch_size': [1024, 1024], 'image_size': [0, 0]} [2023-05-25 13:13:21,764] [11928] [MainThread] [INFO] (monailabel.tasks.activelearning.random:47) - Random: Selected Image: JP2K-33003-1; Weight: 809 [2023-05-25 13:13:21,768] [11928] [MainThread] [INFO] (monailabel.endpoints.activelearning:60) - Next sample: {'id': 'JP2K-33003-1', 'weight': 809, 'path': 'C:\Users\raj\datasets\JP2K-33003-1.svs', 'ts': 1685033742, 'name': 'JP2K-33003-1.svs', 'strategy': {'wsi_random': {'ts': 1685045601, 'client_id': 'admin'}}} [2023-05-25 13:14:22,399] [11928] [MainThread] [INFO] (monailabel.endpoints.wsi_infer:109) - WSI Infer Request: {'model': 'hovernet_nuclei', 'image': 'JP2K-33003-1', 'output': 'asap', 'level': 0, 'location': [2340, 5175], 'size': [119, 118], 'tile_size': [1024, 1024], 'min_poly_area': 30, 'foreground': [], 'background': [], 'max_workers': 1} [2023-05-25 13:14:22,420] 
[11928] [MainThread] [INFO] (monailabel.tasks.infer.basic_infer:277) - Infer Request (final): {'device': 'cuda:0', 'model_filename': ['model.pt'], 'label_colors': {'Other': (255, 0, 0), 'Inflammatory': (255, 255, 0), 'Epithelial': (0, 0, 255), 'Spindle-Shaped': (0, 255, 0)}, 'model': 'hovernet_nuclei', 'image': 'C:\Users\raj\datasets\JP2K-33003-1.svs', 'output': 'asap', 'level': 0, 'location': (2340, 5175), 'size': (119, 118), 'tile_size': [1024, 1024], 'min_poly_area': 30, 'foreground': [], 'background': [], 'max_workers': 1, 'id': 0, 'logging': 'INFO', 'result_write_to_file': False, 'description': 'A simultaneous segmentation and classification of nuclei within multitissue histology images based on CoNSeP data', 'save_label': False} monai.transforms.io.dictionary LoadImaged.init:image_only: Current default value of argument image_only=False has been deprecated since version 1.1. It will be changed to image_only=True in version 1.3. [2023-05-25 13:14:22,430] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.transform:76) - PRE - Run Transform(s) [2023-05-25 13:14:22,430] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.transform:77) - PRE - Input Keys: ['device', 'model_filename', 'label_colors', 'model', 'image', 'output', 'level', 'location', 'size', 'tile_size', 'min_poly_area', 'foreground', 'background', 'max_workers', 'id', 'logging', 'result_write_to_file', 'description', 'save_label', 'image_path'] [2023-05-25 13:14:22,564] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.transform:122) - PRE - Transform (LoadImagePatchd): Time: 0.1334; image: torch.Size([119, 118, 3])(torch.uint8) [2023-05-25 13:14:22,565] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.transform:122) - PRE - Transform (EnsureChannelFirstd): Time: 0.0; image: torch.Size([3, 119, 118])(torch.uint8) [2023-05-25 13:14:23,736] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.transform:122) - PRE - Transform (CastToTyped): Time: 1.1701; image: 
torch.Size([3, 119, 118])(torch.float32) [2023-05-25 13:14:23,737] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.transform:122) - PRE - Transform (ScaleIntensityRanged): Time: 0.001; image: torch.Size([3, 119, 118])(torch.float32) [2023-05-25 13:14:23,740] [11928] [MainThread] [INFO] (monailabel.tasks.infer.basic_infer:480) - Inferer:: cuda:0 => SlidingWindowHoVerNetInferer => {'roi_size': 256, 'sw_batch_size': 16, 'overlap': 0.359375, 'mode': constant, 'sigma_scale': 0.125, 'padding_mode': 'constant', 'cval': 0, 'sw_device': None, 'device': None, 'progress': True, 'cpu_thresh': None, 'buffer_steps': None, 'buffer_dim': -1, 'roi_weight_map': None, 'extra_input_padding': (46, 46, 46, 46)} [2023-05-25 13:14:23,740] [11928] [MainThread] [INFO] (monailabel.tasks.infer.basic_infer:418) - Infer model path: C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification\models\model.pt [2023-05-25 13:14:23,741] [11928] [MainThread] [INFO] (monailabel.tasks.infer.basic_infer:426) - Using provided model_file: C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification\models\model.pt 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00, 1.58s/it] monai.transforms.io.dictionary SaveImaged.init:resample: Current default value of argument resample=True has been deprecated since version 1.1. It will be changed to resample=False in version 1.3. 
[2023-05-25 13:14:25,780] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.transform:76) - POST - Run Transform(s) [2023-05-25 13:14:25,780] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.transform:77) - POST - Input Keys: ['device', 'model_filename', 'label_colors', 'model', 'image', 'output', 'level', 'location', 'size', 'tile_size', 'min_poly_area', 'foreground', 'background', 'max_workers', 'id', 'logging', 'result_write_to_file', 'description', 'save_label', 'image_path', 'image_meta_dict', 'latencies', 'pred'] [2023-05-25 13:14:25,781] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.transform:122) - POST - Transform (FlattenSubKeysd): Time: 0.0; image: torch.Size([3, 119, 118])(torch.float32); horizontal_vertical: torch.Size([2, 119, 118])(torch.float32); nucleus_prediction: torch.Size([2, 119, 118])(torch.float32); type_prediction: torch.Size([5, 119, 118])(torch.float32) [2023-05-25 13:14:25,811] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.transform:122) - POST - Transform (HoVerNetInstanceMapPostProcessingd): Time: 0.0289; image: torch.Size([3, 119, 118])(torch.float32); horizontal_vertical: torch.Size([2, 119, 118])(torch.float32); nucleus_prediction: torch.Size([2, 119, 118])(torch.float32); type_prediction: torch.Size([5, 119, 118])(torch.float32) [2023-05-25 13:14:25,824] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.transform:122) - POST - Transform (HoVerNetNuclearTypePostProcessingd): Time: 0.011; image: torch.Size([3, 119, 118])(torch.float32); horizontal_vertical: torch.Size([2, 119, 118])(torch.float32); nucleus_prediction: torch.Size([2, 119, 118])(torch.float32); type_prediction: torch.Size([5, 119, 118])(torch.float32) [2023-05-25 13:14:25,824] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.transform:122) - POST - Transform (RenameKeyd): Time: 0.0; image: torch.Size([3, 119, 118])(torch.float32); pred: torch.Size([1, 119, 118])(torch.int64); horizontal_vertical: torch.Size([2, 119, 
118])(torch.float32); nucleus_prediction: torch.Size([2, 119, 118])(torch.float32); type_prediction: torch.Size([5, 119, 118])(torch.float32) [2023-05-25 13:14:25,825] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.transform:122) - POST - Transform (SqueezeDimd): Time: 0.0; image: torch.Size([3, 119, 118])(torch.float32); pred: torch.Size([119, 118])(torch.int64); horizontal_vertical: torch.Size([2, 119, 118])(torch.float32); nucleus_prediction: torch.Size([2, 119, 118])(torch.float32); type_prediction: torch.Size([5, 119, 118])(torch.float32) Any labeled images will be returned as a boolean array. Did you mean to use a boolean array? Only one label was provided to remove_small_objects. Did you mean to use a boolean array? [2023-05-25 13:14:25,828] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.transform:122) - POST - Transform (PostFilterLabeld): Time: 0.002; image: torch.Size([3, 119, 118])(torch.float32); pred: (119, 118)(int64); horizontal_vertical: torch.Size([2, 119, 118])(torch.float32); nucleus_prediction: torch.Size([2, 119, 118])(torch.float32); type_prediction: torch.Size([5, 119, 118])(torch.float32) [2023-05-25 13:14:25,830] [11928] [MainThread] [INFO] (monailabel.interfaces.utils.transform:122) - POST - Transform (FindContoursd): Time: 0.001; image: torch.Size([3, 119, 118])(torch.float32); pred: (119, 118)(int64); horizontal_vertical: torch.Size([2, 119, 118])(torch.float32); nucleus_prediction: torch.Size([2, 119, 118])(torch.float32); type_prediction: torch.Size([5, 119, 118])(torch.float32) [2023-05-25 13:14:25,831] [11928] [MainThread] [INFO] (monailabel.transform.writer:291) - +++ Output Type: asap [2023-05-25 13:14:25,832] [11928] [MainThread] [INFO] (monailabel.tasks.infer.basic_infer:332) - ++ Latencies => Total: 3.4133; Pre: 1.3175; Inferer: 2.0330; Invert: 0.0000; Post: 0.0607; Write: 0.0010 [2023-05-25 13:14:25,832] [11928] [MainThread] [INFO] (monailabel.tasks.infer.basic_infer:357) - Result Json Keys: ['annotation', 
'label_names', 'latencies'] [2023-05-25 13:14:25,833] [11928] [MainThread] [INFO] (monailabel.interfaces.app:692) - JP2K-33003-1 => 0 => cuda:0 => 1 / 1; Latencies: {'pre': 1.32, 'infer': 2.03, 'invert': 0.0, 'post': 0.06, 'write': 0.0, 'total': 3.41, 'transform': {'pre': {'LoadImagePatchd': 0.1334, 'EnsureChannelFirstd': 0.0, 'CastToTyped': 1.1701, 'ScaleIntensityRanged': 0.001}, 'post': {'FlattenSubKeysd': 0.0, 'HoVerNetInstanceMapPostProcessingd': 0.0289, 'HoVerNetNuclearTypePostProcessingd': 0.011, 'RenameKeyd': 0.0, 'SqueezeDimd': 0.0, 'PostFilterLabeld': 0.002, 'FindContoursd': 0.001}}} [2023-05-25 13:14:25,834] [11928] [MainThread] [INFO] (monailabel.interfaces.app:721) - +++ Generating ASAP XML Annotation [2023-05-25 13:14:25,835] [11928] [MainThread] [INFO] (monailabel.utils.others.pathology:125) - Annotation keys: dict_keys(['location', 'size', 'elements', 'labels']) [2023-05-25 13:14:25,835] [11928] [MainThread] [INFO] (monailabel.utils.others.pathology:132) - Adding Contours for label: Inflammatory; color: #ffff00; color_map: {'Inflammatory': (255, 255, 0), 'Spindle-Shaped': (0, 255, 0)} [2023-05-25 13:14:25,836] [11928] [MainThread] [INFO] (monailabel.utils.others.pathology:132) - Adding Contours for label: Spindle-Shaped; color: #00ff00; color_map: {'Inflammatory': (255, 255, 0), 'Spindle-Shaped': (0, 255, 0)} [2023-05-25 13:14:25,837] [11928] [MainThread] [INFO] (monailabel.utils.others.pathology:154) - Total Annotations: 7 [2023-05-25 13:15:38,438] [11928] [MainThread] [INFO] (monailabel.endpoints.datastore:68) - Image: JP2K-33003-1-patch-2340_5175_119_118; File: <starlette.datastructures.UploadFile object at 0x00000000468810A0>; params: {} [2023-05-25 13:15:38,439] [11928] [MainThread] [INFO] (monailabel.datastore.local:439) - Adding Image: JP2K-33003-1-patch-2340_5175_119_118 => C:\Users\raj\AppData\Local\Temp\tmpixdrlydf.png [2023-05-25 13:15:38,477] [11928] [MainThread] [INFO] (monailabel.endpoints.datastore:101) - Saving Label for 
JP2K-33003-1-patch-2340_5175_119_118 for tag: final by admin [2023-05-25 13:15:38,478] [11928] [MainThread] [INFO] (monailabel.endpoints.datastore:112) - Save Label params: {} [2023-05-25 13:15:38,478] [11928] [MainThread] [INFO] (monailabel.datastore.local:486) - Saving Label for Image: JP2K-33003-1-patch-2340_5175_119_118; Tag: final; Info: {} [2023-05-25 13:15:38,479] [11928] [MainThread] [INFO] (monailabel.datastore.local:494) - Adding Label: JP2K-33003-1-patch-2340_5175_119_118 => final => C:\Users\raj\AppData\Local\Temp\tmpq57i05iu.xml [2023-05-25 13:15:38,483] [11928] [MainThread] [INFO] (monailabel.datastore.local:510) - Label Info: {'ts': 1685045738, 'name': 'JP2K-33003-1-patch-2340_5175_119_118.xml'} [2023-05-25 13:15:38,485] [11928] [MainThread] [INFO] (monailabel.interfaces.app:493) - New label saved for: JP2K-33003-1-patch-2340_5175_119_118 => JP2K-33003-1-patch-2340_5175_119_118 [2023-05-25 13:15:38,510] [11928] [Thread-2] [INFO] (monailabel.datastore.local:577) - Invalidate count: 0 [2023-05-25 13:15:38,526] [11928] [Thread-2] [INFO] (monailabel.datastore.local:577) - Invalidate count: 0 [2023-05-25 13:16:07,822] [11928] [MainThread] [INFO] (monailabel.utils.async_tasks.task:41) - Train request: {'model': 'hovernet_nuclei', 'run_id': 'train', 'tracking_experiment_name': 'hov_seg', 'tracking_uri': 'http://127.0.0.1/', 'bundle_path': 'C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification'} [2023-05-25 13:16:07,823] [11928] [ThreadPoolExecutor-2_0] [INFO] (monailabel.utils.async_tasks.utils:49) - Before:: C:\Users\raj\Anaconda3\envs;C:\Users\raj\Anaconda3\envs;C:\Users\raj\Anaconda3;C:\Users\raj\apps\pathology [2023-05-25 13:16:07,824] [11928] [ThreadPoolExecutor-2_0] [INFO] (monailabel.utils.async_tasks.utils:53) - After:: C:\Users\raj\Anaconda3\envs;C:\Users\raj\Anaconda3\envs;C:\Users\raj\Anaconda3;C:\Users\raj\apps\pathology;C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification [2023-05-25 
13:16:07,825] [11928] [ThreadPoolExecutor-2_0] [INFO] (monailabel.utils.async_tasks.utils:65) - COMMAND:: C:\Users\raj\Anaconda3\envs\monailabel\python.exe -m monailabel.interfaces.utils.app -m train -r {"model":"hovernet_nuclei","run_id":"train","tracking_experiment_name":"hov_seg","tracking_uri":"http://127.0.0.1","bundle_path":"C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification","gpus":"all"} [2023-05-25 13:16:08,323] [17280] [MainThread] [INFO] (main:37) - Initializing App from: C:\Users\raj\apps\pathology; studies: C:\Users\raj\datasets; conf: {'models': 'hovernet_nuclei'} [2023-05-25 13:16:12,279] [17280] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for MONAILabelApp Found: <class 'main.MyApp'> [2023-05-25 13:16:12,288] [17280] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.classification_nuclei.ClassificationNuclei'> [2023-05-25 13:16:12,288] [17280] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.hovernet_nuclei.HovernetNuclei'> [2023-05-25 13:16:12,289] [17280] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.nuclick.NuClick'> [2023-05-25 13:16:12,290] [17280] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.segmentation_nuclei.SegmentationNuclei'> [2023-05-25 13:16:12,290] [17280] [MainThread] [INFO] (main:83) - +++ Adding Model: hovernet_nuclei => lib.configs.hovernet_nuclei.HovernetNuclei [2023-05-25 13:16:12,290] [17280] [MainThread] [INFO] (main:87) - +++ Using Models: ['hovernet_nuclei'] [2023-05-25 13:16:12,290] [17280] [MainThread] [INFO] (monailabel.interfaces.app:135) - Init Datastore for: C:\Users\raj\datasets [2023-05-25 13:16:12,290] [17280] [MainThread] [INFO] (monailabel.datastore.local:130) - Auto Reload: False; Extensions: 
['.nii.gz', '.nii', '.nrrd', '.jpg', '.png', '.tif', '.svs', '.xml'] [2023-05-25 13:16:12,307] [17280] [MainThread] [INFO] (monailabel.datastore.local:577) - Invalidate count: 0 [2023-05-25 13:16:12,807] [17280] [MainThread] [INFO] (main:129) - +++ Adding Inferer:: hovernet_nuclei => <lib.infers.hovernet_nuclei.HovernetNuclei object at 0x000000003A0E4B80> [2023-05-25 13:16:12,808] [17280] [MainThread] [INFO] (main:153) - +++ Adding Trainer:: hovernet_nuclei => <lib.trainers.hovernet_nuclei.HovernetNuclei object at 0x000000003A0E4D00> [2023-05-25 13:16:12,808] [17280] [MainThread] [INFO] (main:174) - Active Learning Strategies:: ['wsi_random'] [2023-05-25 13:16:12,808] [17280] [MainThread] [INFO] (monailabel.utils.sessions:51) - Session Path: C:\Users\raj.cache\monailabel\sessions [2023-05-25 13:16:12,809] [17280] [MainThread] [INFO] (monailabel.utils.sessions:52) - Session Expiry (max): 3600 [2023-05-25 13:16:12,809] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:210) - Train Request: {'model': 'hovernet_nuclei', 'run_id': 'train', 'tracking_experiment_name': 'hov_seg', 'tracking_uri': 'http://127.0.0.1/', 'bundle_path': 'C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification', 'gpus': 'all', 'local_rank': 0, 'force_multi_gpu': True} [2023-05-25 13:16:12,813] [17280] [MainThread] [INFO] (lib.utils:90) - Split data based on tile size: (1024, 1024); groups: None 0%| | 0/16 [00:00<?, ?it/s][2023-05-25 13:16:12,815] [17280] [MainThread] [INFO] (lib.utils:550) - ++ Using Groups: {} [2023-05-25 13:16:12,815] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\6V4V-PL22-patch-39588_34378_1380_1380.png', 'label': 'C:\Users\raj\datasets\labels\final\6V4V-PL22-patch-39588_34378_1380_1380.xml'} [2023-05-25 13:16:12,819] [17280] [MainThread] [INFO] (lib.utils:560) - Total Points: 1727 [2023-05-25 13:16:12,819] [17280] [MainThread] [INFO] (lib.utils:562) - ID: 
6V4V-PL22-patch-39588_34378_1380_1380 => Groups: dict_keys(['other', 'inflammatory', 'epithelial']); Location: (349, 308); Size: 774 x 768 [2023-05-25 13:16:12,864] [17280] [MainThread] [INFO] (lib.utils:616) - Image => Input: (1380, 1380, 3); Total Patches to save: 4 [2023-05-25 13:16:13,289] [17280] [MainThread] [INFO] (lib.utils:593) - other => p: 8; c: 1; unique: (array([0, 1], dtype=uint8), array([1901426, 2974], dtype=int64)) [2023-05-25 13:16:13,310] [17280] [MainThread] [INFO] (lib.utils:593) - inflammatory => p: 1; c: 1; unique: (array([0, 1], dtype=uint8), array([1901247, 3153], dtype=int64)) [2023-05-25 13:16:13,332] [17280] [MainThread] [INFO] (lib.utils:593) - epithelial => p: 35; c: 1; unique: (array([0, 1], dtype=uint8), array([1881580, 22820], dtype=int64)) [2023-05-25 13:16:13,332] [17280] [MainThread] [INFO] (lib.utils:616) - Label => Input: (1380, 1380); Total Patches to save: 4 6%|6 | 1/16 [00:00<00:08, 1.81it/s][2023-05-25 13:16:13,366] [17280] [MainThread] [INFO] (lib.utils:550) - ++ Using Groups: {} [2023-05-25 13:16:13,367] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\6V4V-PL22-patch-42406_35461_546_546.png', 'label': 'C:\Users\raj\datasets\labels\final\6V4V-PL22-patch-42406_35461_546_546.xml'} [2023-05-25 13:16:13,369] [17280] [MainThread] [INFO] (lib.utils:560) - Total Points: 1191 [2023-05-25 13:16:13,369] [17280] [MainThread] [INFO] (lib.utils:562) - ID: 6V4V-PL22-patch-42406_35461_546_546 => Groups: dict_keys(['other', 'inflammatory', 'epithelial']); Location: (3, 2); Size: 522 x 523 [2023-05-25 13:16:13,376] [17280] [MainThread] [INFO] (lib.utils:616) - Image => Input: (546, 546, 3); Total Patches to save: 1 [2023-05-25 13:16:13,446] [17280] [MainThread] [INFO] (lib.utils:593) - other => p: 6; c: 1; unique: (array([0, 1], dtype=uint8), array([295577, 2539], dtype=int64)) [2023-05-25 13:16:13,449] [17280] [MainThread] [INFO] (lib.utils:593) - inflammatory => p: 1; c: 1; unique: 
(array([0, 1], dtype=uint8), array([295359, 2757], dtype=int64)) [2023-05-25 13:16:13,452] [17280] [MainThread] [INFO] (lib.utils:593) - epithelial => p: 15; c: 1; unique: (array([0, 1], dtype=uint8), array([282091, 16025], dtype=int64)) [2023-05-25 13:16:13,453] [17280] [MainThread] [INFO] (lib.utils:616) - Label => Input: (546, 546); Total Patches to save: 1 [2023-05-25 13:16:13,462] [17280] [MainThread] [INFO] (lib.utils:550) - ++ Using Groups: {} [2023-05-25 13:16:13,462] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\6V4V-PL22-patch-43363_36028_376_543.png', 'label': 'C:\Users\raj\datasets\labels\final\6V4V-PL22-patch-43363_36028_376_543.xml'} [2023-05-25 13:16:13,463] [17280] [MainThread] [INFO] (lib.utils:560) - Total Points: 438 [2023-05-25 13:16:13,464] [17280] [MainThread] [INFO] (lib.utils:562) - ID: 6V4V-PL22-patch-43363_36028_376_543 => Groups: dict_keys(['other', 'epithelial']); Location: (2, 5); Size: 371 x 501 [2023-05-25 13:16:13,468] [17280] [MainThread] [INFO] (lib.utils:616) - Image => Input: (543, 376, 3); Total Patches to save: 1 [2023-05-25 13:16:13,520] [17280] [MainThread] [INFO] (lib.utils:593) - other => p: 5; c: 1; unique: (array([0, 1], dtype=uint8), array([202321, 1847], dtype=int64)) [2023-05-25 13:16:13,523] [17280] [MainThread] [INFO] (lib.utils:593) - epithelial => p: 7; c: 1; unique: (array([0, 1], dtype=uint8), array([198400, 5768], dtype=int64)) [2023-05-25 13:16:13,523] [17280] [MainThread] [INFO] (lib.utils:616) - Label => Input: (543, 376); Total Patches to save: 1 19%|#8 | 3/16 [00:00<00:02, 4.90it/s][2023-05-25 13:16:13,531] [17280] [MainThread] [INFO] (lib.utils:550) - ++ Using Groups: {} [2023-05-25 13:16:13,532] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\6V4V-PL22-patch-42457_36044_822_512.png', 'label': 
'C:\Users\raj\datasets\labels\final\6V4V-PL22-patch-42457_36044_822_512.xml'} [2023-05-25 13:16:13,534] [17280] [MainThread] [INFO] (lib.utils:560) - Total Points: 888 [2023-05-25 13:16:13,534] [17280] [MainThread] [INFO] (lib.utils:562) - ID: 6V4V-PL22-patch-42457_36044_822_512 => Groups: dict_keys(['other', 'epithelial']); Location: (54, 7); Size: 759 x 505 [2023-05-25 13:16:13,543] [17280] [MainThread] [INFO] (lib.utils:616) - Image => Input: (512, 822, 3); Total Patches to save: 1 [2023-05-25 13:16:13,642] [17280] [MainThread] [INFO] (lib.utils:593) - other => p: 8; c: 1; unique: (array([0, 1], dtype=uint8), array([417898, 2966], dtype=int64)) [2023-05-25 13:16:13,647] [17280] [MainThread] [INFO] (lib.utils:593) - epithelial => p: 14; c: 1; unique: (array([0, 1], dtype=uint8), array([407784, 13080], dtype=int64)) [2023-05-25 13:16:13,647] [17280] [MainThread] [INFO] (lib.utils:616) - Label => Input: (512, 822); Total Patches to save: 1 25%|##5 | 4/16 [00:00<00:02, 5.66it/s][2023-05-25 13:16:13,657] [17280] [MainThread] [INFO] (lib.utils:550) - ++ Using Groups: {} [2023-05-25 13:16:13,657] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\6V4V-PL22-patch-41780_36342_657_725.png', 'label': 'C:\Users\raj\datasets\labels\final\6V4V-PL22-patch-41780_36342_657_725.xml'} [2023-05-25 13:16:13,658] [17280] [MainThread] [INFO] (lib.utils:560) - Total Points: 791 [2023-05-25 13:16:13,659] [17280] [MainThread] [INFO] (lib.utils:562) - ID: 6V4V-PL22-patch-41780_36342_657_725 => Groups: dict_keys(['other', 'epithelial']); Location: (22, 8); Size: 633 x 715 [2023-05-25 13:16:13,670] [17280] [MainThread] [INFO] (lib.utils:616) - Image => Input: (725, 657, 3); Total Patches to save: 1 [2023-05-25 13:16:13,770] [17280] [MainThread] [INFO] (lib.utils:593) - other => p: 4; c: 1; unique: (array([0, 1], dtype=uint8), array([474725, 1600], dtype=int64)) [2023-05-25 13:16:13,774] 
[17280] [MainThread] [INFO] (lib.utils:593) - epithelial => p: 12; c: 1; unique: (array([0, 1], dtype=uint8), array([463974, 12351], dtype=int64)) [2023-05-25 13:16:13,775] [17280] [MainThread] [INFO] (lib.utils:616) - Label => Input: (725, 657); Total Patches to save: 1 31%|###1 | 5/16 [00:00<00:01, 6.22it/s][2023-05-25 13:16:13,784] [17280] [MainThread] [INFO] (lib.utils:550) - ++ Using Groups: {} [2023-05-25 13:16:13,784] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\6V4V-PL22-patch-17954_22620_177_177.png', 'label': 'C:\Users\raj\datasets\labels\final\6V4V-PL22-patch-17954_22620_177_177.xml'} [2023-05-25 13:16:13,787] [17280] [MainThread] [INFO] (lib.utils:560) - Total Points: 1318 [2023-05-25 13:16:13,787] [17280] [MainThread] [INFO] (lib.utils:562) - ID: 6V4V-PL22-patch-17954_22620_177_177 => Groups: dict_keys(['other', 'epithelial']); Location: (9, 4); Size: 168 x 172 [2023-05-25 13:16:13,789] [17280] [MainThread] [INFO] (lib.utils:616) - Image => Input: (177, 177, 3); Total Patches to save: 1 [2023-05-25 13:16:13,818] [17280] [MainThread] [INFO] (lib.utils:593) - other => p: 32; c: 1; unique: (array([0, 1], dtype=uint8), array([20422, 10907], dtype=int64)) [2023-05-25 13:16:13,818] [17280] [MainThread] [INFO] (lib.utils:593) - epithelial => p: 3; c: 1; unique: (array([0, 1], dtype=uint8), array([19500, 11829], dtype=int64)) [2023-05-25 13:16:13,819] [17280] [MainThread] [INFO] (lib.utils:616) - Label => Input: (177, 177); Total Patches to save: 1 [2023-05-25 13:16:13,830] [17280] [MainThread] [INFO] (lib.utils:550) - ++ Using Groups: {} [2023-05-25 13:16:13,831] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\6V4V-PL22-patch-18098_22466_144_144.png', 'label': 'C:\Users\raj\datasets\labels\final\6V4V-PL22-patch-18098_22466_144_144.xml'} [2023-05-25 13:16:13,832] [17280] [MainThread] [INFO] 
(lib.utils:560) - Total Points: 320 [2023-05-25 13:16:13,832] [17280] [MainThread] [INFO] (lib.utils:562) - ID: 6V4V-PL22-patch-18098_22466_144_144 => Groups: dict_keys(['other', 'epithelial']); Location: (3, 6); Size: 139 x 135 [2023-05-25 13:16:13,833] [17280] [MainThread] [INFO] (lib.utils:616) - Image => Input: (144, 144, 3); Total Patches to save: 1 [2023-05-25 13:16:13,862] [17280] [MainThread] [INFO] (lib.utils:593) - other => p: 9; c: 1; unique: (array([0, 1], dtype=uint8), array([18510, 2226], dtype=int64)) [2023-05-25 13:16:13,862] [17280] [MainThread] [INFO] (lib.utils:593) - epithelial => p: 1; c: 1; unique: (array([0, 1], dtype=uint8), array([17769, 2967], dtype=int64)) [2023-05-25 13:16:13,862] [17280] [MainThread] [INFO] (lib.utils:616) - Label => Input: (144, 144); Total Patches to save: 1 [2023-05-25 13:16:13,871] [17280] [MainThread] [INFO] (lib.utils:550) - ++ Using Groups: {} [2023-05-25 13:16:13,871] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\6V4V-PL22-patch-17917_22463_181_152.png', 'label': 'C:\Users\raj\datasets\labels\final\6V4V-PL22-patch-17917_22463_181_152.xml'} [2023-05-25 13:16:13,873] [17280] [MainThread] [INFO] (lib.utils:560) - Total Points: 1049 [2023-05-25 13:16:13,873] [17280] [MainThread] [INFO] (lib.utils:562) - ID: 6V4V-PL22-patch-17917_22463_181_152 => Groups: dict_keys(['other', 'epithelial']); Location: (1, 3); Size: 178 x 146 [2023-05-25 13:16:13,874] [17280] [MainThread] [INFO] (lib.utils:616) - Image => Input: (152, 181, 3); Total Patches to save: 1 [2023-05-25 13:16:13,902] [17280] [MainThread] [INFO] (lib.utils:593) - other => p: 27; c: 1; unique: (array([0, 1], dtype=uint8), array([18778, 8734], dtype=int64)) [2023-05-25 13:16:13,902] [17280] [MainThread] [INFO] (lib.utils:593) - epithelial => p: 2; c: 1; unique: (array([0, 1], dtype=uint8), array([17899, 9613], dtype=int64)) [2023-05-25 13:16:13,902] [17280] [MainThread] [INFO] (lib.utils:616) - Label => 
Input: (152, 181); Total Patches to save: 1 50%|##### | 8/16 [00:01<00:00, 10.97it/s][2023-05-25 13:16:13,911] [17280] [MainThread] [INFO] (lib.utils:550) - ++ Using Groups: {} [2023-05-25 13:16:13,911] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\6V4V-PL22-patch-17678_22673_182_182.png', 'label': 'C:\Users\raj\datasets\labels\final\6V4V-PL22-patch-17678_22673_182_182.xml'} [2023-05-25 13:16:13,913] [17280] [MainThread] [INFO] (lib.utils:560) - Total Points: 651 [2023-05-25 13:16:13,913] [17280] [MainThread] [INFO] (lib.utils:562) - ID: 6V4V-PL22-patch-17678_22673_182_182 => Groups: dict_keys(['other', 'epithelial']); Location: (1, 3); Size: 178 x 179 [2023-05-25 13:16:13,914] [17280] [MainThread] [INFO] (lib.utils:616) - Image => Input: (182, 182, 3); Total Patches to save: 1 [2023-05-25 13:16:13,942] [17280] [MainThread] [INFO] (lib.utils:593) - other => p: 14; c: 1; unique: (array([0, 1], dtype=uint8), array([28476, 4648], dtype=int64)) [2023-05-25 13:16:13,943] [17280] [MainThread] [INFO] (lib.utils:593) - epithelial => p: 5; c: 1; unique: (array([0, 1], dtype=uint8), array([27260, 5864], dtype=int64)) [2023-05-25 13:16:13,943] [17280] [MainThread] [INFO] (lib.utils:616) - Label => Input: (182, 182); Total Patches to save: 1 [2023-05-25 13:16:13,951] [17280] [MainThread] [INFO] (lib.utils:550) - ++ Using Groups: {} [2023-05-25 13:16:13,951] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\6V4V-PL22-patch-16909_22665_284_284.png', 'label': 'C:\Users\raj\datasets\labels\final\6V4V-PL22-patch-16909_22665_284_284.xml'} [2023-05-25 13:16:13,956] [17280] [MainThread] [INFO] (lib.utils:560) - Total Points: 2628 [2023-05-25 13:16:13,957] [17280] [MainThread] [INFO] (lib.utils:562) - ID: 6V4V-PL22-patch-16909_22665_284_284 => Groups: dict_keys(['other', 'epithelial']); Location: (1, 2); Size: 283 x 277 [2023-05-25 13:16:13,958] [17280] [MainThread] [INFO] 
(lib.utils:616) - Image => Input: (284, 284, 3); Total Patches to save: 1 [2023-05-25 13:16:13,995] [17280] [MainThread] [INFO] (lib.utils:593) - other => p: 52; c: 1; unique: (array([0, 1], dtype=uint8), array([61814, 18842], dtype=int64)) [2023-05-25 13:16:13,996] [17280] [MainThread] [INFO] (lib.utils:593) - epithelial => p: 13; c: 1; unique: (array([0, 1], dtype=uint8), array([54470, 26186], dtype=int64)) [2023-05-25 13:16:13,996] [17280] [MainThread] [INFO] (lib.utils:616) - Label => Input: (284, 284); Total Patches to save: 1 [2023-05-25 13:16:14,007] [17280] [MainThread] [INFO] (lib.utils:550) - ++ Using Groups: {} [2023-05-25 13:16:14,007] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\6UZ7-PL22-patch-26615_32544_778_778.png', 'label': 'C:\Users\raj\datasets\labels\final\6UZ7-PL22-patch-26615_32544_778_778.xml'} [2023-05-25 13:16:14,015] [17280] [MainThread] [INFO] (lib.utils:560) - Total Points: 4273 [2023-05-25 13:16:14,016] [17280] [MainThread] [INFO] (lib.utils:562) - ID: 6UZ7-PL22-patch-26615_32544_778_778 => Groups: dict_keys(['inflammatory', 'epithelial', 'spindle-shaped']); Location: (2, 2); Size: 776 x 773 [2023-05-25 13:16:14,031] [17280] [MainThread] [INFO] (lib.utils:616) - Image => Input: (778, 778, 3); Total Patches to save: 1 [2023-05-25 13:16:14,188] [17280] [MainThread] [INFO] (lib.utils:593) - inflammatory => p: 4; c: 1; unique: (array([0, 1], dtype=uint8), array([604348, 936], dtype=int64)) [2023-05-25 13:16:14,196] [17280] [MainThread] [INFO] (lib.utils:593) - epithelial => p: 58; c: 1; unique: (array([0, 1], dtype=uint8), array([552758, 52526], dtype=int64)) [2023-05-25 13:16:14,202] [17280] [MainThread] [INFO] (lib.utils:593) - spindle-shaped => p: 29; c: 1; unique: (array([0, 1], dtype=uint8), array([541258, 64026], dtype=int64)) [2023-05-25 13:16:14,202] [17280] [MainThread] [INFO] (lib.utils:616) - Label => Input: (778, 778); Total Patches to save: 1 
69%|######8 | 11/16 [00:01<00:00, 10.41it/s][2023-05-25 13:16:14,216] [17280] [MainThread] [INFO] (lib.utils:550) - ++ Using Groups: {} [2023-05-25 13:16:14,216] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\6UZ7-PL22-patch-28591_32657_887_887.png', 'label': 'C:\Users\raj\datasets\labels\final\6UZ7-PL22-patch-28591_32657_887_887.xml'} [2023-05-25 13:16:14,334] [17280] [MainThread] [INFO] (lib.utils:560) - Total Points: 5964 [2023-05-25 13:16:14,336] [17280] [MainThread] [INFO] (lib.utils:562) - ID: 6UZ7-PL22-patch-28591_32657_887_887 => Groups: dict_keys(['other', 'inflammatory', 'epithelial', 'spindle-shaped']); Location: (2, 2); Size: 885 x 885 [2023-05-25 13:16:14,355] [17280] [MainThread] [INFO] (lib.utils:616) - Image => Input: (887, 887, 3); Total Patches to save: 1 [2023-05-25 13:16:14,562] [17280] [MainThread] [INFO] (lib.utils:593) - other => p: 4; c: 1; unique: (array([0, 1], dtype=uint8), array([785904, 865], dtype=int64)) [2023-05-25 13:16:14,570] [17280] [MainThread] [INFO] (lib.utils:593) - inflammatory => p: 5; c: 1; unique: (array([0, 1], dtype=uint8), array([784619, 2150], dtype=int64)) [2023-05-25 13:16:14,580] [17280] [MainThread] [INFO] (lib.utils:593) - epithelial => p: 90; c: 1; unique: (array([0, 1], dtype=uint8), array([712140, 74629], dtype=int64)) [2023-05-25 13:16:14,589] [17280] [MainThread] [INFO] (lib.utils:593) - spindle-shaped => p: 40; c: 1; unique: (array([0, 1], dtype=uint8), array([696642, 90127], dtype=int64)) [2023-05-25 13:16:14,590] [17280] [MainThread] [INFO] (lib.utils:616) - Label => Input: (887, 887); Total Patches to save: 1 [2023-05-25 13:16:14,606] [17280] [MainThread] [INFO] (lib.utils:550) - ++ Using Groups: {} [2023-05-25 13:16:14,606] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\6UZ7-PL22-patch-26941_32692_351_351.png', 'label':
'C:\Users\raj\datasets\labels\final\6UZ7-PL22-patch-26941_32692_351_351.xml'} [2023-05-25 13:16:14,608] [17280] [MainThread] [INFO] (lib.utils:560) - Total Points: 909 [2023-05-25 13:16:14,608] [17280] [MainThread] [INFO] (lib.utils:562) - ID: 6UZ7-PL22-patch-26941_32692_351_351 => Groups: dict_keys(['inflammatory', 'epithelial', 'spindle-shaped']); Location: (8, 3); Size: 306 x 348 [2023-05-25 13:16:14,611] [17280] [MainThread] [INFO] (lib.utils:616) - Image => Input: (351, 351, 3); Total Patches to save: 1 [2023-05-25 13:16:14,654] [17280] [MainThread] [INFO] (lib.utils:593) - inflammatory => p: 1; c: 1; unique: (array([0, 1], dtype=uint8), array([123013, 188], dtype=int64)) [2023-05-25 13:16:14,656] [17280] [MainThread] [INFO] (lib.utils:593) - epithelial => p: 15; c: 1; unique: (array([0, 1], dtype=uint8), array([109535, 13666], dtype=int64)) [2023-05-25 13:16:14,657] [17280] [MainThread] [INFO] (lib.utils:593) - spindle-shaped => p: 5; c: 1; unique: (array([0, 1], dtype=uint8), array([108302, 14899], dtype=int64)) [2023-05-25 13:16:14,658] [17280] [MainThread] [INFO] (lib.utils:616) - Label => Input: (351, 351); Total Patches to save: 1 81%|########1 | 13/16 [00:01<00:00, 7.39it/s][2023-05-25 13:16:14,667] [17280] [MainThread] [INFO] (lib.utils:550) - ++ Using Groups: {} [2023-05-25 13:16:14,667] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\JP2K-33003-1-patch-2319_5116_152_133.png', 'label': 'C:\Users\raj\datasets\labels\final\JP2K-33003-1-patch-2319_5116_152_133.xml'} [2023-05-25 13:16:14,668] [17280] [MainThread] [INFO] (lib.utils:560) - Total Points: 265 [2023-05-25 13:16:14,668] [17280] [MainThread] [INFO] (lib.utils:562) - ID: JP2K-33003-1-patch-2319_5116_152_133 => Groups: dict_keys(['other', 'inflammatory', 'spindle-shaped']); Location: (16, 10); Size: 124 x 117 [2023-05-25 13:16:14,669] [17280] [MainThread] [INFO] (lib.utils:616) - Image => Input:
(133, 152, 3); Total Patches to save: 1 [2023-05-25 13:16:14,694] [17280] [MainThread] [INFO] (lib.utils:593) - other => p: 1; c: 1; unique: (array([0, 1], dtype=uint8), array([20083, 133], dtype=int64)) [2023-05-25 13:16:14,695] [17280] [MainThread] [INFO] (lib.utils:593) - inflammatory => p: 10; c: 1; unique: (array([0, 1], dtype=uint8), array([18515, 1701], dtype=int64)) [2023-05-25 13:16:14,695] [17280] [MainThread] [INFO] (lib.utils:593) - spindle-shaped => p: 1; c: 1; unique: (array([0, 1], dtype=uint8), array([18402, 1814], dtype=int64)) [2023-05-25 13:16:14,695] [17280] [MainThread] [INFO] (lib.utils:616) - Label => Input: (133, 152); Total Patches to save: 1 [2023-05-25 13:16:14,703] [17280] [MainThread] [INFO] (lib.utils:550) - ++ Using Groups: {} [2023-05-25 13:16:14,703] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\JP2K-33003-1-patch-2257_5110_206_206.png', 'label': 'C:\Users\raj\datasets\labels\final\JP2K-33003-1-patch-2257_5110_206_206.xml'} [2023-05-25 13:16:14,705] [17280] [MainThread] [INFO] (lib.utils:560) - Total Points: 638 [2023-05-25 13:16:14,705] [17280] [MainThread] [INFO] (lib.utils:562) - ID: JP2K-33003-1-patch-2257_5110_206_206 => Groups: dict_keys(['inflammatory', 'spindle-shaped']); Location: (6, 12); Size: 195 x 168 [2023-05-25 13:16:14,706] [17280] [MainThread] [INFO] (lib.utils:616) - Image => Input: (206, 206, 3); Total Patches to save: 1 [2023-05-25 13:16:14,738] [17280] [MainThread] [INFO] (lib.utils:593) - inflammatory => p: 22; c: 1; unique: (array([0, 1], dtype=uint8), array([38980, 3456], dtype=int64)) [2023-05-25 13:16:14,739] [17280] [MainThread] [INFO] (lib.utils:593) - spindle-shaped => p: 5; c: 1; unique: (array([0, 1], dtype=uint8), array([38540, 3896], dtype=int64)) [2023-05-25 13:16:14,739] [17280] [MainThread] [INFO] (lib.utils:616) - Label => Input: (206, 206); Total Patches to save: 1 [2023-05-25 13:16:14,748] [17280] [MainThread] [INFO] (lib.utils:550) - ++ 
Using Groups: {} [2023-05-25 13:16:14,748] [17280] [MainThread] [INFO] (lib.utils:552) - Fetching Image/Label : {'image': 'C:\Users\raj\datasets\JP2K-33003-1-patch-2340_5175_119_118.png', 'label': 'C:\Users\raj\datasets\labels\final\JP2K-33003-1-patch-2340_5175_119_118.xml'} [2023-05-25 13:16:14,749] [17280] [MainThread] [INFO] (lib.utils:560) - Total Points: 134 [2023-05-25 13:16:14,749] [17280] [MainThread] [INFO] (lib.utils:562) - ID: JP2K-33003-1-patch-2340_5175_119_118 => Groups: dict_keys(['inflammatory', 'spindle-shaped']); Location: (12, 36); Size: 89 x 78 [2023-05-25 13:16:14,750] [17280] [MainThread] [INFO] (lib.utils:616) - Image => Input: (118, 119, 3); Total Patches to save: 1 [2023-05-25 13:16:14,778] [17280] [MainThread] [INFO] (lib.utils:593) - inflammatory => p: 5; c: 1; unique: (array([0, 1], dtype=uint8), array([13198, 844], dtype=int64)) [2023-05-25 13:16:14,779] [17280] [MainThread] [INFO] (lib.utils:593) - spindle-shaped => p: 1; c: 1; unique: (array([0, 1], dtype=uint8), array([13127, 915], dtype=int64)) [2023-05-25 13:16:14,779] [17280] [MainThread] [INFO] (lib.utils:616) - Label => Input: (118, 119); Total Patches to save: 1 100%|##########| 16/16 [00:01<00:00, 10.24it/s] 100%|##########| 16/16 [00:01<00:00, 8.11it/s] [2023-05-25 13:16:14,787] [17280] [MainThread] [INFO] (lib.utils:105) - +++ Total Records: 19 [2023-05-25 13:16:14,787] [17280] [MainThread] [INFO] (lib.trainers.hovernet_nuclei:57) - Split data (len: 19) based on each nuclei 0%| | 0/19 [00:00<?, ?it/s] 5%|5 | 1/19 [00:02<00:50, 2.79s/it] 11%|# | 2/19 [00:05<00:47, 2.77s/it] 16%|#5 | 3/19 [00:08<00:44, 2.77s/it] 21%|##1 | 4/19 [00:11<00:42, 2.80s/it] 26%|##6 | 5/19 [00:13<00:39, 2.79s/it] 32%|###1 | 6/19 [00:16<00:36, 2.80s/it]
37%|###6 | 7/19 [00:19<00:33, 2.79s/it] 42%|####2 | 8/19 [00:22<00:30, 2.80s/it] 47%|####7 | 9/19 [00:25<00:27, 2.79s/it] 53%|#####2 | 10/19 [00:27<00:25, 2.80s/it] 58%|#####7 | 11/19 [00:30<00:22, 2.78s/it] 63%|######3 | 12/19 [00:33<00:19, 2.78s/it] 68%|######8 | 13/19 [00:36<00:16, 2.77s/it] 74%|#######3 | 14/19 [00:39<00:13, 2.78s/it] 79%|#######8 | 15/19 [00:41<00:11, 2.78s/it] 84%|########4 | 16/19 [00:44<00:08, 2.79s/it] 89%|########9 | 17/19 [00:47<00:05, 2.79s/it] 95%|#########4| 18/19 [00:50<00:02, 2.83s/it] 100%|##########| 19/19 [00:53<00:00, 2.82s/it] 100%|##########| 19/19 [00:53<00:00, 2.79s/it] [2023-05-25 13:17:07,874] [17280] [MainThread] [INFO] (lib.trainers.hovernet_nuclei:103) - Final Records with hovernet patches: 931 [2023-05-25 13:17:07,881] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:180) - Total Records in Dataset: 931; Validation Split: 0.2 [2023-05-25 13:17:07,881] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:190) - Total Records for Training: 745 [2023-05-25 13:17:07,881] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:191) - Total Records for Validation: 186 [2023-05-25 13:17:07,881] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:225) - Using Multi GPU: True; GPUS: [0] [2023-05-25 13:17:07,881] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:226) - CUDA_VISIBLE_DEVICES: None
[2023-05-25 13:17:07,881] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:229) - Using device: cuda; Type: <class 'str'> [2023-05-25 13:17:07,881] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:240) - (Experiment Management) Tracking: mlflow [2023-05-25 13:17:07,881] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:241) - (Experiment Management) Tracking URI: http://127.0.0.1 [2023-05-25 13:17:07,881] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:242) - (Experiment Management) Experiment Name: hov_seg [2023-05-25 13:17:07,881] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:243) - (Experiment Management) Run Name: None [2023-05-25 13:17:07,905] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:307) - Using CUDA_VISIBLE_DEVICES: 0 [2023-05-25 13:17:07,905] [17280] [MainThread] [INFO] (lib.trainers.hovernet_nuclei:125) - +++++++++++ Running STAGE 0......................... [2023-05-25 13:17:07,905] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:357) - RUNNING COMMAND:: ['torchrun', '--standalone', '--nnodes=1', '--nproc_per_node=1', '-m', 'monai.bundle', 'run', 'train', '--meta_file', 'C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification\configs\metadata.json', '--config_file', "['C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification\configs\monailabel_train.json','C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification\configs\multi_gpu_train.json']", '--logging_file', 'C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification\configs\logging.conf', '--tracking', 'mlflow', '--tracking_uri', 'http://127.0.0.1', '--stage', '0', '--network_def#freeze_encoder', 'true', '--network_def#pretrained_url', 'file:///C:/Users/raj/apps/pathology/model/pathology_nuclei_segmentation_classification/models/stage0/model.pt'] failed to create process.
[2023-05-25 13:17:07,915] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:365) - Return code: 0 [2023-05-25 13:17:07,915] [17280] [MainThread] [INFO] (lib.trainers.hovernet_nuclei:133) - +++++++++++ Running STAGE 1......................... [2023-05-25 13:17:07,916] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:357) - RUNNING COMMAND:: ['torchrun', '--standalone', '--nnodes=1', '--nproc_per_node=1', '-m', 'monai.bundle', 'run', 'train', '--meta_file', 'C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification\configs\metadata.json', '--config_file', "['C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification\configs\monailabel_train.json','C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification\configs\multi_gpu_train.json']", '--logging_file', 'C:\Users\raj\apps\pathology\model\pathology_nuclei_segmentation_classification\configs\logging.conf', '--tracking', 'mlflow', '--tracking_uri', 'http://127.0.0.1', '--stage', '1', '--network_def#freeze_encoder', 'false', '--network_def#pretrained_url', 'None'] failed to create process. [2023-05-25 13:17:07,921] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:365) - Return code: 0 [2023-05-25 13:17:07,922] [17280] [MainThread] [INFO] (monailabel.tasks.train.bundle:339) - Training Finished.... [2023-05-25 13:17:07,922] [17280] [MainThread] [INFO] (main:61) - Result: {} [2023-05-25 13:17:08,499] [11928] [ThreadPoolExecutor-2_0] [INFO] (monailabel.utils.async_tasks.utils:83) - Return code: 0

Here is the MONAI config debug info, from python -c 'import monai; monai.config.print_debug_info()':

================================
Printing MONAI config...
MONAI version: 1.2.0rc7
Numpy version: 1.24.3
Pytorch version: 1.12.1
MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False
MONAI rev id: e084885c6d498ab02df8107d23aa159f742a2cdd
MONAI file: C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\monai\__init__.py

Optional dependencies:
Pytorch Ignite version: 0.4.11
ITK version: 5.3.0
Nibabel version: 5.1.0
scikit-image version: 0.20.0
Pillow version: 9.4.0
Tensorboard version: 2.13.0
gdown version: 4.7.1
TorchVision version: 0.13.1
tqdm version: 4.65.0
lmdb version: 1.4.1
psutil version: 5.9.5
pandas version: 2.0.1
einops version: 0.6.1
transformers version: NOT INSTALLED or UNKNOWN VERSION.
mlflow version: 2.3.2
pynrrd version: 1.0.0

For details about installing the optional dependencies, please visit: https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies

================================
Printing system config...
System: Windows
Win32 version: ('10', '10.0.17763', 'SP0', 'Multiprocessor Free')
Win32 edition: EnterpriseS
Platform: Windows-10-10.0.17763-SP0
Processor: Intel64 Family 6 Model 85 Stepping 7, GenuineIntel
Machine: AMD64
Python version: 3.9.16
Process name: python.exe
Command: ['python', '-c', 'import monai; monai.config.print_debug_info()']
Open files: [popenfile(path='C:\Windows\System32\en-US\kernel32.dll.mui', fd=-1), popenfile(path='C:\Windows\System32\en-US\KernelBase.dll.mui', fd=-1), popenfile(path='C:\Windows\System32\en-US\tzres.dll.mui', fd=-1)]
Num physical CPUs: 8
Num logical CPUs: 16
Num usable CPUs: 16
CPU usage (%): [9.1, 0.9, 1.8, 0.4, 3.1, 0.9, 7.1, 0.4, 3.1, 2.2, 1.3, 2.2, 0.9, 0.9, 0.4, 81.9]
CPU freq. (MHz): 3592
Load avg. in last 1, 5, 15 mins (%): [0.0, 0.0, 0.0]
Disk usage (%): 65.5
Avg. sensor temp. (Celsius): UNKNOWN for given OS
Total physical memory (GB): 63.7
Available memory (GB): 49.1
Used memory (GB): 14.6

================================
Printing GPU config...
Num GPUs: 1
Has CUDA: True
CUDA version: 11.3
cuDNN enabled: True
cuDNN version: 8302
Current device: 0
Library compiled for CUDA architectures: ['sm_37', 'sm_50', 'sm_60', 'sm_61', 'sm_70', 'sm_75', 'sm_80', 'sm_86', 'compute_37']
GPU 0 Name: NVIDIA GeForce RTX 2080 Ti
GPU 0 Is integrated: False
GPU 0 Is multi GPU board: False
GPU 0 Multi processor count: 68
GPU 0 Total memory (GB): 11.0
GPU 0 CUDA capability (maj.min): 7.5

yiheng-wang-nv commented 1 year ago

Hi @Rajeshtims, did you try running this bundle directly (without going through monailabel)? We may need to debug this step by step.

yiheng-wang-nv commented 1 year ago

Hi @Rajeshtims, you just need to follow the README of the bundle: https://github.com/Project-MONAI/model-zoo/tree/dev/models/pathology_nuclei_segmentation_classification

rajeshtims commented 1 year ago

It fails with a pickling error when I run this command (I passed the actual dataset directory):
python -m monai.bundle run --config_file configs/train.json --stage 0 --dataset_dir <actual dataset path>
I tried changing the cache rate in config.py to 0.1 and setting num_workers to either 0 or 1; it still fails to pickle.

When I run the multi-GPU command below, it still fails to create the process:
torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/multi_gpu_train.json']" --batch_size 8 --network_def#freeze_encoder True --stage 0

------------ Here is the pickle error ------------
(monailabel) X:\Rajesh\model-zoo\models\pathology_nuclei_segmentation_classification>python -m monai.bundle run --config_file configs/train.json --stage 0 --dataset_dir C:\Users\raj\Music\CoNSeP\Prepared
2023-05-26 09:04:52,235 - INFO - --- input summary of monai.bundle.scripts.run ---
2023-05-26 09:04:52,236 - INFO - > config_file: 'configs/train.json'
2023-05-26 09:04:52,236 - INFO - > stage: 0
2023-05-26 09:04:52,236 - INFO - > dataset_dir: 'C:\Users\raj\Music\CoNSeP\Prepared'
2023-05-26 09:04:52,236 - INFO - ---

2023-05-26 09:04:52,243 - INFO - Setting logging properties based on config: configs/logging.conf.
monai.transforms.io.dictionary LoadImaged.__init__:image_only: Current default value of argument image_only=False has been deprecated since version 1.1. It will be changed to image_only=True in version 1.3.
Loading dataset: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████| 1323/1323 [00:09<00:00, 145.31it/s]
Loading dataset: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 686/686 [01:17<00:00, 8.82it/s]
monai.handlers.stats_handler StatsHandler.__init__:name: Current default value of argument name=None has been deprecated since version 1.1. It will be changed to name=StatsHandler in version 1.3.
2023-05-26 09:06:20,060 - ignite.engine.engine.SupervisedTrainer - INFO - Engine run resuming from iteration 0, epoch 0 until 50 epochs
Traceback (most recent call last):
2023-05-26 09:06:23,933 - ignite.engine.engine.SupervisedTrainer - ERROR - Engine run is terminating due to exception: Can't pickle <function <lambda> at 0x0000000039811EE0>: attribute lookup <lambda> on __main__ failed
  File "<string>", line 1, in <module>
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\multiprocessing\spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\multiprocessing\spawn.py", line 126, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
2023-05-26 09:06:30,011 - ignite.engine.engine.SupervisedTrainer - INFO - Exception raised, saved the last checkpoint: .\models\stage0\model.pt
Traceback (most recent call last):
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\monai\bundle\__main__.py", line 30, in <module>
    fire.Fire()
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\fire\core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\fire\core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\fire\core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\monai\utils\deprecate_utils.py", line 223, in _wrapper
    return func(*args, **kwargs)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\monai\bundle\scripts.py", line 695, in run
    workflow.run()
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\monai\bundle\workflows.py", line 262, in run
    return self._run_expr(id=self.run_id)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\monai\bundle\workflows.py", line 296, in _run_expr
    return self.parser.get_parsed_content(id, **kwargs) if id in self.parser else None
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\monai\bundle\config_parser.py", line 291, in get_parsed_content
    return self.ref_resolver.get_resolved_content(id=id, **kwargs)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\monai\bundle\reference_resolver.py", line 190, in get_resolved_content
    return self._resolve_one_item(id=id, **kwargs)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\monai\bundle\reference_resolver.py", line 160, in _resolve_one_item
    self._resolve_one_item(id=d, waiting_list=waiting_list, **kwargs)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\monai\bundle\reference_resolver.py", line 172, in _resolve_one_item
    item.evaluate(globals={f"{self._vars}": self.resolved_content}) if run_eval else item
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\monai\bundle\config_item.py", line 375, in evaluate
    return eval(value[len(self.prefix) :], globals_, locals)
  File "<string>", line 1, in <module>
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\monai\engines\trainer.py", line 53, in run
    super().run()
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\monai\engines\workflow.py", line 283, in run
    super().run(data=self.data_loader, max_epochs=self.state.max_epochs)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\ignite\engine\engine.py", line 892, in run
    return self._internal_run()
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\ignite\engine\engine.py", line 935, in _internal_run
    return next(self._internal_run_generator)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\ignite\engine\engine.py", line 993, in _internal_run_as_gen
    self._handle_exception(e)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\ignite\engine\engine.py", line 636, in _handle_exception
    self._fire_event(Events.EXCEPTION_RAISED, e)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\ignite\engine\engine.py", line 425, in _fire_event
    func(*first, *(event_args + others), **kwargs)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\monai\handlers\checkpoint_saver.py", line 305, in exception_raised
    raise e
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\ignite\engine\engine.py", line 957, in _internal_run_as_gen
    self._setup_engine()
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\ignite\engine\engine.py", line 922, in _setup_engine
    self._setup_dataloader_iter()
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\ignite\engine\engine.py", line 919, in _setup_dataloader_iter
    self._dataloader_iter = iter(self.state.dataloader)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\torch\utils\data\dataloader.py", line 444, in __iter__
    return self._get_iterator()
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\torch\utils\data\dataloader.py", line 390, in _get_iterator
    return _MultiProcessingDataLoaderIter(self)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\site-packages\torch\utils\data\dataloader.py", line 1077, in __init__
    w.start()
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\multiprocessing\context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\multiprocessing\context.py", line 327, in _Popen
    return Popen(process_obj)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\multiprocessing\popen_spawn_win32.py", line 93, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\raj\Anaconda3\envs\monailabel\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function <lambda> at 0x0000000039811EE0>: attribute lookup <lambda> on __main__ failed

--------------------------- Here is the multigpu error -----------------

(monailabel) X:\Rajesh\model-zoo\models\pathology_nuclei_segmentation_classification>torchrun --standalone --nnodes=1 --nproc_per_node=2 -m monai.bundle run --config_file "['configs/train.json','configs/multi_gpu_train.json']" --batch_size 8 --network_def#freeze_encoder True --stage 0
failed to create process.

I also tried changing the dataset directory in the config file accordingly; that didn't help either.

yiheng-wang-nv commented 1 year ago

Hi @Rajeshtims, the error might come from how multiprocessing works on Windows. Could you try setting num_workers of the dataloader to 0 and retrying the command? https://github.com/Project-MONAI/model-zoo/blob/b72b9fbe80e999407d4d136207778a02f37d3c6d/models/pathology_nuclei_segmentation_classification/configs/train.json#L287
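For readers who hit the same traceback: on Windows, DataLoader worker processes are started with the spawn method, so whatever is handed to a worker has to be pickled first, and the standard pickler cannot serialize a lambda. With num_workers set to 0 no worker process is started, so nothing needs to be pickled. The sketch below only illustrates that pickling behaviour with the standard library; the names double and is_picklable are made up for this example and are not part of MONAI or the bundle.

```python
import pickle


def double(x):
    """A named, module-level function: pickled by reference to its name."""
    return 2 * x


def is_picklable(obj):
    """Return True if the standard pickler can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError):
        # Lambdas and other objects without an importable name end up here.
        return False


if __name__ == "__main__":
    print("named function picklable:", is_picklable(double))
    print("lambda picklable:", is_picklable(lambda x: 2 * x))
```

If a config really needs worker processes on Windows, replacing a lambda with a named module-level function is the usual way around this, but that depends on where the lambda lives and was not tried in this thread.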

rajeshtims commented 1 year ago

Thank you very much !! It worked like a charm.

I did the following:

I was getting 'force_multi_gpu': True in the training request sent to the monailabel server even though I had unchecked the multi_gpu option in the monailabel extension in QuPath, and it kept complaining "failed to create process". Adding CUDA_VISIBLE_DEVICES=0 solved it.

I was getting a PyTorch memory allocation error; setting PYTORCH_CUDA_ALLOC_CONF and reducing the batch size solved it.

Then I was getting a "Can't pickle lambda function at ..." error; setting ALL num_workers to 0 in the configuration file solved it. It looks like this pickle error is Windows-specific.
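Setting every num_workers entry by hand is easy to get wrong when a config has several dataloaders. The sketch below shows one way to do it in bulk on a parsed JSON config. It is only an illustration: set_num_workers is a hypothetical helper, and the example fragment is invented and much smaller than the bundle's real train.json.

```python
import json


def set_num_workers(node, value=0):
    """Recursively set every "num_workers" key in a parsed JSON config.

    Modifies the structure in place and returns how many entries changed.
    """
    changed = 0
    if isinstance(node, dict):
        for key, child in node.items():
            if key == "num_workers":
                if child != value:
                    node[key] = value
                    changed += 1
            else:
                changed += set_num_workers(child, value)
    elif isinstance(node, list):
        for child in node:
            changed += set_num_workers(child, value)
    return changed


# Invented config fragment for illustration; the real train.json differs.
EXAMPLE = """
{
  "train": {"dataloader": {"batch_size": 16, "num_workers": 4}},
  "validate": {"dataloader": {"batch_size": 16, "num_workers": 4}}
}
"""

if __name__ == "__main__":
    config = json.loads(EXAMPLE)
    print(set_num_workers(config), "entries changed")
    print(json.dumps(config, indent=2))
```

Note that this is a blunt rewrite: it overwrites every num_workers value, including any that are written as references to another config entry, so it is worth diffing the result against the original file before training.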

Cheers !!
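For anyone reproducing the steps above, the two environment variables have to be set before the server or training process starts. The sketch below builds such an environment in Python. The helper name training_env is made up, and the value max_split_size_mb:128 is only an assumed example of a PYTORCH_CUDA_ALLOC_CONF setting; the value actually used was not stated in this thread.

```python
import os


def training_env(base=None, gpu="0", alloc_conf="max_split_size_mb:128"):
    """Return a copy of the environment with the two GPU variables set.

    gpu and alloc_conf are illustrative defaults, not values from the thread
    (apart from CUDA_VISIBLE_DEVICES=0, which is what was reported to work).
    """
    env = dict(os.environ if base is None else base)
    env["CUDA_VISIBLE_DEVICES"] = gpu
    env["PYTORCH_CUDA_ALLOC_CONF"] = alloc_conf
    return env


if __name__ == "__main__":
    env = training_env()
    print("CUDA_VISIBLE_DEVICES =", env["CUDA_VISIBLE_DEVICES"])
    print("PYTORCH_CUDA_ALLOC_CONF =", env["PYTORCH_CUDA_ALLOC_CONF"])
    # The dict can be passed as env=... to subprocess.run when launching
    # the server; the launch command itself is left out here.
```

Setting the same variables with set in the conda prompt before running monailabel start_server should have the same effect.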