Project-MONAI / MONAILabel

MONAI Label is an intelligent open source image labeling and learning tool.
https://docs.monai.io/projects/label
Apache License 2.0

CUDA is not getting find by Monailabel #1337

Open keyurradia opened 1 year ago

keyurradia commented 1 year ago

Describe the bug
I am using a PC with an NVIDIA RTX 4070 GPU (12 GB memory), which is CUDA-enabled. I installed PyTorch with the CUDA option, but MONAI Label is still unable to find or use CUDA on this PC. Can you please help with that?

Server logs 2023-03-13 23:19:59,209 - USING:: version = False 2023-03-13 23:19:59,209 - USING:: app = C:\Users\keyur\apps\radiology 2023-03-13 23:19:59,209 - USING:: studies = C:\Users\keyur\datasets\training 2023-03-13 23:19:59,209 - USING:: verbose = INFO 2023-03-13 23:19:59,209 - USING:: conf = [['models', 'segmentation']] 2023-03-13 23:19:59,210 - USING:: host = 0.0.0.0 2023-03-13 23:19:59,210 - USING:: port = 8000 2023-03-13 23:19:59,210 - USING:: uvicorn_app = monailabel.app:app 2023-03-13 23:19:59,210 - USING:: ssl_keyfile = None 2023-03-13 23:19:59,210 - USING:: ssl_certfile = None 2023-03-13 23:19:59,210 - USING:: ssl_keyfile_password = None 2023-03-13 23:19:59,211 - USING:: ssl_ca_certs = None 2023-03-13 23:19:59,211 - USING:: workers = None 2023-03-13 23:19:59,211 - USING:: limit_concurrency = None 2023-03-13 23:19:59,211 - USING:: access_log = False 2023-03-13 23:19:59,211 - USING:: log_config = None 2023-03-13 23:19:59,211 - USING:: dryrun = False 2023-03-13 23:19:59,212 - USING:: action = start_server 2023-03-13 23:19:59,212 - ENV SETTINGS:: MONAI_LABEL_API_STR = 2023-03-13 23:19:59,212 - ENV SETTINGS:: MONAI_LABEL_PROJECT_NAME = MONAILabel 2023-03-13 23:19:59,212 - ENV SETTINGS:: MONAI_LABEL_APP_DIR = 2023-03-13 23:19:59,212 - ENV SETTINGS:: MONAI_LABEL_STUDIES = 2023-03-13 23:19:59,212 - ENV SETTINGS:: MONAI_LABEL_AUTH_ENABLE = False 2023-03-13 23:19:59,212 - ENV SETTINGS:: MONAI_LABEL_AUTH_DB = 2023-03-13 23:19:59,213 - ENV SETTINGS:: MONAI_LABEL_APP_CONF = '{}' 2023-03-13 23:19:59,213 - ENV SETTINGS:: MONAI_LABEL_TASKS_TRAIN = True 2023-03-13 23:19:59,213 - ENV SETTINGS:: MONAI_LABEL_TASKS_STRATEGY = True 2023-03-13 23:19:59,213 - ENV SETTINGS:: MONAI_LABEL_TASKS_SCORING = True 2023-03-13 23:19:59,213 - ENV SETTINGS:: MONAI_LABEL_TASKS_BATCH_INFER = True 2023-03-13 23:19:59,213 - ENV SETTINGS:: MONAI_LABEL_DATASTORE = 2023-03-13 23:19:59,213 - ENV SETTINGS:: MONAI_LABEL_DATASTORE_URL = 2023-03-13 23:19:59,213 - ENV SETTINGS:: 
MONAI_LABEL_DATASTORE_USERNAME = 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DATASTORE_PASSWORD = 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DATASTORE_API_KEY = 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DATASTORE_CACHE_PATH = 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DATASTORE_PROJECT = 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DATASTORE_ASSET_PATH = 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DATASTORE_DSA_ANNOTATION_GROUPS = 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DICOMWEB_USERNAME = 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DICOMWEB_PASSWORD = 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DICOMWEB_CACHE_PATH = 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_QIDO_PREFIX = None 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_WADO_PREFIX = None 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_STOW_PREFIX = None 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DICOMWEB_FETCH_BY_FRAME = False 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DICOMWEB_CONVERT_TO_NIFTI = True 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DICOMWEB_SEARCH_FILTER = '{"Modality": "CT"}' 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DICOMWEB_CACHE_EXPIRY = 180 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DICOMWEB_PROXY_TIMEOUT = 30.0 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DICOMWEB_READ_TIMEOUT = 5.0 2023-03-13 23:19:59,214 - ENV SETTINGS:: MONAI_LABEL_DATASTORE_AUTO_RELOAD = True 2023-03-13 23:19:59,215 - ENV SETTINGS:: MONAI_LABEL_DATASTORE_READ_ONLY = False 2023-03-13 23:19:59,215 - ENV SETTINGS:: MONAI_LABEL_DATASTORE_FILE_EXT = '[".nii.gz", ".nii", ".nrrd", ".jpg", ".png", ".tif", ".svs", ".xml"]' 2023-03-13 23:19:59,215 - ENV SETTINGS:: MONAI_LABEL_SERVER_PORT = 8000 2023-03-13 23:19:59,215 - ENV SETTINGS:: MONAI_LABEL_CORS_ORIGINS = '[]' 2023-03-13 23:19:59,215 - ENV SETTINGS:: MONAI_LABEL_SESSIONS = True 2023-03-13 23:19:59,215 - ENV 
SETTINGS:: MONAI_LABEL_SESSION_PATH = 2023-03-13 23:19:59,215 - ENV SETTINGS:: MONAI_LABEL_SESSION_EXPIRY = 3600 2023-03-13 23:19:59,215 - ENV SETTINGS:: MONAI_LABEL_INFER_CONCURRENCY = -1 2023-03-13 23:19:59,215 - ENV SETTINGS:: MONAI_LABEL_INFER_TIMEOUT = 600 2023-03-13 23:19:59,216 - ENV SETTINGS:: MONAI_LABEL_TRACKING_ENABLED = True 2023-03-13 23:19:59,216 - ENV SETTINGS:: MONAI_LABEL_TRACKING_URI = 2023-03-13 23:19:59,216 - ENV SETTINGS:: MONAI_ZOO_SOURCE = github 2023-03-13 23:19:59,216 - ENV SETTINGS:: MONAI_ZOO_REPO = Project-MONAI/model-zoo/hosting_storage_v1 2023-03-13 23:19:59,219 - ENV SETTINGS:: MONAI_ZOO_AUTH_TOKEN = 2023-03-13 23:19:59,219 - ENV SETTINGS:: MONAI_LABEL_AUTO_UPDATE_SCORING = True 2023-03-13 23:19:59,219 - Allow Origins: [''] [2023-03-13 23:20:14,054] [3404] [MainThread] [INFO] (uvicorn.error:75) - Started server process [3404] [2023-03-13 23:20:14,056] [3404] [MainThread] [INFO] (uvicorn.error:45) - Waiting for application startup. [2023-03-13 23:20:14,058] [3404] [MainThread] [INFO] (monailabel.interfaces.utils.app:37) - Initializing App from: C:\Users\keyur\apps\radiology; studies: C:\Users\keyur\datasets\training; conf: {'models': 'segmentation'} [2023-03-13 23:20:14,274] [3404] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for MONAILabelApp Found: <class 'main.MyApp'> [2023-03-13 23:20:14,290] [3404] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.deepedit.DeepEdit'> [2023-03-13 23:20:14,292] [3404] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.deepgrow_2d.Deepgrow2D'> [2023-03-13 23:20:14,293] [3404] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.deepgrow_3d.Deepgrow3D'> [2023-03-13 23:20:14,296] [3404] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 
'lib.configs.localization_spine.LocalizationSpine'> [2023-03-13 23:20:14,300] [3404] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.localization_vertebra.LocalizationVertebra'> [2023-03-13 23:20:14,324] [3404] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.segmentation.Segmentation'> [2023-03-13 23:20:14,333] [3404] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.segmentation_spleen.SegmentationSpleen'> [2023-03-13 23:20:14,348] [3404] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.segmentation_vertebra.SegmentationVertebra'> [2023-03-13 23:20:14,362] [3404] [MainThread] [INFO] (main:93) - +++ Adding Model: segmentation => lib.configs.segmentation.Segmentation [2023-03-13 23:20:14,451] [3404] [MainThread] [INFO] (main:96) - +++ Using Models: ['segmentation'] [2023-03-13 23:20:14,459] [3404] [MainThread] [INFO] (monailabel.interfaces.app:134) - Init Datastore for: C:\Users\keyur\datasets\training [2023-03-13 23:20:14,472] [3404] [MainThread] [INFO] (monailabel.datastore.local:130) - Auto Reload: True; Extensions: ['.nii.gz', '.nii', '.nrrd', '.jpg', '.png', '.tif', '.svs', '*.xml'] [2023-03-13 23:20:14,507] [3404] [MainThread] [INFO] (monailabel.datastore.local:577) - Invalidate count: 0 [2023-03-13 23:20:14,519] [3404] [MainThread] [INFO] (monailabel.datastore.local:151) - Start observing external modifications on datastore (AUTO RELOAD) [2023-03-13 23:20:14,538] [3404] [MainThread] [INFO] (main:126) - +++ Adding Inferer:: segmentation => <lib.infers.segmentation.Segmentation object at 0x000001F12698C2E0> [2023-03-13 23:20:14,539] [3404] [MainThread] [INFO] (main:191) - {'segmentation': <lib.infers.segmentation.Segmentation object at 0x000001F12698C2E0>, 'Histogram+GraphCut': 
<monailabel.scribbles.infer.HistogramBasedGraphCut object at 0x000001F12AB54580>, 'GMM+GraphCut': <monailabel.scribbles.infer.GMMBasedGraphCut object at 0x000001F12AB54550>} [2023-03-13 23:20:14,540] [3404] [MainThread] [INFO] (main:206) - +++ Adding Trainer:: segmentation => <lib.trainers.segmentation.Segmentation object at 0x000001F12AB7E6A0> [2023-03-13 23:20:14,542] [3404] [MainThread] [INFO] (monailabel.utils.sessions:51) - Session Path: C:\Users\keyur.cache\monailabel\sessions [2023-03-13 23:20:14,543] [3404] [MainThread] [INFO] (monailabel.utils.sessions:52) - Session Expiry (max): 3600 [2023-03-13 23:20:14,543] [3404] [MainThread] [INFO] (monailabel.interfaces.app:468) - App Init - completed [2023-03-13 23:20:14,544] [timeloop] [INFO] Starting Timeloop.. [2023-03-13 23:20:14,544] [3404] [MainThread] [INFO] (timeloop:60) - Starting Timeloop.. [2023-03-13 23:20:14,549] [timeloop] [INFO] Registered job <function MONAILabelApp.on_init_complete..run_scheduler at 0x000001F1269F0700> [2023-03-13 23:20:14,549] [3404] [MainThread] [INFO] (timeloop:42) - Registered job <function MONAILabelApp.on_init_complete..run_scheduler at 0x000001F1269F0700> [2023-03-13 23:20:14,549] [timeloop] [INFO] Timeloop now started. Jobs will run based on the interval set [2023-03-13 23:20:14,549] [3404] [MainThread] [INFO] (timeloop:63) - Timeloop now started. Jobs will run based on the interval set [2023-03-13 23:20:14,550] [3404] [MainThread] [INFO] (uvicorn.error:59) - Application startup complete. 
[2023-03-13 23:20:14,552] [3404] [MainThread] [INFO] (uvicorn.error:206) - Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit) [2023-03-13 23:20:36,423] [3404] [MainThread] [INFO] (monailabel.endpoints.activelearning:43) - Active Learning Request: {'strategy': 'random', 'client_id': 'user-xyz'} [2023-03-13 23:21:50,060] [3404] [MainThread] [INFO] (monailabel.endpoints.datastore:67) - Image: 23.02.14.17; File: <starlette.datastructures.UploadFile object at 0x000001F12AF3F7F0>; params: {"client_id": "user-xyz"} [2023-03-13 23:21:50,116] [3404] [MainThread] [INFO] (monailabel.datastore.local:439) - Adding Image: 23.02.14.17 => C:\Users\keyur\AppData\Local\Temp\tmpv0s5oyc5.nii.gz [2023-03-13 23:21:50,698] [3404] [MainThread] [INFO] (monailabel.endpoints.datastore:100) - Saving Label for 23.02.14.17 for tag: final by admin [2023-03-13 23:21:50,702] [3404] [MainThread] [INFO] (monailabel.endpoints.datastore:111) - Save Label params: {"label_info": [{"name": "liver", "idx": 1}, {"name": "venaporta", "idx": 2}, {"name": "livervein", "idx": 3}, {"name": "venacava", "idx": 4}, {"name": "tumor", "idx": 5}], "client_id": "user-xyz"} [2023-03-13 23:21:50,703] [3404] [MainThread] [INFO] (monailabel.datastore.local:486) - Saving Label for Image: 23.02.14.17; Tag: final; Info: {'label_info': [{'name': 'liver', 'idx': 1}, {'name': 'venaporta', 'idx': 2}, {'name': 'livervein', 'idx': 3}, {'name': 'venacava', 'idx': 4}, {'name': 'tumor', 'idx': 5}], 'clientid': 'user-xyz'} [2023-03-13 23:21:50,704] [3404] [MainThread] [INFO] (monailabel.datastore.local:494) - Adding Label: 23.02.14.17 => final => C:\Users\keyur\AppData\Local\Temp\tmp2qgif4r.nii.gz [2023-03-13 23:21:50,710] [3404] [MainThread] [INFO] (monailabel.datastore.local:510) - Label Info: {'label_info': [{'name': 'liver', 'idx': 1}, {'name': 'venaporta', 'idx': 2}, {'name': 'livervein', 'idx': 3}, {'name': 'venacava', 'idx': 4}, {'name': 'tumor', 'idx': 5}], 'client_id': 'user-xyz', 'ts': 1678746110, 'name': 
'23.02.14.17.nii.gz'} [2023-03-13 23:21:50,715] [3404] [MainThread] [INFO] (monailabel.interfaces.app:492) - New label saved for: 23.02.14.17 => 23.02.14.17 [2023-03-13 23:21:50,775] [3404] [Thread-1] [INFO] (monailabel.datastore.local:624) - Adding New Label: final5 => 23.02.14.17 => 23.02.14.17.nii [2023-03-13 23:21:50,780] [3404] [Thread-1] [INFO] (monailabel.datastore.local:577) - Invalidate count: 1 [2023-03-13 23:21:55,028] [3404] [MainThread] [INFO] (monailabel.utils.async_tasks.task:41) - Train request: {'model': 'segmentation', 'name': 'train_01', 'pretrained': True, 'device': 'cuda', 'max_epochs': 50, 'early_stop_patience': -1, 'val_split': 0.2, 'train_batch_size': 1, 'val_batch_size': 1, 'multi_gpu': True, 'gpus': 'all', 'dataset': 'SmartCacheDataset', 'dataloader': 'ThreadDataLoader', 'tracking': 'mlflow', 'tracking_uri': '', 'tracking_experiment_name': '', 'client_id': 'user-xyz'} [2023-03-13 23:21:55,039] [3404] [ThreadPoolExecutor-2_0] [INFO] (monailabel.utils.async_tasks.utils:49) - Before:: C:\Users\keyur.conda\envs; [2023-03-13 23:21:55,041] [3404] [ThreadPoolExecutor-2_0] [INFO] (monailabel.utils.async_tasks.utils:53) - After:: C:\Users\keyur.conda\envs; [2023-03-13 23:21:55,042] [3404] [ThreadPoolExecutor-2_0] [INFO] (monailabel.utils.async_tasks.utils:65) - COMMAND:: C:\Users\keyur.conda\envs\monailabel-env\python.exe -m monailabel.interfaces.utils.app -m train -r {"model":"segmentation","name":"train_01","pretrained":true,"device":"cuda","max_epochs":50,"early_stop_patience":-1,"val_split":0.2,"train_batch_size":1,"val_batch_size":1,"multi_gpu":true,"gpus":"all","dataset":"SmartCacheDataset","dataloader":"ThreadDataLoader","tracking":"mlflow","tracking_uri":"","tracking_experiment_name":"","client_id":"user-xyz"} [2023-03-13 23:21:55,668] [22344] [MainThread] [INFO] (main:37) - Initializing App from: C:\Users\keyur\apps\radiology; studies: C:\Users\keyur\datasets\training; conf: {'models': 'segmentation'} [2023-03-13 23:21:59,895] [22344] 
[MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for MONAILabelApp Found: <class 'main.MyApp'> [2023-03-13 23:21:59,903] [22344] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.deepedit.DeepEdit'> [2023-03-13 23:21:59,903] [22344] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.deepgrow_2d.Deepgrow2D'> [2023-03-13 23:21:59,904] [22344] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.deepgrow_3d.Deepgrow3D'> [2023-03-13 23:21:59,905] [22344] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.localization_spine.LocalizationSpine'> [2023-03-13 23:21:59,905] [22344] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.localization_vertebra.LocalizationVertebra'> [2023-03-13 23:21:59,906] [22344] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.segmentation.Segmentation'> [2023-03-13 23:21:59,907] [22344] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.segmentation_spleen.SegmentationSpleen'> [2023-03-13 23:21:59,907] [22344] [MainThread] [INFO] (monailabel.utils.others.class_utils:57) - Subclass for TaskConfig Found: <class 'lib.configs.segmentation_vertebra.SegmentationVertebra'> [2023-03-13 23:21:59,907] [22344] [MainThread] [INFO] (main:93) - +++ Adding Model: segmentation => lib.configs.segmentation.Segmentation [2023-03-13 23:21:59,974] [22344] [MainThread] [INFO] (main:96) - +++ Using Models: ['segmentation'] [2023-03-13 23:21:59,975] [22344] [MainThread] [INFO] (monailabel.interfaces.app:134) - Init Datastore for: C:\Users\keyur\datasets\training [2023-03-13 23:21:59,975] [22344] [MainThread] 
[INFO] (monailabel.datastore.local:130) - Auto Reload: False; Extensions: ['.nii.gz', '.nii', '.nrrd', '.jpg', '.png', '.tif', '.svs', '.xml'] [2023-03-13 23:21:59,986] [22344] [MainThread] [INFO] (monailabel.datastore.local:577) - Invalidate count: 0 [2023-03-13 23:21:59,986] [22344] [MainThread] [INFO] (main:126) - +++ Adding Inferer:: segmentation => <lib.infers.segmentation.Segmentation object at 0x0000013C3E77D9D0> [2023-03-13 23:21:59,986] [22344] [MainThread] [INFO] (main:191) - {'segmentation': <lib.infers.segmentation.Segmentation object at 0x0000013C3E77D9D0>, 'Histogram+GraphCut': <monailabel.scribbles.infer.HistogramBasedGraphCut object at 0x0000013C3FC5FF40>, 'GMM+GraphCut': <monailabel.scribbles.infer.GMMBasedGraphCut object at 0x0000013C3FC5FF10>} [2023-03-13 23:21:59,986] [22344] [MainThread] [INFO] (main:206) - +++ Adding Trainer:: segmentation => <lib.trainers.segmentation.Segmentation object at 0x0000013C3FC5FFA0> [2023-03-13 23:21:59,986] [22344] [MainThread] [INFO] (monailabel.utils.sessions:51) - Session Path: C:\Users\keyur.cache\monailabel\sessions [2023-03-13 23:21:59,986] [22344] [MainThread] [INFO] (monailabel.utils.sessions:52) - Session Expiry (max): 3600 [2023-03-13 23:21:59,986] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:432) - Train Request (input): {'model': 'segmentation', 'name': 'train_01', 'pretrained': True, 'device': 'cuda', 'max_epochs': 50, 'early_stop_patience': -1, 'val_split': 0.2, 'train_batch_size': 1, 'val_batch_size': 1, 'multi_gpu': True, 'gpus': 'all', 'dataset': 'SmartCacheDataset', 'dataloader': 'ThreadDataLoader', 'tracking': 'mlflow', 'tracking_uri': '', 'tracking_experiment_name': '', 'client_id': 'user-xyz', 'local_rank': 0} [2023-03-13 23:21:59,986] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:442) - CUDA_VISIBLE_DEVICES: None [2023-03-13 23:21:59,988] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:447) - Distributed/Multi GPU is limited [2023-03-13 
23:21:59,988] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:462) - Distributed Training = FALSE [2023-03-13 23:21:59,988] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:489) - 0 - Train Request (final): {'name': 'train_01', 'pretrained': True, 'device': 'cuda', 'max_epochs': 50, 'early_stop_patience': -1, 'val_split': 0.2, 'train_batch_size': 1, 'val_batch_size': 1, 'multi_gpu': False, 'gpus': 'all', 'dataset': 'SmartCacheDataset', 'dataloader': 'ThreadDataLoader', 'tracking': 'mlflow', 'tracking_uri': '', 'tracking_experiment_name': '', 'model': 'segmentation', 'client_id': 'user-xyz', 'local_rank': 0, 'run_id': '20230313_232159'} [2023-03-13 23:21:59,988] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:622) - 0 - Using Device: cpu; IDX: None [2023-03-13 23:21:59,988] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:515) - Run/Output Path: C:\Users\keyur\apps\radiology\model\segmentation\train_01 [2023-03-13 23:21:59,988] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:531) - Tracking: mlflow [2023-03-13 23:21:59,988] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:532) - Tracking URI: file:///C:/Users/keyur/apps/radiology/model/segmentation/train_01/mlruns; [2023-03-13 23:21:59,988] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:533) - Tracking Experiment Name: segmentation; Run Name: run_20230313_232159 [2023-03-13 23:21:59,989] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:410) - Total Records for Training: 7 [2023-03-13 23:21:59,989] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:411) - Total Records for Validation: 2 Loading dataset: 0%| | 0/2 [00:00<?, ?it/s] Loading dataset: 50%|##### | 1/2 [00:04<00:04, 4.95s/it] Loading dataset: 100%|##########| 2/2 [00:08<00:00, 4.10s/it] Loading dataset: 100%|##########| 2/2 [00:08<00:00, 4.23s/it] cache_num is greater or equal than dataset length, fall back to regular 
monai.data.CacheDataset. [2023-03-13 23:22:08,462] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:328) - 0 - Records for Validation: 2 [2023-03-13 23:22:08,468] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:318) - 0 - Adding Validation to run every '1' interval [2023-03-13 23:22:08,471] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:710) - 0 - Load Path C:\Users\keyur\apps\radiology\model\pretrained_segmentation.pt Loading dataset: 0%| | 0/7 [00:00<?, ?it/s] Loading dataset: 14%|#4 | 1/7 [00:07<00:47, 7.93s/it] Loading dataset: 29%|##8 | 2/7 [00:13<00:33, 6.69s/it] Loading dataset: 43%|####2 | 3/7 [00:19<00:25, 6.47s/it] Loading dataset: 57%|#####7 | 4/7 [00:23<00:15, 5.14s/it] Loading dataset: 71%|#######1 | 5/7 [00:28<00:10, 5.27s/it] Loading dataset: 86%|########5 | 6/7 [00:35<00:05, 5.82s/it] Loading dataset: 100%|##########| 7/7 [00:42<00:00, 6.14s/it] Loading dataset: 100%|##########| 7/7 [00:42<00:00, 6.04s/it] [2023-03-13 23:22:50,735] [22344] [MainThread] [INFO] (monailabel.tasks.train.basic_train:264) - 0 - Records for Training: 7 torch.cuda.amp.GradScaler is enabled, but CUDA is not available. Disabling. [2023-03-13 23:22:50,750] [22344] [MainThread] [INFO] (ignite.engine.engine.SupervisedTrainer:876) - Engine run resuming from iteration 0, epoch 0 until 50 epochs [2023-03-13 23:22:50,766] [22344] [MainThread] [ERROR] (ignite.engine.engine.SupervisedTrainer:992) - Engine run is terminating due to exception: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU. [2023-03-13 23:22:50,766] [22344] [MainThread] [ERROR] (ignite.engine.engine.SupervisedTrainer:180) - Exception: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. 
If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU. Traceback (most recent call last): File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\ignite\engine\engine.py", line 946, in _internal_run_as_gen self._fire_event(Events.STARTED) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\ignite\engine\engine.py", line 425, in _fire_event func(first, (event_args + others), kwargs) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\monai\handlers\checkpoint_loader.py", line 107, in call checkpoint = torch.load(self.load_path, map_location=self.map_location) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\torch\serialization.py", line 789, in load return _load(opened_zipfile, map_location, pickle_module, pickle_load_args) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\torch\serialization.py", line 1131, in _load result = unpickler.load() File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\torch\serialization.py", line 1101, in persistent_load load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location)) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\torch\serialization.py", line 1083, in load_tensor wrap_storage=restore_location(storage, location), File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\torch\serialization.py", line 215, in default_restore_location result = fn(storage, location) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\torch\serialization.py", line 182, in _cuda_deserialize device = validate_cuda_device(location) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\torch\serialization.py", line 166, in validate_cuda_device raise RuntimeError('Attempting to deserialize object on a CUDA ' RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. 
If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU. Traceback (most recent call last): File "C:\Users\keyur.conda\envs\monailabel-env\lib\runpy.py", line 197, in _run_module_as_main return _run_code(code, main_globals, None, File "C:\Users\keyur.conda\envs\monailabel-env\lib\runpy.py", line 87, in _run_code exec(code, run_globals) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\monailabel\interfaces\utils\app.py", line 128, in run_main() File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\monailabel\interfaces\utils\app.py", line 113, in run_main result = a.train(request) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\monailabel\interfaces\app.py", line 422, in train result = task(request, self.datastore()) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\monailabel\tasks\train\basic_train.py", line 463, in call res = self.train(0, world_size, req, datalist) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\monailabel\tasks\train\basic_train.py", line 552, in train context.trainer.run() File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\monai\engines\trainer.py", line 53, in run super().run() File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\monai\engines\workflow.py", line 281, in run super().run(data=self.data_loader, max_epochs=self.state.max_epochs) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\ignite\engine\engine.py", line 892, in run return self._internal_run() File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\ignite\engine\engine.py", line 935, in _internal_run return next(self._internal_run_generator) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\ignite\engine\engine.py", line 993, in _internal_run_as_gen self._handle_exception(e) File 
"C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\ignite\engine\engine.py", line 636, in _handle_exception self._fire_event(Events.EXCEPTION_RAISED, e) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\ignite\engine\engine.py", line 425, in _fire_event func(first, (event_args + others), kwargs) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\monai\handlers\stats_handler.py", line 181, in exception_raised raise e File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\ignite\engine\engine.py", line 946, in _internal_run_as_gen self._fire_event(Events.STARTED) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\ignite\engine\engine.py", line 425, in _fire_event func(first, (event_args + others), kwargs) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\monai\handlers\checkpoint_loader.py", line 107, in call checkpoint = torch.load(self.load_path, map_location=self.map_location) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\torch\serialization.py", line 789, in load return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\torch\serialization.py", line 1131, in _load result = unpickler.load() File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\torch\serialization.py", line 1101, in persistent_load load_tensor(dtype, nbytes, key, _maybe_decode_ascii(location)) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\torch\serialization.py", line 1083, in load_tensor wrap_storage=restore_location(storage, location), File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\torch\serialization.py", line 215, in default_restore_location result = fn(storage, location) File "C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\torch\serialization.py", line 182, in _cuda_deserialize device = validate_cuda_device(location) File 
"C:\Users\keyur.conda\envs\monailabel-env\lib\site-packages\torch\serialization.py", line 166, in validate_cuda_device raise RuntimeError('Attempting to deserialize object on a CUDA ' RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU. [2023-03-13 23:22:51,452] [3404] [ThreadPoolExecutor-2_0] [INFO] (monailabel.utils.async_tasks.utils:83) - Return code: 1

SachidanandAlle commented 1 year ago

This looks like an issue with your Python or torch version. Independently of MONAI Label, try importing torch and check whether CUDA is enabled.
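That check can be run as a short script in the same environment that starts the MONAI Label server. The following is only a minimal sketch: `cuda_report` is a hypothetical helper name, not part of MONAI Label or PyTorch, and it relies only on `torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`, and `torch.cuda.get_device_name()`.

```python
import importlib.util


def cuda_report():
    """Return a small dict describing whether PyTorch can see a CUDA device.

    Hypothetical helper for illustration; not part of MONAI Label.
    """
    report = {
        "torch_installed": False,
        "torch_version": None,
        "built_with_cuda": None,
        "cuda_available": False,
        "device_name": None,
    }
    # Avoid an ImportError if torch is missing from this environment.
    if importlib.util.find_spec("torch") is None:
        return report

    import torch

    report["torch_installed"] = True
    report["torch_version"] = torch.__version__
    # torch.version.cuda is None for CPU-only builds of PyTorch.
    report["built_with_cuda"] = torch.version.cuda
    report["cuda_available"] = bool(torch.cuda.is_available())
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` is `None`, the installed torch is most likely a CPU-only build, which would be consistent with the "Using Device: cpu" line and the `torch.cuda.is_available() is False` error in the server logs above.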

keyurradia commented 1 year ago

Thank you for your response, @SachidanandAlle. It is an issue with torch, as you described. I installed the latest version of torch in the conda environment, which showed torch==2.0.0, torchaudio==2.0.1, and torchvision==0.15.1. When I install MONAI Label, it uninstalls torch 2.0 and torchvision 0.15.1 but leaves torchaudio 2.0.1 in place, and finishes with this error: "ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. torchaudio 2.0.1 requires torch==2.0.0, but you have torch 1.13.1 which is incompatible."

Can someone please tell me which torch and CUDA versions are compatible with MONAI Label?
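A mismatch like the one in the pip error can be spotted before starting the server by comparing the installed versions of the three packages. The sketch below is an illustration only: `KNOWN_SETS` lists just the version combinations that appear together in this thread and is not an official compatibility matrix, and the helper names are hypothetical.

```python
from importlib import metadata

# Version combinations that appear together in this thread.
# Assumption: this is NOT an official PyTorch or MONAI compatibility matrix.
KNOWN_SETS = [
    {"torch": "2.0.0", "torchvision": "0.15.1", "torchaudio": "2.0.1"},
    {"torch": "1.13.1", "torchvision": "0.14.1", "torchaudio": "0.13.1"},
]


def base_version(version):
    """Drop a local build tag such as '+cu117' from a version string."""
    return version.split("+", 1)[0]


def installed_versions(packages=("torch", "torchvision", "torchaudio")):
    """Map each package name to its installed base version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = base_version(metadata.version(name))
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def matches_known_set(found):
    """Return True if all installed packages agree with one entry of KNOWN_SETS.

    Packages that are not installed (None) are ignored, so an environment
    with none of the packages installed also returns True.
    """
    present = {name: ver for name, ver in found.items() if ver is not None}
    return any(
        all(known.get(name) == ver for name, ver in present.items())
        for known in KNOWN_SETS
    )


if __name__ == "__main__":
    versions = installed_versions()
    print(versions)
    print("consistent with a known set:", matches_known_set(versions))
```

For the state described in the error (torch 1.13.1 alongside torchaudio 2.0.1), `matches_known_set` returns `False`, which signals that one of the packages needs to be reinstalled to match the others.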

SachidanandAlle commented 1 year ago

I installed torch from pip and did not run into any such dependency issue. MONAI Label itself does not restrict you to a specific torch version; however, MONAI is tested and verified only against certain torch versions:

https://github.com/Project-MONAI/MONAI/blob/dev/requirements.txt#L1

You could try a new venv and start from the beginning. This mostly looks like an environment issue on your end.

keyurradia commented 1 year ago

Thank you for your response,

I tried a new venv, but it showed the same error. I then followed the instructions in this YouTube video, https://www.youtube.com/watch?v=8y1OBQs2wis&t=141s, and it worked in the end. I also noticed that the MONAI Label installation uninstalled torch 2.0 and torchvision 0.15, and ended with this error:

"ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. torchaudio 2.0.1+cu117 requires torch==2.0.0, but you have torch 1.13.1 which is incompatible."

I then ran the PyTorch install command again; it uninstalled torchaudio 2.0.1 and installed 0.13.1, which is compatible with torch 1.13.1:

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
Looking in indexes: https://download.pytorch.org/whl/cu117
Requirement already satisfied: torch in c:\users\keyur.conda\envs\monai\lib\site-packages (1.13.1)
Requirement already satisfied: torchvision in c:\users\keyur.conda\envs\monai\lib\site-packages (0.14.1)
Requirement already satisfied: torchaudio in c:\users\keyur.conda\envs\monai\lib\site-packages (2.0.1+cu117)
Requirement already satisfied: typing-extensions in c:\users\keyur.conda\envs\monai\lib\site-packages (from torch) (4.4.0)
Requirement already satisfied: numpy in c:\users\keyur.conda\envs\monai\lib\site-packages (from torchvision) (1.23.5)
Requirement already satisfied: requests in c:\users\keyur.conda\envs\monai\lib\site-packages (from torchvision) (2.28.1)
Requirement already satisfied: pillow!=8.3.*,>=5.3.0 in c:\users\keyur.conda\envs\monai\lib\site-packages (from torchvision) (9.3.0)
Collecting torchaudio
Downloading https://download.pytorch.org/whl/cu117/torchaudio-2.0.0%2Bcu117-cp39-cp39-win_amd64.whl (2.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.5/2.5 MB 11.2 MB/s eta 0:00:00
Downloading https://download.pytorch.org/whl/cu117/torchaudio-0.13.1%2Bcu117-cp39-cp39-win_amd64.whl (2.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.3/2.3 MB 14.6 MB/s eta 0:00:00
Requirement already satisfied: certifi>=2017.4.17 in c:\users\keyur.conda\envs\monai\lib\site-packages (from requests->torchvision) (2022.12.7)
Requirement already satisfied: idna<4,>=2.5 in c:\users\keyur.conda\envs\monai\lib\site-packages (from requests->torchvision) (3.4)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in c:\users\keyur.conda\envs\monai\lib\site-packages (from requests->torchvision) (1.26.13)
Requirement already satisfied: charset-normalizer<3,>=2 in c:\users\keyur.conda\envs\monai\lib\site-packages (from requests->torchvision) (2.1.1)
Installing collected packages: torchaudio
Attempting uninstall: torchaudio
Found existing installation: torchaudio 2.0.1+cu117
Uninstalling torchaudio-2.0.1+cu117:
Successfully uninstalled torchaudio-2.0.1+cu117
Successfully installed torchaudio-0.13.1+cu117
(monai) PS C:\Users\keyur\monailabel> .monailabel/scripts/monailabel apps

Keyur Radiya
