Closed: IcyFeather233 closed this issue 5 months ago
It looks like a CUDA library version incompatibility issue.
1. The mmcv version must satisfy mmcv>=2.0.0rc4, <2.1.0; otherwise an error like this is raised:
File "/home/icyfeather/project/ianvs/./examples/robot/lifelong_learning_bench/semantic-segmentation/testalgorithms/rfnet/basemodel-simple.py", line 14, in <module>
from RFNet.eval import Validator, load_my_state_dict
File "/home/icyfeather/project/ianvs/./examples/robot/lifelong_learning_bench/semantic-segmentation/testalgorithms/rfnet/RFNet/eval.py", line 13, in <module>
from mmdet.visualization.image import imshow_det_bboxes
File "/home/icyfeather/project/mmdetection/mmdet/__init__.py", line 16, in <module>
assert (mmcv_version >= digit_version(mmcv_minimum_version)
AssertionError: MMCV==2.2.0 is used but incompatible. Please install mmcv>=2.0.0rc4, <2.1.0.
2. mmcv depends heavily on the installed PyTorch and CUDA versions. Install mmcv by following: https://mmcv.readthedocs.io/zh-cn/latest/get_started/installation.html#install-with-pip
For example, with CUDA 11.8 and torch 2.1.x, no mmcv build is available that satisfies the requirement mmcv<2.1.0.
In the doc, the installation step is
python -m pip install https://download.openmmlab.com/mmcv/dist/cu118/torch2.0.0/mmcv-2.0.0-cp39-cp39-manylinux1_x86_64.whl
which is too simple and may mislead anyone who does not use CUDA 11.8, torch 2.0.0 and Python 3.9.
Conclusion: find a CUDA/torch pair in https://mmcv.readthedocs.io/zh-cn/latest/get_started/installation.html#install-with-pip that supports installing mmcv>=2.0.0rc4, <2.1.0, and change your current torch or CUDA version to match.
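As a rough aid for picking the right wheel index, the sketch below builds the index URL from a torch version and a CUDA version. The helper name `mmcv_index_url` is hypothetical, and the URL pattern is an assumption inferred from the pip command that appears later in this thread (`.../cu118/torch2.0/index.html`). An index page existing for a pair does not guarantee that it offers a wheel in the mmcv>=2.0.0rc4, <2.1.0 range, so check the page before installing.

```python
import re

# Assumed pattern, inferred from the cu118/torch2.0 index URL used in this thread.
MMCV_INDEX_TEMPLATE = "https://download.openmmlab.com/mmcv/dist/{cuda}/{torch}/index.html"


def mmcv_index_url(torch_version: str, cuda_version: str) -> str:
    """Build the (assumed) mmcv wheel index URL for a torch/CUDA pair."""
    # "2.0.1+cu118" -> "torch2.0": keep only major.minor, drop any build tag.
    match = re.match(r"(\d+)\.(\d+)", torch_version)
    if match is None:
        raise ValueError(f"unrecognized torch version: {torch_version!r}")
    torch_tag = f"torch{match.group(1)}.{match.group(2)}"
    # "11.8" -> "cu118"
    cuda_tag = "cu" + cuda_version.replace(".", "")
    return MMCV_INDEX_TEMPLATE.format(cuda=cuda_tag, torch=torch_tag)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed")
    else:
        print("torch:", torch.__version__, "CUDA:", torch.version.cuda)
        # torch.version.cuda is None for CPU-only builds.
        if torch.version.cuda is not None:
            print(mmcv_index_url(torch.__version__, torch.version.cuda))
```

The resulting URL can be passed to pip with `-f`, together with an explicit version pin such as `mmcv==2.0.1`.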
Good job! (P.S. mmcv is installed only for visualizing the semantic segmentation results. If visualization is not required, you can comment out all the mmcv-related code.)
My env info (with it, ianvs -f examples/robot/lifelong_learning_bench/semantic-segmentation/benchmarkingjob-simple.yaml runs with no requirements problems):
Python 3.9
CUDA 11.8
pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
pip install mmcv==2.0.1 -f https://download.openmmlab.com/mmcv/dist/cu118/torch2.0/index.html
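To confirm that an environment actually matches the versions listed above, a minimal sketch using the standard library's `importlib.metadata` is shown here. The helper name `installed_versions` is hypothetical, and the package list is just the set of distributions mentioned in this thread.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distributions mentioned in this thread.
    names = ["torch", "torchvision", "torchaudio", "mmcv", "sedna"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

Running this inside the activated environment reports what pip resolved, which makes it easier to spot an mmcv build outside the required range.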
However, I ran into another strange problem.
If I run
import torch

if torch.cuda.is_available():
    print("CUDA is available! You can use GPU acceleration.")
    # Get the number of available GPUs
    num_gpus = torch.cuda.device_count()
    print(f"Number of GPUs available: {num_gpus}")
else:
    print("CUDA is not available.")
in the project root dir ~/project/ianvs/, it shows:
(ianvs) icyfeather@gpu:~/project/ianvs$ python test_cuda.py
CUDA is available! You can use GPU acceleration.
Number of GPUs available: 1
If I add the same check to ~/project/ianvs/examples/robot/lifelong_learning_bench/semantic-segmentation/testalgorithms/rfnet/RFNet/eval.py:
class Validator(object):
    def __init__(self, args, data=None, unseen_detection=False):
        self.args = args
        self.time_train = []
        self.num_class = args.num_class
        # Define Dataloader
        kwargs = {'num_workers': args.workers, 'pin_memory': False}
        # _, self.val_loader, _, self.custom_loader, self.num_class = make_data_loader(args, **kwargs)
        _, _, self.test_loader, _ = make_data_loader(args, test_data=data, **kwargs)
        print('un_classes:'+str(self.num_class))
        # Define evaluator
        self.evaluator = Evaluator(self.num_class)
        if torch.cuda.is_available():
            print("CUDA is available! You can use GPU acceleration.")
            # Get the number of available GPUs
            num_gpus = torch.cuda.device_count()
            print(f"Number of GPUs available: {num_gpus}")
        else:
            print("CUDA is not available.")
and then run ianvs -f examples/robot/lifelong_learning_bench/semantic-segmentation/benchmarkingjob-simple.yaml, it shows:
(ianvs) icyfeather@gpu:~/project/ianvs$ ianvs -f examples/robot/lifelong_learning_bench/semantic-segmentation/benchmarkingjob-simple.yaml
un_classes:30
CUDA is not available.
Upsample layer: in = 128, skip = 64, out = 128
Upsample layer: in = 128, skip = 128, out = 128
Upsample layer: in = 128, skip = 256, out = 128
128
Traceback (most recent call last):
File "/home/icyfeather/project/ianvs/core/testcasecontroller/algorithm/module/module.py", line 114, in get_module_instance
func = ClassFactory.get_cls(
File "/home/icyfeather/project/ianvs/./examples/robot/lifelong_learning_bench/semantic-segmentation/testalgorithms/rfnet/basemodel-simple.py", line 36, in __init__
self.validator = Validator(self.val_args)
File "/home/icyfeather/project/ianvs/./examples/robot/lifelong_learning_bench/semantic-segmentation/testalgorithms/rfnet/RFNet/eval.py", line 65, in __init__
self.model = self.model.cuda(args.gpu_ids)
File "/home/icyfeather/miniconda3/envs/ianvs/lib/python3.9/site-packages/torch/nn/modules/module.py", line 905, in cuda
return self._apply(lambda t: t.cuda(device))
File "/home/icyfeather/miniconda3/envs/ianvs/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/icyfeather/miniconda3/envs/ianvs/lib/python3.9/site-packages/torch/nn/modules/module.py", line 797, in _apply
module._apply(fn)
File "/home/icyfeather/miniconda3/envs/ianvs/lib/python3.9/site-packages/torch/nn/modules/module.py", line 820, in _apply
param_applied = fn(param)
File "/home/icyfeather/miniconda3/envs/ianvs/lib/python3.9/site-packages/torch/nn/modules/module.py", line 905, in <lambda>
return self._apply(lambda t: t.cuda(device))
File "/home/icyfeather/miniconda3/envs/ianvs/lib/python3.9/site-packages/torch/cuda/__init__.py", line 247, in _lazy_init
torch._C._cuda_init()
RuntimeError: No CUDA GPUs are available
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/icyfeather/project/ianvs/core/testcasecontroller/testcase/testcase.py", line 72, in run
paradigm = self.algorithm.paradigm(workspace=self.output_dir,
File "/home/icyfeather/project/ianvs/core/testcasecontroller/algorithm/algorithm.py", line 105, in paradigm
return LifelongLearning(workspace, **config)
File "/home/icyfeather/project/ianvs/core/testcasecontroller/algorithm/paradigm/lifelong_learning/lifelong_learning.py", line 58, in __init__
ParadigmBase.__init__(self, workspace, **kwargs)
File "/home/icyfeather/project/ianvs/core/testcasecontroller/algorithm/paradigm/base.py", line 55, in __init__
self.module_instances = self._get_module_instances()
File "/home/icyfeather/project/ianvs/core/testcasecontroller/algorithm/paradigm/base.py", line 75, in _get_module_instances
func = module.get_module_instance(module_type)
File "/home/icyfeather/project/ianvs/core/testcasecontroller/algorithm/module/module.py", line 119, in get_module_instance
raise RuntimeError(f"module(type={module_type} loads class(name={self.name}) "
RuntimeError: module(type=basemodel loads class(name=BaseModel) failed, error: No CUDA GPUs are available.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/icyfeather/project/ianvs/core/testcasecontroller/testcasecontroller.py", line 54, in run_testcases
res, time = (testcase.run(workspace), utils.get_local_time())
File "/home/icyfeather/project/ianvs/core/testcasecontroller/testcase/testcase.py", line 79, in run
raise RuntimeError(
RuntimeError: (paradigm=lifelonglearning) pipeline runs failed, error: module(type=basemodel loads class(name=BaseModel) failed, error: No CUDA GPUs are available.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/icyfeather/project/ianvs/core/cmd/benchmarking.py", line 37, in main
job.run()
File "/home/icyfeather/project/ianvs/core/cmd/obj/benchmarkingjob.py", line 93, in run
succeed_testcases, test_results = self.testcase_controller.run_testcases(self.workspace)
File "/home/icyfeather/project/ianvs/core/testcasecontroller/testcasecontroller.py", line 56, in run_testcases
raise RuntimeError(f"testcase(id={testcase.id}) runs failed, error: {err}") from err
RuntimeError: testcase(id=7438c7a6-20f1-11ef-a88f-e7cf327eae9a) runs failed, error: (paradigm=lifelonglearning) pipeline runs failed, error: module(type=basemodel loads class(name=BaseModel) failed, error: No CUDA GPUs are available.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/icyfeather/miniconda3/envs/ianvs/bin/ianvs", line 33, in <module>
sys.exit(load_entry_point('ianvs==0.1.0', 'console_scripts', 'ianvs')())
File "/home/icyfeather/project/ianvs/core/cmd/benchmarking.py", line 41, in main
raise RuntimeError(f"benchmarkingjob runs failed, error: {err}.") from err
RuntimeError: benchmarkingjob runs failed, error: testcase(id=7438c7a6-20f1-11ef-a88f-e7cf327eae9a) runs failed, error: (paradigm=lifelonglearning) pipeline runs failed, error: module(type=basemodel loads class(name=BaseModel) failed, error: No CUDA GPUs are available..
So CUDA IS available, but when I run ianvs -f xxx, it disappears. I wonder why.
Maybe it's caused by the "os.environ['CUDA_VISIBLE_DEVICES'] = '1'".
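A likely explanation, consistent with the output above: the machine reports one GPU, so its only device index is 0, and CUDA_VISIBLE_DEVICES = '1' points at a device that does not exist. The sketch below is a simplified model of how the variable filters devices, not CUDA's actual implementation. The function name `visible_devices` is hypothetical, and only integer indices are handled (CUDA also accepts device UUIDs).

```python
def visible_devices(env_value, physical_count):
    """Return the physical GPU indices left visible by CUDA_VISIBLE_DEVICES.

    Simplified model: integer indices only, no UUID support.
    """
    if env_value is None:
        # Variable unset: every physical device is visible.
        return list(range(physical_count))
    visible = []
    for token in env_value.split(","):
        token = token.strip()
        # An invalid entry hides itself and every entry listed after it.
        if not token.isdigit() or int(token) >= physical_count:
            break
        visible.append(int(token))
    return visible


if __name__ == "__main__":
    print(visible_devices("1", 1))   # single-GPU machine, variable set to '1'
    print(visible_devices(None, 1))  # same machine, variable unset
```

Under this model, a single-GPU machine with the variable set to '1' ends up with an empty device list, which matches the "No CUDA GPUs are available" error, while leaving it unset (or setting it to '0') keeps the GPU visible.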
Thanks so much! After deleting os.environ['CUDA_VISIBLE_DEVICES'] = '1', it works.
Btw, I don't know why os.environ['CUDA_VISIBLE_DEVICES'] = '1' is there. Is it necessary for some reason?
Another problem: when I run ianvs -f examples/robot/lifelong_learning_bench/semantic-segmentation/benchmarkingjob-simple.yaml:
...(many lines)
CPA:0.07153843458791581, mIoU:0.005019460125166828, fwIoU: 0.03151069938676472
Found 50 test RGB images
Found 50 test disparity images
: 0%| | 0/50 [00:00<?, ?it/s](1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
: 8%|████████▉ | 4/50 [00:00<00:01, 39.04it/s](1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
: 16%|█████████████████▉ | 8/50 [00:00<00:01, 39.21it/s](1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
: 24%|██████████████████████████▋ | 12/50 [00:00<00:00, 39.34it/s](1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
: 32%|███████████████████████████████████▌ | 16/50 [00:00<00:00, 39.31it/s](1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
: 40%|████████████████████████████████████████████▍ | 20/50 [00:00<00:00, 39.38it/s](1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
: 48%|█████████████████████████████████████████████████████▎ | 24/50 [00:00<00:00, 38.93it/s](1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
: 56%|██████████████████████████████████████████████████████████████▏ | 28/50 [00:00<00:00, 38.98it/s](1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
: 64%|███████████████████████████████████████████████████████████████████████ | 32/50 [00:00<00:00, 38.99it/s](1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
: 72%|███████████████████████████████████████████████████████████████████████████████▉ | 36/50 [00:00<00:00, 39.05it/s](1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
: 80%|████████████████████████████████████████████████████████████████████████████████████████▊ | 40/50 [00:01<00:00, 39.16it/s](1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
: 88%|█████████████████████████████████████████████████████████████████████████████████████████████████▋ | 44/50 [00:01<00:00, 39.20it/s](1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
: 96%|██████████████████████████████████████████████████████████████████████████████████████████████████████████▌ | 48/50 [00:01<00:00, 39.18it/s](1, 480, 640) (1, 480, 640)
(1, 480, 640) (1, 480, 640)
: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:01<00:00, 39.13it/s]
-----------Acc of each classes-----------
road : 81.655560 %
sidewalk : 0.000000 %
building : 5.043014 %
wall : 0.000000 %
fence : 0.000000 %
pole : 0.000000 %
traffic light: nan %
traffic sign : nan %
vegetation : 0.000000 %
terrain : 0.000000 %
sky : 0.000000 %
person : 0.000000 %
rider : nan %
car : 0.000000 %
truck : nan %
bus : nan %
train : nan %
motorcycle : nan %
bicycle : nan %
stair : 0.003421 %
curb : 0.000000 %
ramp : nan %
runway : nan %
flowerbed : 0.000000 %
door : 0.000000 %
CCTV camera : 0.000000 %
Manhole : nan %
hydrant : nan %
belt : nan %
dustbin : nan %
-----------IoU of each classes-----------
road : 20.728859 %
sidewalk : 0.000000 %
building : 4.824876 %
wall : 0.000000 %
fence : 0.000000 %
pole : 0.000000 %
traffic light: 0.000000 %
traffic sign : nan %
vegetation : 0.000000 %
terrain : 0.000000 %
sky : 0.000000 %
person : 0.000000 %
rider : 0.000000 %
car : 0.000000 %
truck : 0.000000 %
bus : 0.000000 %
train : 0.000000 %
motorcycle : 0.000000 %
bicycle : 0.000000 %
stair : 0.003399 %
curb : 0.000000 %
ramp : 0.000000 %
runway : 0.000000 %
flowerbed : 0.000000 %
door : 0.000000 %
CCTV camera : 0.000000 %
Manhole : 0.000000 %
hydrant : 0.000000 %
belt : 0.000000 %
dustbin : 0.000000 %
-----------FWIoU of each classes-----------
road : 4.382773 %
sidewalk : 0.000000 %
-----------freq of each classes-----------
road : 21.143340 %
sidewalk : 16.498009 %
building : 34.396249 %
wall : 0.519759 %
fence : 0.032960 %
pole : 0.924427 %
traffic light: 0.000000 %
traffic sign : 0.000000 %
vegetation : 15.705082 %
terrain : 0.970381 %
sky : 4.972848 %
person : 0.000989 %
rider : 0.000000 %
car : 1.175906 %
truck : 0.000000 %
bus : 0.000000 %
train : 0.000000 %
motorcycle : 0.000000 %
bicycle : 0.000000 %
stair : 1.532379 %
curb : 1.699381 %
ramp : 0.000000 %
runway : 0.000000 %
flowerbed : 0.143163 %
door : 0.283993 %
CCTV camera : 0.001134 %
Manhole : 0.000000 %
hydrant : 0.000000 %
belt : 0.000000 %
dustbin : 0.000000 %
CPA:0.05418874686914789, mIoU:0.008812804591714955, fwIoU: 0.060424015451444865
[2024-06-03 16:24:27,384] task_evaluation.py(69) [INFO] - front_semantic_segamentation_model scores: {'accuracy': 0.016060831714296262}
[2024-06-03 16:24:27,384] task_evaluation.py(69) [INFO] - garden_semantic_segamentation_model scores: {'accuracy': 0.005019460125166828}
[2024-06-03 16:24:27,385] lifelong_learning.py(449) [INFO] - Task evaluation finishes.
[2024-06-03 16:24:27,386] lifelong_learning.py(452) [INFO] - upload kb index from index.pkl to /home/icyfeather/project/ianvs/workspace/lifelong_learning_bench/robot-workspace-test/benchmarkingjob/rfnet_lifelong_learning/803ef584-2182-11ef-a88f-e7cf327eae9a/output/eval/0/index.pkl
[2024-06-03 16:24:27,386] lifelong_learning.py(208) [INFO] - train from round 0
[2024-06-03 16:24:27,386] lifelong_learning.py(209) [INFO] - test round 5
[2024-06-03 16:24:27,386] lifelong_learning.py(210) [INFO] - all scores: {'accuracy': 0.008812804591714955}
[2024-06-03 16:24:27,386] lifelong_learning.py(220) [INFO] - front_semantic_segamentation_model scores: {'accuracy': 0.016060831714296262}
[2024-06-03 16:24:27,386] lifelong_learning.py(220) [INFO] - garden_semantic_segamentation_model scores: {'accuracy': 0.005019460125166828}
[2024-06-03 16:24:27,386] lifelong_learning.py(234) [INFO] - all scores: [[{'accuracy': 0.011080875212785671}, {'accuracy': 0.014276169299325306}, {'accuracy': 0.0100682449118572}, {'accuracy': 0.009086123410847542}, {'accuracy': 0.008812804591714955}]]
[2024-06-03 16:24:27,386] lifelong_learning.py(234) [INFO] - front_semantic_segamentation_model scores: [[{'accuracy': 0.011184414527967647}, {'accuracy': 0.017297397865009875}, {'accuracy': 0.015014366393296938}, {'accuracy': 0.016351156425681856}, {'accuracy': 0.016060831714296262}]]
[2024-06-03 16:24:27,386] lifelong_learning.py(234) [INFO] - garden_semantic_segamentation_model scores: [[{'accuracy': 0.013040300380858405}, {'accuracy': 0.014589005626878117}, {'accuracy': 0.005392195699569733}, {'accuracy': 0.0033863937183303975}, {'accuracy': 0.005019460125166828}]]
[2024-06-03 16:24:27,386] lifelong_learning.py(234) [INFO] - task_avg scores: [[{'accuracy': 0.012112357454413025}, {'accuracy': 0.015943201745943998}, {'accuracy': 0.010203281046433334}, {'accuracy': 0.009868775072006127}, {'accuracy': 0.010540145919731545}]]
load model url: /home/icyfeather/project/ianvs/workspace/lifelong_learning_bench/robot-workspace-test/benchmarkingjob/rfnet_lifelong_learning/803ef584-2182-11ef-a88f-e7cf327eae9a/output/train/0/seen_task/global.model
: 0%| | 0/1 [00:00<?, ?it/s][Save] save rfnet prediction: /home/icyfeather/project/ianvs/workspace/lifelong_learning_bench/robot-workspace-test/benchmarkingjob/rfnet_lifelong_learning/803ef584-2182-11ef-a88f-e7cf327eae9a/output/inference/results/1/front/00000.png_origin.png
: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 5.81it/s]
Traceback (most recent call last):
File "/home/icyfeather/project/ianvs/core/testcasecontroller/testcase/testcase.py", line 74, in run
res, system_metric_info = paradigm.run()
File "/home/icyfeather/project/ianvs/core/testcasecontroller/algorithm/paradigm/lifelong_learning/lifelong_learning.py", line 186, in run
inference_results, unseen_task_train_samples = self._inference(
File "/home/icyfeather/project/ianvs/core/testcasecontroller/algorithm/paradigm/lifelong_learning/lifelong_learning.py", line 334, in _inference
res, is_unseen_task, _ = job.inference_2(data, **kwargs)
File "/home/icyfeather/miniconda3/envs/ianvs/lib/python3.9/site-packages/sedna/core/lifelong_learning/lifelong_learning.py", line 597, in inference_2
seen_samples, unseen_samples = unseen_sample_recognition(
TypeError: 'NoneType' object is not callable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/icyfeather/project/ianvs/core/testcasecontroller/testcasecontroller.py", line 54, in run_testcases
res, time = (testcase.run(workspace), utils.get_local_time())
File "/home/icyfeather/project/ianvs/core/testcasecontroller/testcase/testcase.py", line 79, in run
raise RuntimeError(
RuntimeError: (paradigm=lifelonglearning) pipeline runs failed, error: 'NoneType' object is not callable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/icyfeather/project/ianvs/core/cmd/benchmarking.py", line 37, in main
job.run()
File "/home/icyfeather/project/ianvs/core/cmd/obj/benchmarkingjob.py", line 93, in run
succeed_testcases, test_results = self.testcase_controller.run_testcases(self.workspace)
File "/home/icyfeather/project/ianvs/core/testcasecontroller/testcasecontroller.py", line 56, in run_testcases
raise RuntimeError(f"testcase(id={testcase.id}) runs failed, error: {err}") from err
RuntimeError: testcase(id=803ef584-2182-11ef-a88f-e7cf327eae9a) runs failed, error: (paradigm=lifelonglearning) pipeline runs failed, error: 'NoneType' object is not callable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/icyfeather/miniconda3/envs/ianvs/bin/ianvs", line 33, in <module>
sys.exit(load_entry_point('ianvs==0.1.0', 'console_scripts', 'ianvs')())
File "/home/icyfeather/project/ianvs/core/cmd/benchmarking.py", line 41, in main
raise RuntimeError(f"benchmarkingjob runs failed, error: {err}.") from err
RuntimeError: benchmarkingjob runs failed, error: testcase(id=803ef584-2182-11ef-a88f-e7cf327eae9a) runs failed, error: (paradigm=lifelonglearning) pipeline runs failed, error: 'NoneType' object is not callable.
Other env info: sedna==0.4.1
Regarding os.environ['CUDA_VISIBLE_DEVICES'] = '1': there is no reason for it, I just forgot to delete it.
Another problem, when I run
ianvs -f examples/robot/lifelong_learning_bench/semantic-segmentation/benchmarkingjob-simple.yaml
:...(many lines) CPA:0.07153843458791581, mIoU:0.005019460125166828, fwIoU: 0.03151069938676472 Found 50 test RGB images Found 50 test disparity images : 0%| | 0/50 [00:00<?, ?it/s](1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) : 8%|████████▉ | 4/50 [00:00<00:01, 39.04it/s](1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) : 16%|█████████████████▉ | 8/50 [00:00<00:01, 39.21it/s](1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) : 24%|██████████████████████████▋ | 12/50 [00:00<00:00, 39.34it/s](1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) : 32%|███████████████████████████████████▌ | 16/50 [00:00<00:00, 39.31it/s](1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) : 40%|████████████████████████████████████████████▍ | 20/50 [00:00<00:00, 39.38it/s](1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) : 48%|█████████████████████████████████████████████████████▎ | 24/50 [00:00<00:00, 38.93it/s](1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) : 56%|██████████████████████████████████████████████████████████████▏ | 28/50 [00:00<00:00, 38.98it/s](1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) : 64%|███████████████████████████████████████████████████████████████████████ | 32/50 [00:00<00:00, 38.99it/s](1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) : 72%|███████████████████████████████████████████████████████████████████████████████▉ | 36/50 [00:00<00:00, 39.05it/s](1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 
480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) : 80%|████████████████████████████████████████████████████████████████████████████████████████▊ | 40/50 [00:01<00:00, 39.16it/s](1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) : 88%|█████████████████████████████████████████████████████████████████████████████████████████████████▋ | 44/50 [00:01<00:00, 39.20it/s](1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) : 96%|██████████████████████████████████████████████████████████████████████████████████████████████████████████▌ | 48/50 [00:01<00:00, 39.18it/s](1, 480, 640) (1, 480, 640) (1, 480, 640) (1, 480, 640) : 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████| 50/50 [00:01<00:00, 39.13it/s] -----------Acc of each classes----------- road : 81.655560 % sidewalk : 0.000000 % building : 5.043014 % wall : 0.000000 % fence : 0.000000 % pole : 0.000000 % traffic light: nan % traffic sign : nan % vegetation : 0.000000 % terrain : 0.000000 % sky : 0.000000 % person : 0.000000 % rider : nan % car : 0.000000 % truck : nan % bus : nan % train : nan % motorcycle : nan % bicycle : nan % stair : 0.003421 % curb : 0.000000 % ramp : nan % runway : nan % flowerbed : 0.000000 % door : 0.000000 % CCTV camera : 0.000000 % Manhole : nan % hydrant : nan % belt : nan % dustbin : nan % -----------IoU of each classes----------- road : 20.728859 % sidewalk : 0.000000 % building : 4.824876 % wall : 0.000000 % fence : 0.000000 % pole : 0.000000 % traffic light: 0.000000 % traffic sign : nan % vegetation : 0.000000 % terrain : 0.000000 % sky : 0.000000 % person : 0.000000 % rider : 0.000000 % car : 0.000000 % truck : 0.000000 % bus : 0.000000 % train : 0.000000 % motorcycle : 0.000000 % bicycle : 0.000000 % stair : 0.003399 % curb : 0.000000 % ramp : 0.000000 % runway : 0.000000 % flowerbed : 0.000000 % 
door : 0.000000 % CCTV camera : 0.000000 % Manhole : 0.000000 % hydrant : 0.000000 % belt : 0.000000 % dustbin : 0.000000 % -----------FWIoU of each classes----------- road : 4.382773 % sidewalk : 0.000000 % -----------freq of each classes----------- road : 21.143340 % sidewalk : 16.498009 % building : 34.396249 % wall : 0.519759 % fence : 0.032960 % pole : 0.924427 % traffic light: 0.000000 % traffic sign : 0.000000 % vegetation : 15.705082 % terrain : 0.970381 % sky : 4.972848 % person : 0.000989 % rider : 0.000000 % car : 1.175906 % truck : 0.000000 % bus : 0.000000 % train : 0.000000 % motorcycle : 0.000000 % bicycle : 0.000000 % stair : 1.532379 % curb : 1.699381 % ramp : 0.000000 % runway : 0.000000 % flowerbed : 0.143163 % door : 0.283993 % CCTV camera : 0.001134 % Manhole : 0.000000 % hydrant : 0.000000 % belt : 0.000000 % dustbin : 0.000000 % CPA:0.05418874686914789, mIoU:0.008812804591714955, fwIoU: 0.060424015451444865 [2024-06-03 16:24:27,384] task_evaluation.py(69) [INFO] - front_semantic_segamentation_model scores: {'accuracy': 0.016060831714296262} [2024-06-03 16:24:27,384] task_evaluation.py(69) [INFO] - garden_semantic_segamentation_model scores: {'accuracy': 0.005019460125166828} [2024-06-03 16:24:27,385] lifelong_learning.py(449) [INFO] - Task evaluation finishes. 
[2024-06-03 16:24:27,386] lifelong_learning.py(452) [INFO] - upload kb index from index.pkl to /home/icyfeather/project/ianvs/workspace/lifelong_learning_bench/robot-workspace-test/benchmarkingjob/rfnet_lifelong_learning/803ef584-2182-11ef-a88f-e7cf327eae9a/output/eval/0/index.pkl
[2024-06-03 16:24:27,386] lifelong_learning.py(208) [INFO] - train from round 0
[2024-06-03 16:24:27,386] lifelong_learning.py(209) [INFO] - test round 5
[2024-06-03 16:24:27,386] lifelong_learning.py(210) [INFO] - all scores: {'accuracy': 0.008812804591714955}
[2024-06-03 16:24:27,386] lifelong_learning.py(220) [INFO] - front_semantic_segamentation_model scores: {'accuracy': 0.016060831714296262}
[2024-06-03 16:24:27,386] lifelong_learning.py(220) [INFO] - garden_semantic_segamentation_model scores: {'accuracy': 0.005019460125166828}
[2024-06-03 16:24:27,386] lifelong_learning.py(234) [INFO] - all scores: [[{'accuracy': 0.011080875212785671}, {'accuracy': 0.014276169299325306}, {'accuracy': 0.0100682449118572}, {'accuracy': 0.009086123410847542}, {'accuracy': 0.008812804591714955}]]
[2024-06-03 16:24:27,386] lifelong_learning.py(234) [INFO] - front_semantic_segamentation_model scores: [[{'accuracy': 0.011184414527967647}, {'accuracy': 0.017297397865009875}, {'accuracy': 0.015014366393296938}, {'accuracy': 0.016351156425681856}, {'accuracy': 0.016060831714296262}]]
[2024-06-03 16:24:27,386] lifelong_learning.py(234) [INFO] - garden_semantic_segamentation_model scores: [[{'accuracy': 0.013040300380858405}, {'accuracy': 0.014589005626878117}, {'accuracy': 0.005392195699569733}, {'accuracy': 0.0033863937183303975}, {'accuracy': 0.005019460125166828}]]
[2024-06-03 16:24:27,386] lifelong_learning.py(234) [INFO] - task_avg scores: [[{'accuracy': 0.012112357454413025}, {'accuracy': 0.015943201745943998}, {'accuracy': 0.010203281046433334}, {'accuracy': 0.009868775072006127}, {'accuracy': 0.010540145919731545}]]
load model url: /home/icyfeather/project/ianvs/workspace/lifelong_learning_bench/robot-workspace-test/benchmarkingjob/rfnet_lifelong_learning/803ef584-2182-11ef-a88f-e7cf327eae9a/output/train/0/seen_task/global.model
: 0%| | 0/1 [00:00<?, ?it/s][Save] save rfnet prediction: /home/icyfeather/project/ianvs/workspace/lifelong_learning_bench/robot-workspace-test/benchmarkingjob/rfnet_lifelong_learning/803ef584-2182-11ef-a88f-e7cf327eae9a/output/inference/results/1/front/00000.png_origin.png
: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 5.81it/s]
Traceback (most recent call last):
File "/home/icyfeather/project/ianvs/core/testcasecontroller/testcase/testcase.py", line 74, in run
res, system_metric_info = paradigm.run()
File "/home/icyfeather/project/ianvs/core/testcasecontroller/algorithm/paradigm/lifelong_learning/lifelong_learning.py", line 186, in run
inference_results, unseen_task_train_samples = self._inference(
File "/home/icyfeather/project/ianvs/core/testcasecontroller/algorithm/paradigm/lifelong_learning/lifelong_learning.py", line 334, in _inference
res, is_unseen_task, _ = job.inference_2(data, **kwargs)
File "/home/icyfeather/miniconda3/envs/ianvs/lib/python3.9/site-packages/sedna/core/lifelong_learning/lifelong_learning.py", line 597, in inference_2
seen_samples, unseen_samples = unseen_sample_recognition(
TypeError: 'NoneType' object is not callable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/icyfeather/project/ianvs/core/testcasecontroller/testcasecontroller.py", line 54, in run_testcases
res, time = (testcase.run(workspace), utils.get_local_time())
File "/home/icyfeather/project/ianvs/core/testcasecontroller/testcase/testcase.py", line 79, in run
raise RuntimeError(
RuntimeError: (paradigm=lifelonglearning) pipeline runs failed, error: 'NoneType' object is not callable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/icyfeather/project/ianvs/core/cmd/benchmarking.py", line 37, in main
job.run()
File "/home/icyfeather/project/ianvs/core/cmd/obj/benchmarkingjob.py", line 93, in run
succeed_testcases, test_results = self.testcase_controller.run_testcases(self.workspace)
File "/home/icyfeather/project/ianvs/core/testcasecontroller/testcasecontroller.py", line 56, in run_testcases
raise RuntimeError(f"testcase(id={testcase.id}) runs failed, error: {err}") from err
RuntimeError: testcase(id=803ef584-2182-11ef-a88f-e7cf327eae9a) runs failed, error: (paradigm=lifelonglearning) pipeline runs failed, error: 'NoneType' object is not callable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/icyfeather/miniconda3/envs/ianvs/bin/ianvs", line 33, in <module>
sys.exit(load_entry_point('ianvs==0.1.0', 'console_scripts', 'ianvs')())
File "/home/icyfeather/project/ianvs/core/cmd/benchmarking.py", line 41, in main
raise RuntimeError(f"benchmarkingjob runs failed, error: {err}.") from err
RuntimeError: benchmarkingjob runs failed, error: testcase(id=803ef584-2182-11ef-a88f-e7cf327eae9a) runs failed, error: (paradigm=lifelonglearning) pipeline runs failed, error: 'NoneType' object is not callable.
Other env info: sedna==0.4.1
Change the mode to "no-inference".
rank.py contains
all_df.index = pd.np.arange(1, len(all_df) + 1)
at https://github.com/kubeedge/ianvs/blob/main/core/storymanager/rank/rank.py#L178 and https://github.com/kubeedge/ianvs/blob/main/core/storymanager/rank/rank.py#L208, and pd.np was removed in pandas 2.0.0 (it had been deprecated since pandas 1.0). So I ran
pip install pandas==1.5.3
and then reran
ianvs -f examples/robot/lifelong_learning_bench/semantic-segmentation/benchmarkingjob-simple.yaml
This time the training succeeded, but the visualization step failed:
garden_semantic_segamentation_model BWT_score: 0.05753041692524988
garden_semantic_segamentation_model FWT_score: 0.11415881365625927
compute function: key=task_avg, matrix=[[{'accuracy': 0.0031747766166595}, {'accuracy': 0.005443421620052942}, {'accuracy': 0.0027611203296307976}, {'accuracy': 0.005215614561737811}, {'accuracy': 0.00413587684881005}], [{'accuracy': 0.057318073389181615}, {'accuracy': 0.06558768795293932}, {'accuracy': 0.04695469448853577}, {'accuracy': 0.08797596039038008}, {'accuracy': 0.08631227347076771}], [{'accuracy': 0.11823181064491628}, {'accuracy': 0.16367581346348137}, {'accuracy': 0.12000465227436473}, {'accuracy': 0.11352067253051591}, {'accuracy': 0.11416052765142247}], [{'accuracy': 0.1414002714021555}, {'accuracy': 0.22293507242494295}, {'accuracy': 0.1827968071969279}, {'accuracy': 0.14674554475958146}, {'accuracy': 0.1464861637270747}], [{'accuracy': 0.146328014414279}, {'accuracy': 0.2257937760084689}, {'accuracy': 0.22123291275411555}, {'accuracy': 0.26335538724421365}, {'accuracy': 0.24032252447865676}], [{'accuracy': 0.18097321138867067}, {'accuracy': 0.23763178846845756}, {'accuracy': 0.22446577773244217}, {'accuracy': 0.2991808964700411}, {'accuracy': 0.27705676675558427}]], type(matrix)=<class 'list'>
task_avg BWT_score: 0.04794310523353218
task_avg FWT_score: 0.11102087522106215
/home/icyfeather/project/ianvs/core/storymanager/rank/rank.py:171: FutureWarning: The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead.
all_df = all_df.append(old_df)
/home/icyfeather/project/ianvs/core/storymanager/rank/rank.py:179: FutureWarning: The pandas.np module is deprecated and will be removed from pandas in a future version. Import numpy directly instead.
all_df.index = pd.np.arange(1, len(all_df) + 1)
/home/icyfeather/project/ianvs/core/storymanager/rank/rank.py:209: FutureWarning: The pandas.np module is deprecated and will be removed from pandas in a future version. Import numpy directly instead.
selected_df.index = pd.np.arange(1, len(selected_df) + 1)
Traceback (most recent call last):
File "/home/icyfeather/project/ianvs/core/cmd/benchmarking.py", line 37, in main
job.run()
File "/home/icyfeather/project/ianvs/core/cmd/obj/benchmarkingjob.py", line 96, in run
self.rank.save(succeed_testcases, test_results, output_dir=self.workspace)
File "/home/icyfeather/project/ianvs/core/storymanager/rank/rank.py", line 263, in save
self._draw_pictures(test_cases, test_results)
File "/home/icyfeather/project/ianvs/core/storymanager/rank/rank.py", line 219, in _draw_pictures
for key in matrix.keys():
AttributeError: 'NoneType' object has no attribute 'keys'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/icyfeather/miniconda3/envs/ianvs/bin/ianvs", line 33, in <module>
sys.exit(load_entry_point('ianvs==0.1.0', 'console_scripts', 'ianvs')())
File "/home/icyfeather/project/ianvs/core/cmd/benchmarking.py", line 41, in main
raise RuntimeError(f"benchmarkingjob runs failed, error: {err}.") from err
RuntimeError: benchmarkingjob runs failed, error: 'NoneType' object has no attribute 'keys'.
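As an alternative to pinning pandas, the two FutureWarnings above point at a code-level fix: import numpy directly instead of going through pd.np, and use pandas.concat instead of DataFrame.append. Below is a minimal sketch of that replacement. The function name merge_and_renumber and the toy frames are made up for illustration; this is not the actual rank.py code, and it does not address the separate 'NoneType' error in _draw_pictures.

```python
import numpy as np
import pandas as pd


def merge_and_renumber(new_df, old_df):
    """Combine two result frames and renumber the rows starting from 1.

    Hypothetical helper for illustration only; not taken from ianvs.
    """
    # pd.concat replaces the deprecated DataFrame.append.
    all_df = pd.concat([new_df, old_df])
    # np.arange replaces pd.np.arange, which no longer exists in pandas 2.x.
    all_df.index = np.arange(1, len(all_df) + 1)
    return all_df


# Toy data, invented for this example.
new_df = pd.DataFrame({"algorithm": ["a"], "accuracy": [0.3]})
old_df = pd.DataFrame({"algorithm": ["b", "c"], "accuracy": [0.2, 0.1]})
merged = merge_and_renumber(new_df, old_df)
print(merged)
```

Written this way, the code should behave the same on pandas 1.5.3 and on pandas 2.x, since neither call depends on the removed APIs.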
The visualization part may have some bugs. You could use the "selected_and_all" mode.
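For reference, the traceback fails on `for key in matrix.keys():` because matrix is None at that point. The thread does not say why it is None, so the following is only a sketch of a defensive guard, assuming that skipping the drawing step is acceptable when no matrix is available. The helper name metric_keys is hypothetical and not part of ianvs.

```python
def metric_keys(matrix):
    """Return the keys of a score matrix, or an empty list if it is missing.

    Hypothetical helper: lets a caller loop over the keys without
    crashing when no matrix was produced.
    """
    if matrix is None:
        return []
    return list(matrix.keys())


# With a missing matrix the loop body simply never runs.
for key in metric_keys(None):
    print("would draw picture for", key)

# With a dict the keys come back in insertion order.
print(metric_keys({"accuracy": [[0.1, 0.2]]}))
```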
Yes it works!
However, after printing the rank result, the program just hangs. I wonder why.
+------+--------------------+---------------------+--------------------+---------------------+---------------------+----------+-----------+-----------------+-----------------+-------------------------+------------------+-------------------------+-------------------------+------+-----+
| rank | algorithm | accuracy | task_avg_acc | BWT | FWT | paradigm | basemodel | task_definition | task_allocation | basemodel-learning_rate | basemodel-epochs | task_definition-origins | task_allocation-origins | time | url |
+------+--------------------+---------------------+--------------------+---------------------+---------------------+----------+-----------+-----------------+-----------------+-------------------------+------------------+-------------------------+-------------------------+------+-----+
| 1 | | 0.16341213386768544 | 0.1823223652551931 | 0.04785572989305128 | 0.09202275672186552 | | | | | | | | | | |
| 2 | 0.0 | 0.1630332304817737 | 0.0444330578799653 | 0.1057784033217158 | | | | | | | | | | | |
| 3 | 0.0436096102184875 | 0.1029848098592248 | 0.1632288012257678 | 0.0 | | | | | | | | | | | |
+------+--------------------+---------------------+--------------------+---------------------+---------------------+----------+-----------+-----------------+-----------------+-------------------------+------------------+-------------------------+-------------------------+------+-----+
[2024-06-04 18:13:50,011] benchmarking.py(39) [INFO] - benchmarkingjob runs successfully.
(stuck here)
This is normal; just press Ctrl+C to exit.
I have successfully gone through the semantic-segmentation lifelong learning example.
Sharing my environment here:
OS and CUDA:
CUDA 11.8
Ubuntu 20.04
pip list:
Package Version Editable project location
------------------------- ------------ -----------------------------------------
absl-py 2.1.0
addict 2.4.0
asgiref 3.8.1
asttokens 2.4.1
attrs 23.2.0
backcall 0.2.0
beautifulsoup4 4.12.3
bleach 6.1.0
certifi 2024.6.2
charset-normalizer 3.3.2
click 8.1.7
cmake 3.25.0
colorlog 4.7.2
contourpy 1.2.1
cycler 0.12.1
decorator 5.1.1
defusedxml 0.7.1
docopt 0.6.2
executing 2.0.1
fastapi 0.68.2
fastjsonschema 2.19.1
filelock 3.13.1
fonttools 4.53.0
fsspec 2024.5.0
grpcio 1.64.0
h11 0.14.0
huggingface-hub 0.23.2
ianvs 0.1.0
idna 3.7
importlib_metadata 7.1.0
importlib_resources 6.4.0
ipython 8.12.3
jedi 0.19.1
Jinja2 3.1.3
joblib 1.2.0
jsonschema 4.22.0
jsonschema-specifications 2023.12.1
jupyter_client 8.6.2
jupyter_core 5.7.2
jupyterlab_pygments 0.3.0
kiwisolver 1.4.5
lit 15.0.7
Markdown 3.6
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.9.0
matplotlib-inline 0.1.7
mdurl 0.1.2
minio 7.0.4
mistune 3.0.2
mmcv 2.0.1
mmdet 3.1.0 /home/icyfeather/project/mmdetection
mmengine 0.10.4
mpmath 1.3.0
nbclient 0.10.0
nbconvert 7.16.4
nbformat 5.10.4
networkx 3.2.1
numpy 1.26.4
opencv-python 4.9.0.80
packaging 24.0
pandas 1.5.3
pandocfilters 1.5.1
parso 0.8.4
pexpect 4.9.0
pickleshare 0.7.5
pillow 10.3.0
pip 24.0
pipreqs 0.5.0
platformdirs 4.2.2
prettytable 2.5.0
prompt_toolkit 3.0.45
protobuf 5.27.0
ptyprocess 0.7.0
pure-eval 0.2.2
pycocotools 2.0.7
pydantic 1.10.15
Pygments 2.18.0
pyparsing 3.1.2
python-dateutil 2.9.0.post0
pytz 2024.1
PyYAML 6.0.1
pyzmq 26.0.3
referencing 0.35.1
regex 2024.5.15
requests 2.32.3
rich 13.7.1
rpds-py 0.18.1
safetensors 0.4.3
scikit-learn 1.5.0
scipy 1.13.1
sedna 0.4.1
segment-anything 1.0 /home/icyfeather/project/segment-anything
setuptools 54.2.0
shapely 2.0.4
six 1.15.0
soupsieve 2.5
stack-data 0.6.3
starlette 0.14.2
sympy 1.12
tenacity 8.0.1
tensorboard 2.16.2
tensorboard-data-server 0.7.2
termcolor 2.4.0
terminaltables 3.1.10
threadpoolctl 3.5.0
tinycss2 1.3.0
tokenizers 0.19.1
tomli 2.0.1
torch 2.0.1+cu118
torchaudio 2.0.2+cu118
torchvision 0.15.2+cu118
tornado 6.4
tqdm 4.66.4
traitlets 5.14.3
transformers 4.41.2
triton 2.0.0
typing_extensions 4.12.1
tzdata 2024.1
urllib3 2.2.1
uvicorn 0.14.0
wcwidth 0.2.13
webencodings 0.5.1
websockets 9.1
Werkzeug 3.0.3
wheel 0.43.0
yapf 0.40.2
yarg 0.1.9
zipp 3.19.1
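To compare another machine against the list above, the versions that mattered in this thread (torch, mmcv, mmdet, pandas, sedna) can be printed with the standard library alone. This is a small sketch; the function name installed_versions and the choice of packages are my own, and it only reports what is installed without checking compatibility.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            versions[name] = None
    return versions


# Packages whose versions came up in this thread.
report = installed_versions(["torch", "mmcv", "mmdet", "pandas", "sedna"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```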
I am following the semantic-segmentation README. When I run
ianvs -f examples/robot/lifelong_learning_bench/semantic-segmentation/benchmarkingjob-simple.yaml
it shows:
After searching on the Internet, I know it's probably about version conflicts. But I hope there can be detailed version requirements (such as the CUDA version, torch version, etc.) to help me solve this.
And here is my env info:
Should I downgrade my CUDA version?