MiniBullLab / easy_ai


Full test of the test branch code #115

Closed foww-0001 closed 3 years ago

foww-0001 commented 3 years ago

ModuleNotFoundError: No module named 'skimage'

foww-0001 commented 3 years ago

The current requirement is scikit_image==0.17.2
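The package is published on PyPI as scikit-image but imported as skimage, so it helps to confirm that the interpreter running the training script can actually see the module. A minimal sketch; the helper name `module_available` is hypothetical and not part of easy_ai:

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the top-level module `name` can be imported
    by the interpreter that runs this script."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # scikit-image is installed as "scikit-image" but imported as "skimage".
    if not module_available("skimage"):
        print("skimage not found; try: pip install scikit-image==0.17.2")
```

Running this inside the same virtual environment as the training script shows whether the package is missing there or was installed into a different environment.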

foww-0001 commented 3 years ago

Testing ClassNet fails with an error. The test command is:

./easy_tools/train_scripts/ClassNet.sh /home/wfw/data/VOCdevkit/flower_classify/ImageSets/train.txt /home/wfw/data/VOCdevkit/flower_classify/ImageSets/val.txt

The error output is as follows:

Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/wfw/workspace/Test/easy_ai/easy_tools/easy_ai.py", line 12, in <module>
    from easy_tools.model_train.ai_train import EasyAiModelTrain
  File "/home/wfw/workspace/Test/easy_ai/easy_tools/model_train/ai_train.py", line 9, in <module>
    from easyai.train_task import TrainTask
  File "/home/wfw/workspace/Test/easy_ai/easyai/train_task.py", line 11, in <module>
    from easyai.tasks.utility.task_registry import REGISTERED_TRAIN_TASK
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/__init__.py", line 12, in <module>
    from . import rec_text
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/rec_text/__init__.py", line 1, in <module>
    from . import post_process
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/rec_text/post_process/__init__.py", line 2, in <module>
    from . import transformer_post_process
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/rec_text/post_process/transformer_post_process.py", line 13, in <module>
    class TransformerPostProcess(BasePostProcess):
  File "/home/wfw/workspace/Test/easy_ai/easyai/utility/registry.py", line 75, in deco
    self._register_module(cls, cls_name)
  File "/home/wfw/workspace/Test/easy_ai/easyai/utility/registry.py", line 68, in _register_module
    "{} is already registered in {}".format(cls_name, self.name)
KeyError: 'CTCPostProcess is already registered in post_process'
MiniBullLab commented 3 years ago

Merge the develop branch into test to fix the bug.

Should I do the merge?

The problem was resolved after merging the branches.
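The KeyError in the ClassNet traceback is the kind of error a name-keyed class registry raises when two classes are registered under the same name. The traceback shows it firing while `TransformerPostProcess` was being registered, which suggests that class reused the name `CTCPostProcess`. The following is a minimal sketch of that mechanism, not easyai's actual `registry.py`; the `Registry` class and its `register` method are hypothetical stand-ins.

```python
class Registry:
    """Maps names to classes and rejects duplicate names (illustrative only)."""

    def __init__(self, name: str):
        self.name = name
        self._modules = {}

    def register(self, cls_name: str):
        """Return a decorator that stores the decorated class under `cls_name`."""
        def deco(cls):
            if cls_name in self._modules:
                raise KeyError(
                    "{} is already registered in {}".format(cls_name, self.name))
            self._modules[cls_name] = cls
            return cls
        return deco


POST_PROCESS = Registry("post_process")


@POST_PROCESS.register("CTCPostProcess")
class CTCPostProcess:
    pass


try:
    # Registering a second class under the same name triggers the error.
    @POST_PROCESS.register("CTCPostProcess")
    class TransformerPostProcess:
        pass
except KeyError as err:
    print(err)
```

Under this assumption, giving each class its own registration name is enough to make the import succeed.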

foww-0001 commented 3 years ago

Testing ClassNet fails with an error. The test command is:

./easy_tools/train_scripts/ClassNet.sh /home/wfw/data/VOCdevkit/flower_classify/ImageSets/train.txt /home/wfw/data/VOCdevkit/flower_classify/ImageSets/val.txt

The error is as follows:

2021-07-26 17:18:42,282 ERROR   [train_task.py, 42] Traceback (most recent call last):
  File "easyai/train_task.py", line 39, in train
    task.train(self.train_path, self.val_path)
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/cls/classify_train.py", line 25, in train
    self.build_lr_scheduler()
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/utility/common_train.py", line 73, in build_lr_scheduler
    self.lr_factory.set_epoch_iteration(self.total_batch_image)
AttributeError: 'ClassifyTrain' object has no attribute 'total_batch_image'

2021-07-26 17:18:42,282 ERROR   [train_task.py, 43] 'ClassifyTrain' object has no attribute 'total_batch_image'
lpj0822 commented 3 years ago

Merge the develop branch into test to fix the bug.

foww-0001 commented 3 years ago
2021-07-27 00:24:55,278 ERROR   [train_task.py, 42] Traceback (most recent call last):
  File "easyai/train_task.py", line 39, in train
    task.train(self.train_path, self.val_path)
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/cls/classify_train.py", line 27, in train
    self.start_train()
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/utility/common_train.py", line 126, in start_train
    EasyLogger.info("Train image count is : %d" % self.total_batch_image)
AttributeError: 'ClassifyTrain' object has no attribute 'total_batch_image'

2021-07-27 00:24:55,278 ERROR   [train_task.py, 43] 'ClassifyTrain' object has no attribute 'total_batch_image'
lpj0822 commented 3 years ago

Merge the develop branch into test to fix the bug.

foww-0001 commented 3 years ago
2021-07-27 23:31:07,030 ERROR   [train_task.py, 42] Traceback (most recent call last):
  File "easyai/train_task.py", line 39, in train
    task.train(self.train_path, self.val_path)
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/cls/classify_train.py", line 31, in train
    self.train_epoch(epoch, self.lr_scheduler, self.dataloader)
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/cls/classify_train.py", line 41, in train_epoch
    for index, batch_data in enumerate(dataloader):
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/wfw/workspace/Test/easy_ai/easyai/data_loader/cls/classify_dataset_collate.py", line 23, in __call__
    labels = torch.stack(labels)
TypeError: expected Tensor as element 0 in argument 0, but got int

2021-07-27 23:31:07,031 ERROR   [train_task.py, 43] Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
    return self.collate_fn(data)
  File "/home/wfw/workspace/Test/easy_ai/easyai/data_loader/cls/classify_dataset_collate.py", line 23, in __call__
    labels = torch.stack(labels)
TypeError: expected Tensor as element 0 in argument 0, but got int
foww-0001 commented 3 years ago

Training the DeNet network fails with an error:

2021-07-28 20:31:12,913 ERROR   [train_task.py, 42] Traceback (most recent call last):
  File "/home/wfw/workspace/Test/easy_ai/easyai/train_task.py", line 39, in train
    task.train(self.train_path, self.val_path)
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/det2d/detect2d_train.py", line 31, in train
    self.train_epoch(epoch, self.lr_scheduler, self.dataloader)
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/det2d/detect2d_train.py", line 43, in train_epoch
    loss_info = self.compute_backward(batch_data, i)
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/det2d/detect2d_train.py", line 50, in compute_backward
    loss, loss_info = self.compute_loss(output_list, batch_data)
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/det2d/detect2d_train.py", line 75, in compute_loss
    loss = self.model.lossList[0](output_list[0], batch_data)
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/wfw/workspace/Test/easy_ai/easyai/loss/det2d/yolov3_loss.py", line 158, in forward
    targets = batch_data['label'].to(device)
AttributeError: 'list' object has no attribute 'to'

2021-07-28 20:31:12,913 ERROR   [train_task.py, 43] 'list' object has no attribute 'to'
2021-07-28 20:31:12,949 INFO    [easy_ai.py, 69] easyai process end!
lpj0822 commented 3 years ago

Merge the develop branch into test to fix the bug.
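The AttributeError in the DeNet log indicates that `batch_data['label']` arrived as a Python list rather than as a single tensor, so calling `.to(device)` on it fails. One possible workaround, sketched below, is to move each element individually. The helper `move_to_device` is hypothetical and not part of easyai; it relies only on duck typing, so it applies to any object with a `.to` method, such as `torch.Tensor`. Whether the loss function can then consume a list of tensors is a separate question this sketch does not address.

```python
def move_to_device(data, device):
    """Recursively call .to(device) on every element that supports it."""
    if hasattr(data, "to"):
        # Tensor-like object: delegate to its own .to() method.
        return data.to(device)
    if isinstance(data, (list, tuple)):
        # Rebuild the same container type with each item moved.
        return type(data)(move_to_device(item, device) for item in data)
    if isinstance(data, dict):
        return {key: move_to_device(value, device)
                for key, value in data.items()}
    # Plain Python values (int, str, None) are returned unchanged.
    return data
```

With such a helper, the failing line could be written as `targets = move_to_device(batch_data['label'], device)`.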

foww-0001 commented 3 years ago
2021-07-29 20:01:25,174 ERROR   [train_task.py, 42] Traceback (most recent call last):
  File "/home/wfw/workspace/Test/easy_ai/easyai/train_task.py", line 39, in train
    task.train(self.train_path, self.val_path)
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/det2d/detect2d_train.py", line 34, in train
    self.test(val_path, epoch, save_model_path)
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/det2d/detect2d_train.py", line 95, in test
    mAP = self.detect_test.test(epoch)
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/det2d/detect2d_test.py", line 38, in test
    for i, batch_data in enumerate(self.dataloader):
  File "/home/wfw/workspace/EDGE/edgeENV/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 637, in __next__
    return self._process_next_batch(batch)
  File "/home/wfw/workspace/EDGE/edgeENV/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
AttributeError: Traceback (most recent call last):
  File "/home/wfw/workspace/EDGE/edgeENV/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/wfw/workspace/EDGE/edgeENV/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/wfw/workspace/Test/easy_ai/easyai/data_loader/det2d/det2d_dataset.py", line 35, in __getitem__
    self.detect2d_class)
AttributeError: 'Det2dDataset' object has no attribute 'detect2d_class'

2021-07-29 20:01:25,174 ERROR   [train_task.py, 43] Traceback (most recent call last):
  File "/home/wfw/workspace/EDGE/edgeENV/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/wfw/workspace/EDGE/edgeENV/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/wfw/workspace/Test/easy_ai/easyai/data_loader/det2d/det2d_dataset.py", line 35, in __getitem__
    self.detect2d_class)
AttributeError: 'Det2dDataset' object has no attribute 'detect2d_class'
foww-0001 commented 3 years ago

Training SegNet fails with an error:

  File "/home/wfw/workspace/Test/easy_ai/easyai/train_task.py", line 37, in train
    task = build_from_cfg(task_args, REGISTERED_TRAIN_TASK)
  File "/home/wfw/workspace/Test/easy_ai/easyai/utility/registry.py", line 109, in build_from_cfg
    return obj_cls(**args)
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/seg/segment_train.py", line 26, in __init__
    self.segment_test = SegmentionTest(model_name, gpu_id, self.train_task_config)
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/seg/segment_test.py", line 27, in __init__
    'num_class': len(self.test_task_config.segment_name)}
AttributeError: 'SegmentionConfig' object has no attribute 'segment_name'

2021-07-29 22:27:48,659 ERROR   [train_task.py, 43] 'SegmentionConfig' object has no attribute 'segment_name'
2021-07-29 22:27:48,668 INFO    [easy_ai.py, 69] easyai process end!
lpj0822 commented 3 years ago

Merge the develop branch into test to fix the bug.

kingwangxiang commented 3 years ago

The issue has been fixed, but the fix has not been tested yet.

kingwangxiang commented 3 years ago

inference and test also need to be tested.

foww-0001 commented 3 years ago

Running SegNet fails with an error:

  File "/home/wfw/workspace/Test/easy_ai/easyai/train_task.py", line 39, in train
    task.train(self.train_path, self.val_path)
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/seg/segment_train.py", line 35, in train
    self.train_epoch(epoch, self.lr_scheduler, self.dataloader)
  File "/home/wfw/workspace/Test/easy_ai/easyai/tasks/seg/segment_train.py", line 41, in train_epoch
    for temp_index, batch_data in enumerate(dataloader):
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 856, in _next_data
    return self._process_data(data)
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 881, in _process_data
    data.reraise()
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/_utils.py", line 395, in reraise
    raise self.exc_type(msg)
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/wfw/workspace/Test/easy_ai/easyai/data_loader/seg/segment_dataset.py", line 43, in __getitem__
    image, target = self.data_augment.augment(image, target)
  File "/home/wfw/workspace/Test/easy_ai/easyai/data_loader/seg/segment_data_augment.py", line 22, in augment
    target = label[:]
TypeError: 'NoneType' object is not subscriptable

2021-08-01 21:05:24,291 ERROR   [train_task.py, 43] Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/wfw/workspace/Test/TestENV/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/wfw/workspace/Test/easy_ai/easyai/data_loader/seg/segment_dataset.py", line 43, in __getitem__
    image, target = self.data_augment.augment(image, target)
  File "/home/wfw/workspace/Test/easy_ai/easyai/data_loader/seg/segment_data_augment.py", line 22, in augment
    target = label[:]
TypeError: 'NoneType' object is not subscriptable
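The SegNet traceback shows `label[:]` failing because `label` is `None` when it reaches `augment`. One common cause, which is an assumption here and not confirmed by the log, is a label image path that does not exist: loaders such as `cv2.imread` return `None` for an unreadable file instead of raising. A minimal sketch for checking the dataset list before training; the helper `find_missing_labels` and the pair layout are hypothetical:

```python
import os


def find_missing_labels(pairs):
    """Return the (image_path, label_path) pairs whose label file is absent.

    `pairs` is an iterable of (image_path, label_path) tuples; how these
    are derived from train.txt is project specific and not shown here.
    """
    return [(image, label) for image, label in pairs
            if not os.path.isfile(label)]
```

An empty result rules out missing files, which narrows the search to the label-reading code itself.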
foww-0001 commented 3 years ago

When training DeNet, I found that the most recently merged branch no longer prints the train information.

foww-0001 commented 3 years ago

So far I have compared the training network and the training code and found nothing abnormal.

foww-0001 commented 3 years ago

Everything now runs normally; the remaining task is to track down the accuracy problem.

foww-0001 commented 3 years ago

In the classification config parameters, change:

"drop_last": true, -> "drop_last": false,
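The effect of this setting can be seen by counting batches: with `drop_last` enabled, a final batch smaller than the batch size is discarded, so a dataset smaller than one batch yields no batches at all. A minimal sketch of that arithmetic, which mirrors how batch loaders such as PyTorch's `DataLoader` report their length; the function name `batches_per_epoch` is hypothetical:

```python
import math


def batches_per_epoch(num_samples: int, batch_size: int, drop_last: bool) -> int:
    """Number of batches produced in one pass over the dataset."""
    if drop_last:
        # The incomplete final batch is discarded.
        return num_samples // batch_size
    # The incomplete final batch is kept.
    return math.ceil(num_samples / batch_size)


print(batches_per_epoch(10, 4, drop_last=True))   # 2
print(batches_per_epoch(10, 4, drop_last=False))  # 3
print(batches_per_epoch(3, 4, drop_last=True))    # 0
```

The last case is the one to watch for with small datasets: no batch is produced, so nothing is trained or evaluated.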

foww-0001 commented 3 years ago

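The classification task now runs and tests without errors. It was tested on both the 2-class flower dataset and the 17-class flower dataset.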

kingwangxiang commented 3 years ago

Test detection and segmentation.