WindVChen / DRENet

The official implementation of DRENet (Degraded Reconstruction Enhancement Network) for tiny ship detection in remote sensing images
GNU General Public License v3.0

Train using a different image size #4

Open ramdhan1989 opened 2 years ago

ramdhan1989 commented 2 years ago

Hi, I tried to use 417*417 images, but it returns the following error message:

File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\tqdm\std.py", line 1195, in __iter__
    for obj in iterable:
  File "C:\Users\Owner\slick\DRENet-main\utils\datasets.py", line 106, in __iter__
    yield next(self.iterator)
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\dataloader.py", line 521, in __next__
    data = self._next_data()
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\dataloader.py", line 1229, in _process_data
    data.reraise()
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\_utils.py", line 434, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\_utils\worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Owner\slick\DRENet-main\utils\datasets.py", line 513, in __getitem__
    img, dgimg, labels = load_mosaic(self, index)
  File "C:\Users\Owner\slick\DRENet-main\utils\datasets.py", line 709, in load_mosaic
    dgimg4[y1a:y2a, x1a:x2a] = dgimg[y1b:y2b, x1b:x2b]
ValueError: could not broadcast input array from shape (174,417,3) into shape (268,511,3)

wandb: Waiting for W&B process to finish... (failed 1). Press Ctrl-C to abort syncing.
wandb:
wandb: Synced exp4: https://wandb.ai/rariwa/slick-project/runs/21eo5mg1
wandb: Synced 5 W&B file(s), 2 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: .\wandb\run-20220906_160742-21eo5mg1\logs

Are there any lines I need to modify or change to train with images other than 512*512?

thanks

WindVChen commented 2 years ago

Hi, have you modified the "img-size" argument in train.py? It should be set to your input image size.

In addition, if you use a size other than 512, you should also modify the configuration settings (shown below) of the C3ResAtnMHSA modules in the .yaml file, if present. That parameter corresponds to the size of the relative positional encoding in the Multi-Head Self-Attention module. You can try adjusting it according to its ratio with 512: e.g., 16 is 512 divided by 32, so you might try setting it to 417 // 32 = 13.

However, since 417 is not evenly divisible by 32, the C3ResAtnMHSA setting above may still fail. In that case you may need to set breakpoints in the code to check the actual size of the output features. You can also try replacing the parameters in the figure below (16, 32, 64) with (14, 28, 56), which is my rough deduction and not guaranteed to be accurate.

[image: C3ResAtnMHSA settings (16, 32, 64) in the model .yaml]
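To illustrate what that parameter controls, here is a minimal, simplified sketch (not the repository's exact C3ResAtnMHSA code) of a Multi-Head Self-Attention block with learned relative positional encodings. The encodings are built for a fixed feature-map size, so feeding a feature map of a different size makes the addition of the content and position terms fail:

import torch
import torch.nn as nn

class MHSAWithRelPos(nn.Module):
    def __init__(self, dim, fmap_size, heads=4):
        super().__init__()
        self.heads = heads
        self.qkv = nn.Conv2d(dim, dim * 3, 1, bias=False)
        # Relative positional encodings sized to the expected feature map (fmap_size x fmap_size).
        self.rel_h = nn.Parameter(torch.randn(1, heads, dim // heads, 1, fmap_size))
        self.rel_w = nn.Parameter(torch.randn(1, heads, dim // heads, fmap_size, 1))

    def forward(self, x):
        b, c, h, w = x.shape
        q, k, _ = self.qkv(x).view(b, self.heads, -1, h * w).chunk(3, dim=2)
        content_content = q.transpose(-2, -1) @ k        # (b, heads, h*w, h*w)
        pos = (self.rel_h + self.rel_w).flatten(3)       # (1, heads, d, fmap_size**2)
        content_position = pos.transpose(-2, -1) @ q     # (b, heads, fmap_size**2, h*w)
        return content_content + content_position        # fails if h*w != fmap_size**2

m = MHSAWithRelPos(dim=64, fmap_size=16)     # built for 16x16 feature maps (512 / 32)
m(torch.zeros(1, 64, 16, 16))                # OK
# m(torch.zeros(1, 64, 13, 13))              # RuntimeError: 169 vs 256 size mismatch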

ramdhan1989 commented 2 years ago

Hi, now I get an error like this:

Traceback (most recent call last):
  File "train.py", line 515, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 84, in train
    model = Model(opt.cfg, ch=3, nc=nc, anchors=hyp.get('anchors')).to(device)  # create
  File "C:\Users\Owner\slick\DRENet-main\models\yolo.py", line 99, in __init__
    m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))[0]])  # forward
  File "C:\Users\Owner\slick\DRENet-main\models\yolo.py", line 131, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "C:\Users\Owner\slick\DRENet-main\models\yolo.py", line 148, in forward_once
    x = m(x)  # run
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Owner\slick\DRENet-main\models\common.py", line 192, in forward
    return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\nn\modules\container.py", line 141, in forward
    input = module(input)
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\nn\modules\module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Owner\slick\DRENet-main\models\common.py", line 137, in forward
    energy = content_content + content_position
RuntimeError: The size of tensor a (256) must match the size of tensor b (196) at non-singleton dimension 1

please advise

thank you

WindVChen commented 2 years ago

I have revisited the code and found that the input image size must be strictly even (otherwise the concatenation dimensions in YOLOv5's Focus module will be inconsistent). So you should set "img-size" to 416 or another even number; after that, do a global search of the .py files for the number 512 and change every occurrence to 416. Then, if you use 416, you should also change the parameters in the .yaml from (16, 32, 64) to (13, 26, 52). It should work!
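For reference, the Focus module performs the slicing below (the same line from models/common.py is quoted later in this thread); a rough sketch shows why an odd input size breaks it, since the four strided slices end up with different shapes and torch.cat fails:

import torch
import torch.nn as nn

class FocusSketch(nn.Module):
    def __init__(self, c_in, c_out, k=3):
        super().__init__()
        self.conv = nn.Conv2d(c_in * 4, c_out, k, padding=k // 2)

    def forward(self, x):
        # Space-to-depth: each slice halves H and W, then the four slices are concatenated on channels.
        return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
                                    x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1))

FocusSketch(3, 32)(torch.zeros(1, 3, 416, 416))    # OK: four 208x208 slices
# FocusSketch(3, 32)(torch.zeros(1, 3, 417, 417))  # fails: 209x209 vs 208x209 slices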

ramdhan1989 commented 2 years ago

Hi, in the end I tried a 512 image size and got this error:

github: skipping check (not a git repository)
YOLOv5  torch 1.10.0+cu102 CUDA:0 (Quadro RTX 5000, 16384.0MB)

Namespace(adam=False, batch_size=16, bucket='', cache_images=False, cfg='./models/DRENet-custom-512.yaml', data='./data/custom.yaml', device='0', epochs=1000, evolve=False, exist_ok=False, global_rank=-1, hyp='data/hyp.scratch.yaml', image_weights=False, img_size=[512, 512], local_rank=-1, log_artifacts=False, log_imgs=16, multi_scale=False, name='exp', noautoanchor=False, nosave=False, notest=False, project='./custom-project', quad=False, rect=False, resume=False, save_dir='custom-project\\exp17', single_cls=False, sync_bn=False, total_batch_size=16, weights='', workers=8, world_size=1)
Start Tensorboard with "tensorboard --logdir ./custom-project", view at http://localhost:6006/
hyperparameters: lr0=0.01, lrf=0.2, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=0.05, cls=0.5, cls_pw=1.0, obj=1.0, obj_pw=1.0, iou_t=0.2, anchor_t=4.0, fl_gamma=0.0, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, mosaic=1.0, mixup=0.0
Overriding model.yaml nc=80 with nc=1

                 from  n    params  module                                  arguments
  0                -1  1      3520  models.common.Focus                     [3, 32, 3]
  1                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]
  2                -1  1     18816  models.common.C3                        [64, 64, 1]
  3                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]
  4                -1  1    156928  models.common.C3                        [128, 128, 3]
  5                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]
  6                -1  1    625152  models.common.C3                        [256, 256, 3]
  7                -1  1   1180672  models.common.Conv                      [256, 512, 3, 2]
  8                -1  1    656896  models.common.SPP                       [512, 512, [5, 9, 13]]
  9                -1  1    646272  models.common.C3ResAtnMHSA              [512, 512, 1, 16, False]
 10                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 12           [-1, 6]  1         1  models.common.ConcatFusionFactor        [1]
 13                -1  1    230976  models.common.C3ResAtnMHSA              [512, 256, 1, 32, False]
 14                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 16           [-1, 4]  1         1  models.common.ConcatFusionFactor        [1]
 17                -1  1     61216  models.common.C3ResAtnMHSA              [256, 128, 1, 64, False]
 18                14  1    132672  models.common.C3ResAtnMHSA              [128, 256, 1, 32, False]
 19                10  1    515200  models.common.C3ResAtnMHSA              [256, 512, 1, 16, False]
 20                 4  1   1187342  models.common.RCAN                      [128]
 21      [17, 18, 19]  1     16182  models.yolo.Detect                      [1, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  ..\aten\src\ATen\native\TensorShape.cpp:2157.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Model Summary: 334 layers, 5984422 parameters, 5984422 gradients, 18.3 GFLOPS

Scaled weight_decay = 0.0005
Optimizer groups: 83 .bias, 83 conv.weight, 63 other
wandb: Currently logged in as: rariwa. Use `wandb login --relogin` to force relogin
wandb: wandb version 0.13.3 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.12.16
wandb: Run data is saved locally in C:\Users\Owner\custom\DRENet-main\wandb\run-20220910_190920-3cw5fhez
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run exp17
wandb:  View project at https://wandb.ai/rariwa/custom-project
wandb:  View run at https://wandb.ai/rariwa/custom-project/runs/3cw5fhez
train: Scanning 'F:\custom\for_DRENet\train\labels.cache' for images and labels... 27129 found, 0 missing, 12681 empty,
val: Scanning 'F:\custom\for_DRENet\val\labels.cache' for images and labels... 6859 found, 0 missing, 3248 empty, 5 corr
Plotting labels...
Images sizes do not match. This will causes images to be display incorrectly in the UI.

autoanchor: Analyzing anchors... anchors/target = 4.03, Best Possible Recall (BPR) = 0.9636. Attempting to improve anchors, please wait...
autoanchor: WARNING: Extremely small objects found. 651 of 23624 labels are < 3 pixels in size.
autoanchor: Running kmeans for 9 anchors on 23624 points...
autoanchor: thr=0.25: 0.9626 best possible recall, 3.25 anchors past thr
autoanchor: n=9, img_size=512, metric_all=0.232/0.619-mean/best, past_thr=0.458-mean: 14,14,  21,38,  60,30,  40,85,  142,65,  82,170,  327,125,  128,345,  431,392
autoanchor: Evolving anchors with Genetic Algorithm: fitness = 0.6587: 100%|██████| 1000/1000 [00:06<00:00, 146.71it/s]
autoanchor: thr=0.25: 0.9707 best possible recall, 4.12 anchors past thr
autoanchor: n=9, img_size=512, metric_all=0.276/0.662-mean/best, past_thr=0.465-mean: 9,9,  11,20,  24,14,  20,36,  52,28,  39,76,  113,84,  127,213,  375,314
autoanchor: New anchors saved to model. Update model *.yaml to use these anchors in the future.

C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\nn\_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='mean' instead.
  warnings.warn(warning.format(ret))
Image sizes 512 train, 512 test
Using 8 dataloader workers
Logging results to custom-project\exp17
Starting training for 1000 epochs...

     Epoch   gpu_mem       box       obj       cls       dgi     total   targets  img_size
  0%|                                                                                         | 0/1694 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "train.py", line 515, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 268, in train
    for i, (imgs, dgimgs, targets, paths, _) in pbar:  # batch -------------------------------------------------------------
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\tqdm\std.py", line 1195, in __iter__
    for obj in iterable:
  File "C:\Users\Owner\custom\DRENet-main\utils\datasets.py", line 106, in __iter__
    yield next(self.iterator)
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\dataloader.py", line 521, in __next__
    data = self._next_data()
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\dataloader.py", line 1229, in _process_data
    data.reraise()
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\_utils.py", line 434, in reraise
    raise exception
AttributeError: Caught AttributeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\_utils\worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Owner\custom\DRENet-main\utils\datasets.py", line 513, in __getitem__
    img, dgimg, labels = load_mosaic(self, index)
  File "C:\Users\Owner\custom\DRENet-main\utils\datasets.py", line 695, in load_mosaic
    dgimg4 = np.full((s * 2, s * 2, dgimg.shape[2]), 114, dtype=np.uint8)
AttributeError: 'NoneType' object has no attribute 'shape'

wandb: Waiting for W&B process to finish... (failed 1). Press Ctrl-C to abort syncing.
wandb:
wandb: Synced exp17: https://wandb.ai/rariwa/custom-project/runs/3cw5fhez
wandb: Synced 5 W&B file(s), 2 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: .\wandb\run-20220910_190920-3cw5fhez\logs

This is the command that I used: python train.py --cfg "./models/DRENet-custom-512.yaml" --epochs 1000 --workers 8 --batch-size 16 --device 0 --project "./custom-project" --data "./data/custom_data.yaml"

Please advise

thank you

WindVChen commented 2 years ago

It seems to be caused by a missing matching degraded image (dgimg). You can add the assertion assert dgimg is not None, 'DGImage Not Found ' + dgpath at line 642 of datasets.py to see which one is missing:

That is, from:

def load_image(self, index):
    # loads 1 image from dataset, returns img, original hw, resized hw
    img = self.imgs[index]
    if img is None:  # not cached
        path = self.img_files[index]
        dgpath = self.degrade_files[index]
        if path[-3:]=="npy":
            img = np.load(path)
        else:
            img = cv2.imread(path)  # BGR
            dgimg = cv2.imread(dgpath)
        assert img is not None, 'Image Not Found ' + path

to:

def load_image(self, index):
    # loads 1 image from dataset, returns img, original hw, resized hw
    img = self.imgs[index]
    if img is None:  # not cached
        path = self.img_files[index]
        dgpath = self.degrade_files[index]
        if path[-3:]=="npy":
            img = np.load(path)
        else:
            img = cv2.imread(path)  # BGR
            dgimg = cv2.imread(dgpath)
            assert dgimg is not None, 'DGImage Not Found ' + dgpath  #  ------------------Added codeline

        assert img is not None, 'Image Not Found ' + path

ramdhan1989 commented 2 years ago

OK, thank you so much for your answers. However, I am now facing a new issue:

Traceback (most recent call last):
  File "train.py", line 515, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 268, in train
    for i, (imgs, dgimgs, targets, paths, _) in pbar:  # batch -------------------------------------------------------------
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\tqdm\std.py", line 1195, in __iter__
    for obj in iterable:
  File "C:\Users\Owner\slick\DRENet-main\utils\datasets.py", line 106, in __iter__
    yield next(self.iterator)
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\dataloader.py", line 521, in __next__
    data = self._next_data()
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\dataloader.py", line 1229, in _process_data
    data.reraise()
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\_utils.py", line 434, in reraise
    raise exception
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\_utils\worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Owner\slick\DRENet-main\utils\datasets.py", line 513, in __getitem__
    img, dgimg, labels = load_mosaic(self, index)
  File "C:\Users\Owner\slick\DRENet-main\utils\datasets.py", line 706, in load_mosaic
    img, dgimg, _, (h, w) = load_image(self, index)
TypeError: cannot unpack non-iterable NoneType object

wandb: Waiting for W&B process to finish... (failed 1). Press Ctrl-C to abort syncing.
wandb:
wandb: Synced exp20: https://wandb.ai/rariwa/slick-project/runs/171zzfjg
wandb: Synced 5 W&B file(s), 2 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: .\wandb\run-20220911_234922-171zzfjg\logs

please advise,

thank you

WindVChen commented 2 years ago

I'm not sure what caused the problem. Did you modify the values returned by the function load_image(self, index)? Also, try deleting the ".cache" files under the train/val/test directories to refresh YOLOv5's caching mechanism.
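For example, a small sketch like this (the dataset root is an assumption; adjust it to your own paths) clears the cached label files:

from pathlib import Path

dataset_root = Path(r"F:\custom\for_DRENet")   # hypothetical root containing train/val/test
for cache in dataset_root.rglob("*.cache"):
    cache.unlink()
    print("removed", cache)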

ramdhan1989 commented 2 years ago

I made sure load_image is unchanged and I deleted the cache. Here is the new error:

    Epoch   gpu_mem       box       obj       cls       dgi     total   targets  img_size
      0/99     9.37G    0.1157   0.01454         0  0.008762     2.842        11       512: 100%|█| 894/894 [09:44<00:0
               Class      Images     Targets           P           R      mAP@.5  mAP@.5:.95:   0%| | 0/100 [00:00<?, ?
Traceback (most recent call last):
  File "train.py", line 515, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 346, in train
    results, maps, times = test.test(opt.data,
  File "C:\Users\Owner\slick\DRENet-main2\test.py", line 103, in test
    for batch_i, (img, dgimgs, targets, paths, shapes) in enumerate(tqdm(dataloader, desc=s)):
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\tqdm\std.py", line 1195, in __iter__
    for obj in iterable:
  File "C:\Users\Owner\slick\DRENet-main2\utils\datasets.py", line 106, in __iter__
    yield next(self.iterator)
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\dataloader.py", line 521, in __next__
    data = self._next_data()
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\dataloader.py", line 1203, in _next_data
    return self._process_data(data)
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\dataloader.py", line 1229, in _process_data
    data.reraise()
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\_utils.py", line 434, in reraise
    raise exception
TypeError: Caught TypeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\_utils\worker.py", line 287, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Owner\Anaconda3\envs\sit\lib\site-packages\torch\utils\data\_utils\fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "C:\Users\Owner\slick\DRENet-main2\utils\datasets.py", line 582, in __getitem__
    dgimg = dgimg[:, :, ::-1].transpose(2, 0, 1)
TypeError: 'NoneType' object is not subscriptable

wandb: Waiting for W&B process to finish... (failed 1). Press Ctrl-C to abort syncing.
wandb:
wandb: Synced exp2: https://wandb.ai/rariwa/slick-project/runs/upi7w4gl
wandb: Synced 5 W&B file(s), 5 media file(s), 0 artifact file(s) and 0 other file(s)
wandb: Find logs at: .\wandb\run-20220912_222147-upi7w4gl\logs

Is there any way to find the source of the error? Please advise.

thank you

WindVChen commented 2 years ago

It seems this is still caused by missing degraded images. I suggest checking again that every image has its corresponding degraded image in the train/val/test datasets.
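A quick way to verify this is a small script along these lines (the folder names and extensions are assumptions; adjust them to your own layout, see also #2):

from pathlib import Path

root = Path(r"F:\custom\for_DRENet")                 # hypothetical dataset root
for split in ("train", "val", "test"):
    img_dir = root / split / "images"
    dg_dir = root / split / "degrade"
    missing = [p.name for p in img_dir.glob("*")
               if p.suffix.lower() in {".jpg", ".png", ".npy"}
               and not (dg_dir / p.name).exists()]
    print(f"{split}: {len(missing)} images without a degraded counterpart", missing[:10])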

ramdhan1989 commented 2 years ago

I would like to clarify: must a degrade folder exist in each of train/val/test? I only created a degrade folder inside train.

WindVChen commented 2 years ago

Yes, each split should contain a degrade folder. For the whole structure you can also refer to #2. The degraded images in val or test are there for the possible use of loss calculation.

In practical use we cannot refer to the target ground truth, so the degraded images cannot be properly generated, which leads to the missing-degraded-image error above. To handle practical use, since we didn't write dedicated inference code, you may need to put in some effort to modify the code, e.g., comment out some lines. A simpler way is to just generate a set of fake degraded images, such as constant-value or random-value images.
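For example, a placeholder generator along these lines (paths and extensions are assumptions; adjust them to your dataset) would be enough for val/test:

from pathlib import Path

import cv2
import numpy as np

img_dir = Path(r"F:\custom\for_DRENet\test\images")   # hypothetical path
dg_dir = Path(r"F:\custom\for_DRENet\test\degrade")   # hypothetical path
dg_dir.mkdir(parents=True, exist_ok=True)

for p in img_dir.glob("*"):
    if p.suffix.lower() not in {".jpg", ".png"}:
        continue
    h, w = cv2.imread(str(p)).shape[:2]
    fake = np.full((h, w, 3), 114, dtype=np.uint8)     # constant-value placeholder
    cv2.imwrite(str(dg_dir / p.name), fake)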

ramdhan1989 commented 2 years ago

Hi, I just finished the training process. Now, how do I load the model and run inference on each image by looping over the image paths, for example reading the images with PIL.Image or cv2? Looking at test.py and detect.py, it seems there are many steps performed after passing the image into the network.

Please advise, thank you

WindVChen commented 2 years ago

You can refer to the Test Process part of the README.md. Note that before you run the code, make sure you have changed the val path in ship.yaml to the test path if you want to evaluate on the test set rather than val. Then, if you want to save all the detection results, set batch_size to 1 and the argument plot_batch_num to a large enough value (>= the length of the test dataset); otherwise you may only get the last image of each of the first plot_batch_num batches.

ramdhan1989 commented 2 years ago

Hi, I generated fake images inside the test folder. However, I got this error:

Fusing layers...
Model Summary: 239 layers, 4788248 parameters, 0 gradients
val: Scanning 'F:\slick\for_DRENet\test\labels.cache' for images and labels... 0 found, 39769 missing, 0 empty, 0 corru
               Class      Images     Targets           P           R      mAP@.5  mAP@.5:.95:   0%| | 0/39769 [00:00<?,
Traceback (most recent call last):
  File "C:\Users\Owner\slick\DRENet-main2\test.py", line 323, in <module>
    test(opt.data,
  File "C:\Users\Owner\slick\DRENet-main2\test.py", line 115, in test
    (out, train_out), pdg = model(img, augment=augment)  # inference and training outputs
  File "C:\Users\Owner\anaconda3\envs\detectron\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Owner\slick\DRENet-main2\models\yolo.py", line 131, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "C:\Users\Owner\slick\DRENet-main2\models\yolo.py", line 148, in forward_once
    x = m(x)  # run
  File "C:\Users\Owner\anaconda3\envs\detectron\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Owner\slick\DRENet-main2\models\common.py", line 217, in forward
    x=torch.from_numpy(x)
TypeError: expected np.ndarray (got Tensor)

I commented out line 217 in common.py, but it results in the following error:

Traceback (most recent call last):
  File "C:\Users\Owner\slick\DRENet-main2\test.py", line 323, in <module>
    test(opt.data,
  File "C:\Users\Owner\slick\DRENet-main2\test.py", line 115, in test
    (out, train_out), pdg = model(img, augment=augment)  # inference and training outputs
  File "C:\Users\Owner\anaconda3\envs\detectron\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Owner\slick\DRENet-main2\models\yolo.py", line 131, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "C:\Users\Owner\slick\DRENet-main2\models\yolo.py", line 148, in forward_once
    x = m(x)  # run
  File "C:\Users\Owner\anaconda3\envs\detectron\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Owner\anaconda3\envs\detectron\lib\site-packages\torch\nn\modules\upsampling.py", line 154, in forward
    recompute_scale_factor=self.recompute_scale_factor)
  File "C:\Users\Owner\anaconda3\envs\detectron\lib\site-packages\torch\nn\modules\module.py", line 1185, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'Upsample' object has no attribute 'recompute_scale_factor'

Please advise

thank you

WindVChen commented 2 years ago

For the first error, I actually could not find the mentioned code line x=torch.from_numpy(x); in the original code, line 217 of common.py is return self.conv(torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2], x[..., ::2, 1::2], x[..., 1::2, 1::2]], 1)).

As for the second error, it seems to be caused by a different PyTorch version; see https://github.com/ultralytics/yolov5/issues/5499#issue-1044394003. To fix it, you may just need to make some modifications to upsampling.py as described there.
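If you would rather not edit the installed upsampling.py, another workaround that is often used for this version mismatch is to patch the loaded model instead, e.g. a sketch like:

import torch.nn as nn

def patch_upsample(model):
    # Older checkpoints pickle nn.Upsample without the attribute newer PyTorch expects.
    for m in model.modules():
        if isinstance(m, nn.Upsample) and not hasattr(m, "recompute_scale_factor"):
            m.recompute_scale_factor = None
    return model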

ramdhan1989 commented 2 years ago

Thanks for your help. It works!

ramdhan1989 commented 1 year ago

Hi @WindVChen, I apologize for bothering you again. I was able to train successfully with a 512 image size. Now I must use an image size of 160, but it gives the following error:

                 from  n    params  module                                  arguments
  0                -1  1      5280  models.common.Focus                     [3, 48, 3]
  1                -1  1     41664  models.common.Conv                      [48, 96, 3, 2]
  2                -1  1     65280  models.common.C3                        [96, 96, 2]
  3                -1  1    166272  models.common.Conv                      [96, 192, 3, 2]
  4                -1  1    629760  models.common.C3                        [192, 192, 6]
  5                -1  1    664320  models.common.Conv                      [192, 384, 3, 2]
  6                -1  1   2512896  models.common.C3                        [384, 384, 6]
  7                -1  1   2655744  models.common.Conv                      [384, 768, 3, 2]
  8                -1  1   1476864  models.common.SPP                       [768, 768, [5, 9, 13]]
  9                -1  1   1706112  models.common.C3ResAtnMHSA              [768, 768, 2, 5, False]
 10                -1  1    295680  models.common.Conv                      [768, 384, 1, 1]
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 12           [-1, 6]  1         1  models.common.ConcatFusionFactor        [1]
 13                -1  1    578496  models.common.C3ResAtnMHSA              [768, 384, 2, 10, False]
 14                -1  1     74112  models.common.Conv                      [384, 192, 1, 1]
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']
 16           [-1, 4]  1         1  models.common.ConcatFusionFactor        [1]
 17                -1  1    148320  models.common.C3ResAtnMHSA              [384, 192, 2, 20, False]
 18                14  1    357312  models.common.C3ResAtnMHSA              [192, 384, 2, 10, False]
 19                10  1   1411200  models.common.C3ResAtnMHSA              [384, 768, 2, 5, False]
 20                 4  1   2667282  models.common.RCAN                      [192]
 21      [17, 18, 19]  1     92943  models.yolo.Detect                      [18, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [192, 384, 768]]
Traceback (most recent call last):
  File "train.py", line 515, in <module>
    train(hyp, opt, device, tb_writer, wandb)
  File "train.py", line 84, in train
    model = Model(opt.cfg, ch=3, nc=nc, anchors=hyp.get('anchors')).to(device)  # create
  File "C:\Users\Owner\axes\CRAFT-pytorch-master\plot_markers\DRENet\models\yolo.py", line 99, in __init__
    m.stride = torch.tensor([s / x.shape[-2] for x in self.forward(torch.zeros(1, ch, s, s))[0]])  # forward
  File "C:\Users\Owner\axes\CRAFT-pytorch-master\plot_markers\DRENet\models\yolo.py", line 131, in forward
    return self.forward_once(x, profile)  # single-scale inference, train
  File "C:\Users\Owner\axes\CRAFT-pytorch-master\plot_markers\DRENet\models\yolo.py", line 148, in forward_once
    x = m(x)  # run
  File "C:\Users\Owner\anaconda3\envs\slick\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Owner\axes\CRAFT-pytorch-master\plot_markers\DRENet\models\common.py", line 192, in forward
    return self.cv3(torch.cat((self.m(self.cv1(x)), self.cv2(x)), dim=1))
  File "C:\Users\Owner\anaconda3\envs\slick\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Owner\anaconda3\envs\slick\lib\site-packages\torch\nn\modules\container.py", line 141, in forward
    input = module(input)
  File "C:\Users\Owner\anaconda3\envs\slick\lib\site-packages\torch\nn\modules\module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "C:\Users\Owner\axes\CRAFT-pytorch-master\plot_markers\DRENet\models\common.py", line 137, in forward
    energy = content_content + content_position
RuntimeError: The size of tensor a (256) must match the size of tensor b (25) at non-singleton dimension 1

This is the yaml file of the model:

# parameters
nc: 18  # number of classes
depth_multiple: 0.67  # model depth multiple
width_multiple: 0.75  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   [-1, 3, C3ResAtnMHSA, [1024, 5, False]],  # 9
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, ConcatFusionFactor, [1]],  # cat backbone P4
   [-1, 3, C3ResAtnMHSA, [512, 10, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, ConcatFusionFactor, [1]],  # cat backbone P3
   [-1, 3, C3ResAtnMHSA, [256, 20, False]],  # 17 (P3/8-small)

   [14, 3, C3ResAtnMHSA, [512, 10, False]],  # 18 (P4/16-medium)

   [10, 3, C3ResAtnMHSA, [1024, 5, False]],  # 19 (P5/32-large)

   [4, 1, RCAN, []],
   [[17, 18, 19], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]

Please advise, thanks

WindVChen commented 1 year ago

@ramdhan1989 Hi, the error is raised because YOLOv5 performs an additional operation to calculate FLOPs and Params.

E.g., here: https://github.com/WindVChen/DRENet/blob/3325fa6b832b127a0ec1c7e2cf122665e60ad25e/models/yolo.py#L97-L99

Thus, you should change the image resolution used for that calculation. See the answer here: https://github.com/WindVChen/DRENet/issues/4#issuecomment-1238849102.
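In other words, both the dummy-forward resolution in yolo.py and the C3ResAtnMHSA sizes in the .yaml must match your training size. A small helper sketch (assuming the usual P3/P4/P5 strides of 8, 16 and 32):

def mhsa_sizes(img_size, strides=(8, 16, 32)):
    assert img_size % max(strides) == 0, "input size should be divisible by 32"
    return [img_size // s for s in strides]

print(mhsa_sizes(512))  # [64, 32, 16] -> the default settings (P3, P4, P5)
print(mhsa_sizes(416))  # [52, 26, 13] -> the values suggested earlier for 416
print(mhsa_sizes(160))  # [20, 10, 5]  -> the values used in the .yaml above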

ramdhan1989 commented 1 year ago

@ramdhan1989 Hi, the error is raised because YOLOv5 performs an additional operation to calculate FLOPs and Params.

E.g., here:

https://github.com/WindVChen/DRENet/blob/3325fa6b832b127a0ec1c7e2cf122665e60ad25e/models/yolo.py#L97-L99

Thus, you should change the image resolution used for that calculation. See the answer here: #4 (comment).

Thanks, it is running now. However, the training behavior seems strange after replacing all the default 512 values with 160: with the default 512 I got 18.1 GFLOPS, while with size 160 I only get 1.81 GFLOPS. I am not sure whether this is related or not. Please advise.

Thanks

WindVChen commented 1 year ago

The variation in FLOPs is normal: FLOPs are positively correlated with feature-map size, so the larger the input image, the larger the FLOPs.
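As a rough check against your numbers (the FLOPs of a fully convolutional model scale roughly with the number of spatial positions, i.e. with the square of the input side length):

print(round(18.1 * (160 / 512) ** 2, 2))   # ~1.77, close to the reported 1.81 GFLOPS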

ramdhan1989 commented 1 year ago

OK, noted. I am curious about this part of DegradeGenerate.py:

for i in range(img.shape[0]):
    for j in range(img.shape[1]):
        #print(cnt)
        #cnt+=1
        minDis = 130 * 130
        for center in centers:
            distance = (i - center[2]) ** 2 + (j - center[1]) ** 2
            if distance < minDis:
                minDis = distance
        # boxSize = (int(0.05* (minDis ** 0.5))) // 2 + 1

If I reduce the image size to 160, do I need to change minDis from 130 to another value? Thanks.

WindVChen commented 1 year ago

This value is set so that pixels far away from object targets are not made too blurry. If you apply the method to high-resolution images (e.g., finer than 4 m), the current value seems fine. For low-resolution images (e.g., coarser than 16 m), the value should be set smaller.

Another thing you may need to pay attention to is the design of the Degrade Function. Since objects in the Levir-Ship dataset are mostly under 20 pixels, the current function is designed not to blur pixels in the object area. Therefore, you may need to modify the function to adapt it to your own dataset. (For advice on the function design, you can refer to Fig. 8 in the paper.)
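As a rough illustration of the idea, here is a simplified sketch (not the actual DegradeGenerate.py code; centers is assumed to be a list of (cx, cy) object centers, and the base**distance growth follows the kernel-size function discussed below):

def degrade(img, centers, max_dist=130.0, base=1.03):
    # img is a NumPy/OpenCV image; the blur window grows with the distance to the nearest object center.
    out = img.copy()
    h, w = img.shape[:2]
    for i in range(h):
        for j in range(w):
            d = min([((i - cy) ** 2 + (j - cx) ** 2) ** 0.5 for cx, cy in centers]
                    + [max_dist])                      # distance to nearest center, capped
            half = int(base ** d) // 2                 # half the box size; 0 near objects -> no blur
            if half > 0:
                patch = img[max(0, i - half):i + half + 1, max(0, j - half):j + half + 1]
                out[i, j] = patch.mean(axis=(0, 1))
    return out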

ramdhan1989 commented 1 year ago

I am trying to combine your suggestions with the paper and your comments here. Let me show you some examples. The image size is 160*160 and the values below are normalized.

  1. [image]

     For this case, I will set minDis to 140, assuming an object size of roughly 24 pixels (160*0.15). My box size would then be 31 ((1.03^140)/2), which is between 24 and 40 (1/4 of the image size).

  2. [image]

     For this case, I will keep the default minDis of 130, assuming an object size of roughly 8 pixels (160*0.05). My box size would then be 23 ((1.03^130)/2), which is between 8 and 40 (1/4 of the image size).

  3. [image]

     For this case, I will set minDis to 120, assuming an object size of roughly 8 pixels (160*0.05). My box size would then be 17 ((1.03^120)/2), which is between 8 and 40 (1/4 of the image size).

Is my configuration above good enough? Please advise.

thanks

WindVChen commented 1 year ago

Sorry for my late reply. The settings of minDis seem OK. You can give them a try; I think the difference in minDis will not affect the final result much.

What may matter more is still the design of the Degrade Function. For example, in your Example 1, I notice that the objects cover a large range, reaching as much as 0.6*160 = 96 pixels. The current function is then not suitable: as seen in https://github.com/WindVChen/DRENet/issues/3#issuecomment-1236288998, the degrade operation is applied to pixels whose distance to the object target is larger than 20. Thus, the large object targets in Example 1 will be partially blurred, which may lead to an accuracy drop. I therefore suggest modifying the function to keep more of the object targets clear.

For Example 3, on the other hand, where the maximum target size only reaches 0.06*160 = 9.6 pixels, the current function will leave much of the background unblurred, so you also need to modify the function.

ramdhan1989 commented 1 year ago

For Example 1, I changed the kernel-size function to y = 1.0066^x, assuming the largest object is 96 pixels (1.0066^96 ≈ 1.88 based on the equation). For Example 3, I changed the kernel-size function to y = 1.066^x, assuming the largest object is 10 pixels (1.066^10 ≈ 1.89 based on the equation). Does that make sense?

thanks

WindVChen commented 1 year ago

For Example 1, the current design will lead to a rather smooth curve, which may not achieve the goal of blurring the background.

[image]

Since the distance here is measured from a pixel to the center point of the object, for a 96x96 object I think it is enough to keep pixels within a distance of about 67 (96/2*1.4) pixels from the object center clear. Maybe you can try y = 1.012^x.

The Example 3 design seems fine.
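For a rough feel of the difference between the curves (assuming the kernel size grows roughly as base^distance, as in your examples):

for base in (1.03, 1.012, 1.0066):
    print(base, [round(base ** d, 1) for d in (20, 60, 96, 130)])
# 1.03   -> [1.8, 5.9, 17.1, 46.6]
# 1.012  -> [1.3, 2.0, 3.1, 4.7]
# 1.0066 -> [1.1, 1.5, 1.9, 2.4]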