Closed helloyua closed 5 months ago
Hi, since the training step completes successfully and its scores are computed without error, the code itself does not seem to be the problem. Instead, check the dataset images, and in particular that the ground-truth labels are single-channel. If your ground-truth labels are 3-channel images, you can add [:,:,0] at the end of line 118 in CD_dataset.py. Hope it helps.
Thanks for the reply and the clear explanation. However, the dataset does not seem to be the problem: I'm using the LEVIR dataset, whose labels are 1024 × 1024 single-channel grayscale images. After I added [:,:,0], I got the following error:
0%| | 0/56 [00:00<?, ?it/s]True
0%| | 0/56 [00:00<?, ?it/s]
...... ...... ......
  File "G:\elgcnet-main\elgcnet-main\datasets\CD_dataset.py", line 118, in __getitem__
    label = np.array(Image.open(L_path), dtype=np.uint8)[:,:,0]
IndexError: too many indices for array: array is 2-dimensional, but 3 were indexed
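The IndexError arises because a single-channel label is already a 2-D array, so indexing it with [:,:,0] asks for a third axis that does not exist. A minimal sketch of a guard that accepts both layouts follows; `to_single_channel` is a hypothetical helper, not part of the repository, and it assumes that a 3-channel label carries the same mask in every channel.

```python
import numpy as np


def to_single_channel(label):
    """Return a 2-D uint8 label map from a 2-D or 3-D (H, W, C) array."""
    label = np.asarray(label, dtype=np.uint8)
    if label.ndim == 3:
        # Assumption: every channel holds the same mask, so keep the first one.
        label = label[:, :, 0]
    elif label.ndim != 2:
        raise ValueError(f"unexpected label shape: {label.shape}")
    return label


if __name__ == "__main__":
    gray = np.zeros((4, 4), dtype=np.uint8)    # already single-channel
    rgb = np.zeros((4, 4, 3), dtype=np.uint8)  # 3-channel label
    print(to_single_channel(gray).shape, to_single_channel(rgb).shape)
```

With such a guard, the same loading line would work for both single-channel and 3-channel label files.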
Hi,
We used the pre-processed LEVIR-CD dataset with an image size of 256x256. You can either apply non-overlapping cropping to the 1024x1024 images to get 256x256 images, or change the image size in the main_cd.py file to 1024.
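For illustration, non-overlapping cropping of a 1024x1024 image into 256x256 tiles can be sketched as below. `crop_tiles` is a hypothetical helper, not the pre-processing script the authors used, and it assumes the image height and width are exact multiples of the tile size.

```python
import numpy as np


def crop_tiles(image, tile=256):
    """Split an (H, W) or (H, W, C) array into non-overlapping tile x tile crops.

    Tiles are returned in row-major order (left to right, then top to bottom).
    """
    h, w = image.shape[:2]
    if h % tile or w % tile:
        raise ValueError(f"image size {h}x{w} is not a multiple of {tile}")
    tiles = []
    for top in range(0, h, tile):
        for left in range(0, w, tile):
            tiles.append(image[top:top + tile, left:left + tile])
    return tiles


if __name__ == "__main__":
    image = np.zeros((1024, 1024, 3), dtype=np.uint8)
    tiles = crop_tiles(image)
    print(len(tiles), tiles[0].shape)
```

The same crop would need to be applied to the pre-change image, the post-change image, and the label so that the three stay aligned.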
Thank you very much for your help, it works!
I really appreciate your help. However, at runtime a V100 graphics card can hardly handle the network's overhead. Did you train on multiple cards in parallel? In addition, is the lightweight version mentioned in your paper also included in this project? How can I use and train it? Thanks again for your patience.
I recommend using a 256x256 image size. You can apply non-overlapping cropping to each 1024x1024 image to obtain 16 sub-images of size 256x256. Otherwise, you can try a reduced batch size, and increase the kernel size and stride of the pooling layers in the attention. Additionally, you can use the decoder with fewer parameters from the code below:
import torch
import torch.nn as nn
import torch.nn.functional as F

# LinearProj, Fusion_Block, ConvLayer and resize are not defined in this snippet;
# they are assumed to come from the project's existing model code.


class LightHeadDecoder(nn.Module):
    """
    Transformer Decoder
    """
    def __init__(self, in_channels = [32, 64, 128, 256], embedding_dim=64, output_nc=2, align_corners=True):
        super(LightHeadDecoder, self).__init__()

        # Settings
        self.align_corners = align_corners
        self.in_channels = in_channels
        self.embedding_dim = embedding_dim
        self.output_nc = output_nc
        c1_in_channels, c2_in_channels, c3_in_channels, c4_in_channels = self.in_channels

        # Channel reduction of feature maps before merging
        self.linear_c4 = LinearProj(input_dim=c4_in_channels, embed_dim=self.embedding_dim)
        self.linear_c3 = LinearProj(input_dim=c3_in_channels, embed_dim=self.embedding_dim)
        self.linear_c2 = LinearProj(input_dim=c2_in_channels, embed_dim=self.embedding_dim)
        self.linear_c1 = LinearProj(input_dim=c1_in_channels, embed_dim=self.embedding_dim)

        # Linear fusion layer to combine multi-scale features of all stages
        self.linear_fuse = nn.Sequential(
            nn.Conv2d(in_channels=self.embedding_dim*len(in_channels), out_channels=self.embedding_dim, kernel_size=1, padding=0, stride=1),
            nn.BatchNorm2d(self.embedding_dim)
        )

        # Per-stage fusion of the two images' features
        self.diff_c1 = Fusion_Block(in_channels=self.embedding_dim)
        self.diff_c2 = Fusion_Block(in_channels=self.embedding_dim)
        self.diff_c3 = Fusion_Block(in_channels=self.embedding_dim)
        self.diff_c4 = Fusion_Block(in_channels=self.embedding_dim)

        # Final prediction head
        self.dense_2x = nn.Sequential(
            nn.Conv2d(in_channels=self.embedding_dim, out_channels=self.embedding_dim, kernel_size=3, padding=1, stride=1),
            nn.ReLU(),
            nn.BatchNorm2d(self.embedding_dim),
            nn.Conv2d(in_channels=self.embedding_dim, out_channels=self.embedding_dim, kernel_size=3, padding=1, stride=1, groups=self.embedding_dim),
        )
        self.dense_1x = nn.Sequential(
            nn.Conv2d(in_channels=self.embedding_dim, out_channels=self.embedding_dim, kernel_size=3, padding=1, stride=1, groups=self.embedding_dim),
            nn.ReLU(),
            nn.BatchNorm2d(self.embedding_dim),
            nn.Conv2d(in_channels=self.embedding_dim, out_channels=self.embedding_dim, kernel_size=1, padding=0, stride=1)
        )
        self.change_probability = ConvLayer(self.embedding_dim, self.output_nc, kernel_size=3, stride=1, padding=1)

        # Final activation (defined here but not applied in forward)
        self.active = nn.Sigmoid()

    def forward(self, inputs1, inputs2):
        # img1 and img2 features
        c1_1, c2_1, c3_1, c4_1 = inputs1  # len=4, 1/4, 1/8, 1/16, 1/32
        c1_2, c2_2, c3_2, c4_2 = inputs2  # len=4, 1/4, 1/8, 1/16, 1/32

        ############## MLP decoder on C1-C4 ###########
        n, _, h, w = c4_1.shape
        outputs = []

        # Stage 4: x1/32 scale
        _c4_1 = self.linear_c4(c4_1)
        _c4_2 = self.linear_c4(c4_2)
        _c4 = self.diff_c4([_c4_1, _c4_2])
        _c4_up = resize(_c4, size=c1_2.size()[2:], mode='bilinear', align_corners=False)

        # Stage 3: x1/16 scale
        _c3_1 = self.linear_c3(c3_1)
        _c3_2 = self.linear_c3(c3_2)
        _c3 = self.diff_c3([_c3_1, _c3_2])
        _c3_up = resize(_c3, size=c1_2.size()[2:], mode='bilinear', align_corners=False)

        # Stage 2: x1/8 scale
        _c2_1 = self.linear_c2(c2_1)
        _c2_2 = self.linear_c2(c2_2)
        _c2 = self.diff_c2([_c2_1, _c2_2])
        _c2_up = resize(_c2, size=c1_2.size()[2:], mode='bilinear', align_corners=False)

        # Stage 1: x1/4 scale
        _c1_1 = self.linear_c1(c1_1)
        _c1_2 = self.linear_c1(c1_2)
        _c1 = self.diff_c1([_c1_1, _c1_2])

        # Linear fusion of the difference features from all scales
        _c = self.linear_fuse(torch.cat([_c4_up, _c3_up, _c2_up, _c1], dim=1))

        # Upsampling x2 (x1/2 scale)
        x = F.interpolate(_c, scale_factor=2, mode="bilinear")
        # Residual block
        x = x + self.dense_2x(x)

        # Upsampling x2 (x1 scale)
        x = F.interpolate(x, scale_factor=2, mode="bilinear")
        # Residual block
        x = x + self.dense_1x(x)

        # Final prediction
        cp = self.change_probability(x)
        outputs.append(cp)

        return outputs
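Several of the 3x3 convolutions in this decoder set groups=self.embedding_dim, which makes them depthwise. As a rough illustration of how much that reduces the parameter count, the sketch below applies the standard weight-count formula for a 2-D convolution; `conv2d_params` is a hypothetical helper for this arithmetic only and is not part of the repository.

```python
def conv2d_params(in_ch, out_ch, kernel, groups=1, bias=True):
    """Number of learnable parameters in a 2-D convolution with a square kernel."""
    if in_ch % groups or out_ch % groups:
        raise ValueError("in_ch and out_ch must both be divisible by groups")
    # Each output channel sees only in_ch // groups input channels.
    weights = out_ch * (in_ch // groups) * kernel * kernel
    return weights + (out_ch if bias else 0)


if __name__ == "__main__":
    dim = 64  # default embedding_dim of the decoder
    print("standard 3x3 :", conv2d_params(dim, dim, 3))              # 36928
    print("depthwise 3x3:", conv2d_params(dim, dim, 3, groups=dim))  # 640
    print("pointwise 1x1:", conv2d_params(dim, dim, 1))              # 4160
```

Under these assumptions, a depthwise 3x3 layer at embedding_dim=64 needs 640 parameters where a standard one needs 36928. The default embedding_dim of 64 is itself smaller than the dec_embed_dim of 256 that appears in the training log later in this thread.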
You've been a great help, I appreciate it!
Thank you for your outstanding work. However, I have run into a problem that I would like to ask you about: when the confusion matrix is updated during the first validation step, an error reports that the index does not match.
The following is the error message:
root@autodl-container-138c41aa1e-268199cb:~/elgcnet# python main_cd.py
True [0] cuda:0
================ (Wed May 8 21:12:16 2024) ================
gpu_ids: [0]
project_name: elgcnet_levir
checkpoint_root: ./checkpoints
vis_root: ./vis
num_workers: 8
dataset: CDDataset
data_name: LEVIR
batch_size: 32
split: train
split_val: val
img_size: 256
n_class: 2
dec_embed_dim: 256
pretrain: None
net_G: ELGCNet
loss: ce
optimizer: adamw
lr: 0.00031
max_epochs: 300
lr_policy: linear
lr_decay_iters: [100]
checkpoint_dir: ./checkpoints/elgcnet_levir
vis_dir: ./vis/elgcnet_levir
training from scratch...
lr: 0.0003100
0%| | 0/14 [00:00<?, ?it/s]/root/miniconda3/lib/python3.8/site-packages/torchvision/transforms/functional.py:404: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
  warnings.warn(
7%|████████████▊ | 1/14 [00:12<02:38, 12.23s/it]Is_training: True.
[0,299][1,14], imps: 138.45, est: 8.62h, G_loss: 0.65767, running_mf1: 0.47853
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 14/14 [00:37<00:00, 2.66s/it]
Is_training: True. Epoch 0 / 299, epoch_mF1= 0.50163
acc: 0.85369 miou: 0.44794 mf1: 0.50163 iou_0: 0.85272 iou_1: 0.04317 F1_0: 0.92051 F1_1: 0.08276 precision_0: 0.95535 precision_1: 0.05825 recall_0: 0.88812 recall_1: 0.14289
Begin evaluation...
/root/miniconda3/lib/python3.8/site-packages/torchvision/transforms/functional.py:404: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
  warnings.warn(
Traceback (most recent call last):
  File "main_cd.py", line 72, in <module>
    train(args)
  File "main_cd.py", line 11, in train
    model.train_models()
  File "/root/elgcnet/models/trainer.py", line 332, in train_models
    self._collect_running_batch_states()
  File "/root/elgcnet/models/trainer.py", line 203, in _collect_running_batch_states
    running_acc = self._update_metric()
  File "/root/elgcnet/models/trainer.py", line 198, in _update_metric
    current_score = self.running_metric.update_cm(pr=G_pred.cpu().numpy(), gt=target.cpu().numpy())
  File "/root/elgcnet/misc/metric_tool.py", line 55, in update_cm
    val = get_confuse_matrix(num_classes=self.n_class, label_gts=gt, label_preds=pr)
  File "/root/elgcnet/misc/metric_tool.py", line 155, in get_confuse_matrix
    confusion_matrix += fast_hist(lt.flatten(), lp.flatten())
  File "/root/elgcnet/misc/metric_tool.py", line 150, in fast_hist
    hist = np.bincount(num_classes * label_gt[mask].astype(int) + label_pred[mask],
IndexError: boolean index did not match indexed array along dimension 0; dimension is 65536 but corresponding boolean dimension is 1048576
There is also a warning message:
UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
I sincerely hope to receive your reply.
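The two sizes in the IndexError factor as 65536 = 256 × 256 and 1048576 = 1024 × 1024. That would be consistent with the flattened prediction and the flattened ground truth coming from images of different sizes, for example 1024x1024 validation labels paired with 256x256 predictions, although nothing in this thread confirms it. The sketch below is a hypothetical variant of fast_hist, not the repository's code: it assumes the mask is built from label_gt in the usual way and adds an explicit shape check so that a mismatch is reported directly.

```python
import numpy as np


def fast_hist(label_gt, label_pred, num_classes=2):
    """Confusion matrix (rows: ground truth, columns: prediction) for flat label arrays."""
    if label_gt.shape != label_pred.shape:
        raise ValueError(
            f"ground truth has {label_gt.size} pixels but prediction has "
            f"{label_pred.size}; check that both come from the same image size"
        )
    # Assumption: valid pixels are those whose ground-truth class is in range.
    mask = (label_gt >= 0) & (label_gt < num_classes)
    hist = np.bincount(
        num_classes * label_gt[mask].astype(int) + label_pred[mask].astype(int),
        minlength=num_classes ** 2,
    ).reshape(num_classes, num_classes)
    return hist


if __name__ == "__main__":
    gt = np.zeros(1024 * 1024, dtype=np.uint8)  # flattened 1024x1024 label
    pred = np.zeros(256 * 256, dtype=np.int64)  # flattened 256x256 prediction
    try:
        fast_hist(gt, pred)
    except ValueError as err:
        print(err)
```

If the validation split does contain 1024x1024 images, cropping it to 256x256 in the same way as the training split would be one thing to try.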