JadenLy opened this issue 2 years ago
@JadenLy This replacement is in my plans as well. But it will need some accuracy alignment, and I will need to know what impact it would have on the TorchVision version requirement and on mixed precision training. It may not happen very soon. Do you have any preliminary results from the replacement that you can share?
@voldemortX Thanks for your response. Since it is a class project for me, I can try to figure out the implementation; as long as the training process looks okay (e.g., the loss decreasing steadily), I can call it done. But I do not have enough computational power to fully verify the impact. I can provide you with my implementation once I have it, and I would appreciate any suggestions you may have on it.
@JadenLy That sounds great! Feel free to make a pull request once you have the implementation.
Hi @voldemortX, just a question about the dimensions, as I am having difficulty making my implementation work. My current implementation of the FFF module (designed to have exactly the same functionality as your module) is as follows:
import math

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import deform_conv2d


class FeatureFlipFusion(nn.Module):
    def __init__(self, channels, kernel_size=(3, 3), groups=1, deform_groups=1) -> None:
        super().__init__()
        self.channels = channels
        self.kernel_size = kernel_size
        self.conv = Conv2DBlock(channels, channels, relu=False, kernel_size=1, padding=0)  # conv + BN helper
        self.norm = nn.BatchNorm2d(channels)
        # predicts offsets (2 * kH * kW channels) and modulation mask (kH * kW channels)
        # from the concatenated original and flipped features
        self.conv_offset = nn.Conv2d(
            channels * 2,
            deform_groups * 3 * self.kernel_size[0] * self.kernel_size[1],
            kernel_size=self.kernel_size,
            padding=1,
            bias=True)
        self.weight = nn.Parameter(torch.Tensor(channels, channels // groups, *kernel_size))
        self.bias = nn.Parameter(torch.Tensor(channels))
        self.init_weights()

    def init_weights(self):
        self.conv_offset.weight.data.zero_()
        self.conv_offset.bias.data.zero_()
        n = self.channels
        for k in self.kernel_size:
            n *= k
        stdv = 1. / math.sqrt(n)
        self.weight.data.uniform_(-stdv, stdv)
        self.bias.data.zero_()

    def forward(self, x):
        flip = x.flip(-1)  # 256 * 23 * 40
        x = self.conv(x)  # 256 * 23 * 40
        # deformable
        concat = torch.cat([flip, x], dim=1)  # 512 * 23 * 40
        out = self.conv_offset(concat)  # 27 * 23 * 40
        o1, o2, mask = torch.chunk(out, 3, dim=1)
        offset = torch.cat((o1, o2), dim=1)  # 18 * 23 * 40
        mask = torch.sigmoid(mask)  # 9 * 23 * 40
        flip = deform_conv2d(flip, offset, self.weight, self.bias, mask=mask)
        return F.relu(self.norm(flip) + x)
I tried to recover your settings (padding, stride, etc.) from the mmcv code. I have also annotated the dimensions of each variable after the corresponding call inline above, excluding the batch size. Conv2DBlock is a class that runs a conv followed by BN, which I believe should be correct. The problem is that deform_conv2d appears to need the offset to be 21 * 38, but the self.conv_offset layer I am using does not change the spatial dimensions here. So I wonder if you have any suggestions on the implementation. Let me know. Thanks!
@JadenLy Great work! I'll look into it soon.
@JadenLy I think you should add padding=(1,1) to the call, i.e. flip = deform_conv2d(xxx, padding=(1,1))
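For reference, a minimal sketch of what the corrected call could look like (torchvision's deform_conv2d defaults to padding=(0, 0), so with a 3 * 3 kernel a 23 * 40 input shrinks to 21 * 38, which is why it expects a 21 * 38 offset map; the names follow the snippet above):

flip = deform_conv2d(flip, offset, self.weight, self.bias,
                     padding=(1, 1),  # keep the 23 * 40 spatial size, matching the offset/mask maps
                     mask=mask)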
Thanks, I think your solution resolves the error!
Another question I have is about the curves predicted in https://github.com/voldemortX/pytorch-auto-drive/blob/master/utils/models/lane_detection/bezier_lane_net.py#L68 and used in https://github.com/voldemortX/pytorch-auto-drive/blob/master/utils/losses/hungarian_bezier_loss.py#L49. In my experiment, I found that the curves returned from the model have shape [8, 22, 2], while the curves used for the loss need shape [44, 4, 2]. After some debugging, I managed to make it work by modifying the reshape of the model output to
curves.permute(0, 2, 1).reshape(curves.shape[0], -1, curves.shape[-1] // 2, 2).contiguous()
I wonder if you have encountered such an error, or if there is a mismatch in the package version I am using (latest torch) that causes it. Thanks!
Also, I wonder if you have the training loss after 400 epochs for the TuSimple dataset for BezierLaneNet; just a rough number would be pretty helpful!
I use torch 1.6 by default in training, and torch 1.8 for ONNX conversion tests. I have not experienced this issue; it seems your permute happened in-place or something. Maybe check the torch release notes for this? I'll mark this as a possible bug.
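For what it's worth, a minimal sketch of the generic PyTorch contiguity behavior that could be at play here (not code from the repo): permute returns a non-contiguous view, so a following view fails unless .contiguous() is called first, while reshape silently copies when needed.

import torch

x = torch.randn(2, 3, 4)
y = x.permute(0, 2, 1)          # non-contiguous view, shape [2, 4, 3]
# y.view(2, -1)                 # would raise a RuntimeError on this non-contiguous tensor
z = y.reshape(2, -1)            # reshape falls back to a copy when a view is impossible
w = y.contiguous().view(2, -1)  # equivalent, with an explicit copy
print(z.shape, torch.equal(z, w))  # torch.Size([2, 12]) True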
FYI, the whole training loss is around 0.025, and the curve loss is around 0.0075. The tensorboard logs corresponding to the best resnet18 & resnet34 models are here: bezier_loss.zip
Hi @voldemortX, I was able to fully train the model with the deform_conv2d module from PyTorch. I implemented the model and some other functions while reusing some of your code. After 200 epochs with your default parameters, I got an accuracy of 88.67, while the FPR (0.31) and FNR (0.21) are both relatively high. I examined some images and found samples with bad curve fitting or extra lanes. So I would suggest performing the training on your end to see whether my result simply comes from the change, or whether there is something I implemented wrong. Thanks!
@JadenLy Can you open a pull request with your DCN implementation, and I will test whether it aligns with the mmcv one?
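For readers following along, an alignment check like the one proposed could look roughly like the sketch below. It assumes mmcv is installed (its modulated deform conv op may require a CUDA build) and that mmcv.ops.ModulatedDeformConv2d is the layer being replaced; whether the outputs actually match depends on both implementations expecting the same offset/mask channel layout, which is exactly what needs verifying.

import torch
from torchvision.ops import deform_conv2d
from mmcv.ops import ModulatedDeformConv2d  # the mmcv layer being replaced

torch.manual_seed(0)
m = ModulatedDeformConv2d(4, 4, kernel_size=3, padding=1, bias=True)
x = torch.randn(2, 4, 8, 8)
offset = torch.randn(2, 2 * 3 * 3, 8, 8)           # 2 * kH * kW offset channels
mask = torch.sigmoid(torch.randn(2, 3 * 3, 8, 8))  # kH * kW modulation channels

ref = m(x, offset, mask)
out = deform_conv2d(x, offset, m.weight, m.bias, padding=(1, 1), mask=mask)
print(torch.allclose(ref, out, atol=1e-5))  # True only if the layouts agree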
Hello,
I am doing a school project and came across your paper on BezierLaneNet. I really like the idea and am trying to implement it. One thing I noticed is that you opted to use mmcv for the deformable convolution layer. I wonder if you think it is possible to replace it with the deform_conv2d layer provided by PyTorch, as here. It looks like deform_conv2d should provide the same functionality, based on the paper it cites. Feel free to give me any suggestions you have. Due to my limited computational resources, I am only running on CPU, so I hope to base the implementation on PyTorch if possible. Thanks!
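For context, a minimal usage sketch of torchvision's deform_conv2d (it implements modulated deformable convolution, i.e. DCNv2, when a mask is passed; all sizes below are arbitrary example values):

import torch
from torchvision.ops import deform_conv2d

x = torch.randn(1, 4, 8, 8)
weight = torch.randn(4, 4, 3, 3)                   # (out_ch, in_ch // groups, kH, kW)
offset = torch.randn(1, 2 * 3 * 3, 8, 8)           # 2 * kH * kW offset channels
mask = torch.sigmoid(torch.randn(1, 3 * 3, 8, 8))  # kH * kW modulation channels
out = deform_conv2d(x, offset, weight, padding=(1, 1), mask=mask)
print(out.shape)  # torch.Size([1, 4, 8, 8])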