robot-learning-freiburg / MM-DistillNet

PyTorch code for training MM-DistillNet for multimodal knowledge distillation. http://rl.uni-freiburg.de/research/multimodal-distill
GNU General Public License v3.0

Question in Train #12

Closed muzhaohui closed 2 years ago

muzhaohui commented 3 years ago

Hello there! First of all, thank you for your outstanding work! I ran into a problem when reproducing it.

I used the model and the config you provided for training, but the results are very poor.

[image]

Training stopped at epoch 21.

[image] [image]

The mAP is only 48. Is the dataset you used different from the one provided? When I evaluate with the model you provided (distillnet.0.pth.tar), the result is even worse than this! [image]

So is the best model you provided the wrong one?

liushibei commented 2 years ago

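Hello, I would like to ask what your config file looks like and whether you made any changes to the code. The mAP I get from my own runs is only in the 30s.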

muzhaohui commented 2 years ago

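Check whether each teacher network is actually loaded. I found that the thermal teacher was not fully loaded. You need to change the layer names of the teacher network model at the point where the model is loaded.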
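
The layer renaming described above can be sketched as follows. This is a minimal illustration, not the repository's loading code: the function `remap_checkpoint_keys`, the `module.` prefix, and the toy key names are all hypothetical. Plain dictionaries stand in for a checkpoint and a model's `state_dict`, so the example runs without PyTorch.

```python
from collections import OrderedDict


def remap_checkpoint_keys(checkpoint, model_keys, old_prefix, new_prefix):
    """Rename checkpoint keys, then report matched and missing model keys."""
    remapped = OrderedDict()
    for key, value in checkpoint.items():
        # Swap the prefix only on keys that actually carry it.
        if key.startswith(old_prefix):
            key = new_prefix + key[len(old_prefix):]
        remapped[key] = value
    matched = [k for k in model_keys if k in remapped]
    missing = [k for k in model_keys if k not in remapped]
    return remapped, matched, missing


# Toy stand-ins (hypothetical names) for a checkpoint and the model's keys.
checkpoint = OrderedDict([("module.backbone.weight", 1), ("module.head.bias", 2)])
model_keys = ["backbone.weight", "head.bias"]

# Without renaming, none of the model's layers find a matching entry.
_, matched_before, _ = remap_checkpoint_keys(checkpoint, model_keys, "", "")
# After stripping the assumed "module." prefix, every layer is matched.
remapped, matched_after, missing = remap_checkpoint_keys(
    checkpoint, model_keys, "module.", ""
)

print(f"matched before renaming: {len(matched_before)}/{len(model_keys)}")
print(f"matched after renaming: {len(matched_after)}/{len(model_keys)}")
```

With a real PyTorch model, passing the remapped dictionary to `model.load_state_dict(remapped, strict=True)` raises an error on any missing or unexpected key, which makes a partially loaded teacher hard to overlook.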

muzhaohui commented 2 years ago

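The pseudo-labels produced by these teacher networks are very inaccurate. If you inspect the teacher annotations under the default configuration, you will find that many boxes are drawn on empty air or are duplicated on the same object. Setting conf_threshold = 0.5 helps a little, but problems remain.

Also, the MTA loss in this paper's code has a problem: a KL divergence loss should normally never be negative. Changing it as shown in the image below helps a lot: [image]

My own improved version changes the model architecture to use EfficientV2, adds various data augmentations, and uses a few small tricks. The best result so far is shown below: [image]

In the end I switched to a different dataset and a different framework and gave up on improving this paper, because I felt it lacked novelty; for this kind of recognition in autonomous driving, LiDAR is currently the more mainstream choice. I suggest that anyone who wants to improve on this paper first sort out the label problem (for example, by proposing an algorithm for generating pseudo-labels from the teacher networks). Otherwise, even if performance improves, you cannot be sure the improvement is real.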
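
The point that a KL divergence should never be negative can be checked numerically. The corrected loss from the screenshot is not reproduced in this thread, so the following is only a minimal sketch of one possible cause, under the assumption that the inputs to the KL term were not normalized probability distributions. The helper names and the toy arrays are hypothetical, and NumPy is used instead of the repository's PyTorch code.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - np.max(x, axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=axis, keepdims=True)


def kl_divergence(teacher_logits, student_logits, eps=1e-12):
    """KL(P_teacher || P_student), averaged over the first (batch) axis."""
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    per_sample = np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)
    return float(np.mean(per_sample))


rng = np.random.default_rng(0)
teacher = rng.normal(size=(4, 16))
student = rng.normal(size=(4, 16))

kl_same = kl_divergence(teacher, teacher)  # identical inputs give zero
kl_diff = kl_divergence(teacher, student)  # different inputs give a positive value

# The same formula applied to values that do not sum to 1 can go negative.
p_raw = np.array([0.1, 0.1])
q_raw = np.array([0.5, 0.5])
kl_unnormalized = float(np.sum(p_raw * np.log(p_raw / q_raw)))

print(kl_same, kl_diff, kl_unnormalized)
```

Normalizing both inputs with a softmax before taking the logarithm is what keeps the result at or above zero here; whether this matches the fix shown in the screenshot cannot be confirmed from the thread.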

liushibei commented 2 years ago

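I checked my log file, and my thermal teacher model does not appear to have loaded incorrectly:

    using path=trained_models/yet-another-efficientdet-d2-thermal.pth
    ModelDict Update:1076/1076

Strangely, though, as soon as I use the thermal teacher, the final accuracy drops a lot.

With only the RGB and depth teachers: [image]

With the RGB, depth, and thermal teachers: [image]

If my model loading is fine, I do not know where the problem lies: in the data loading, the configuration, or the code. In addition, I found that the thermal image input is all zeros, so the thermal teacher's predictions are all empty. I modified the code so that the input now contains data, but the final trained result has a lower mAP than when the input was all zeros. [image] [image]

This is my config file: 12_t_yes

I really cannot find the problem at the moment, so I am asking for your help, in the hope of at least reproducing this paper's code.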

muzhaohui commented 2 years ago

I just remembered that the thermal image input also has a problem: the data format of the tensor needs to be changed. Below is my modified data-processing code:

    # Requires: import cv2; import numpy as np
    thermal = None
    if self.use_thermal:
        # Read the thermal image at its native bit depth.
        thermal = cv2.imread(thermal_path, cv2.IMREAD_ANYDEPTH)
        if thermal is None:
            # cv2.imread returns None on failure instead of raising.
            raise FileNotFoundError(f"thermal={thermal_path}")
        thermal = thermal[:, self.crop_left : self.crop_right].astype(np.float32)
        # thermal = thermal * 256
        thermal = thermal * 255
        # normalize IR data
        # (is in range 0, 2**16 --> crop to relevant range(20800, 27000))
        thermal = np.clip(thermal, self.ir_minval, self.ir_maxval)
        thermal = (thermal - self.ir_minval) / (self.ir_maxval - self.ir_minval)
        # Stretch the result to [0, 255] with min-max normalization.
        thermal = cv2.normalize(thermal, None, 0, 255, cv2.NORM_MINMAX)
        thermal = thermal.astype(np.float32)
        # Optional debugging: save the thermal image as a JPEG.
        # plt.imshow(thermal, cmap="plasma")
        # plt.axis("off")
        # plt.savefig(
        #     f"out/thermal/{item}_thermal.jpg", bbox_inches="tight", pad_inches=0.0
        # )
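
The clipping step in the code above maps every pixel at or below the lower bound to zero, which is one way an all-zero thermal input could arise. The thread does not confirm that this was the actual cause, so the sketch below is only an illustration under that assumption. The function names and the synthetic frames are hypothetical; the range (20800, 27000) is the one quoted in the code comment.

```python
import numpy as np


def clip_and_scale(raw, min_val, max_val):
    """Clip raw values to [min_val, max_val] and rescale them to [0, 1]."""
    clipped = np.clip(raw.astype(np.float32), min_val, max_val)
    return (clipped - min_val) / (max_val - min_val)


def fraction_at_floor(raw, min_val):
    """Fraction of pixels at or below the lower clipping bound."""
    return float(np.mean(raw <= min_val))


IR_MIN, IR_MAX = 20800, 27000  # range quoted in the code comment above

# Synthetic 16-bit frames (hypothetical data, not taken from the dataset).
in_range = np.array([[20800, 23900], [27000, 30000]], dtype=np.uint16)
below_range = np.array([[80, 90], [100, 110]], dtype=np.uint16)

scaled_ok = clip_and_scale(in_range, IR_MIN, IR_MAX)
scaled_zero = clip_and_scale(below_range, IR_MIN, IR_MAX)
floor_fraction = fraction_at_floor(below_range, IR_MIN)

print(scaled_ok)       # values spread over [0, 1]
print(scaled_zero)     # every pixel collapses to 0
print(floor_fraction)  # share of pixels lost to the lower bound
```

Logging a statistic such as `fraction_at_floor` for a few frames, before and after any rescaling of the raw values, is a cheap way to see whether the chosen scale and clipping range actually overlap.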

liushibei commented 2 years ago

OK, thank you for your help.

Shiming94 commented 1 year ago

Hi, thank you very much for your comments. We are also trying to reproduce this paper and have run into some problems. Could you please tell me which dataset and methods you mentioned in this thread? We would also prefer LiDAR point clouds as input. Thank you in advance.

RST2detection commented 4 months ago

Hi, I have only achieved 25 AP using the best model provided by the authors, even after modifying the KL divergence calculation in MTAloss.py and the infrared image normalization in MultimodelDetection.py. I also retrained the audio student network with all three modalities together as teachers, using the code provided by the authors, and only reached an AP of 45. Is there something else in the authors' code that needs to be modified? I do not know how else to contact you, and I hope you can help me reproduce the results reported by the authors. Would it be possible to get in touch? My email: 2422316893@qq.com.

lix4 commented 3 months ago

Hello, I would like to ask whether you managed to solve at least the teacher network's labeling errors. My teacher network also produces bounding boxes floating above the actual objects. I set conf_threshold to 0.5 but get the same evaluation result. Do I need to retrain after changing it? Thank you!
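
For readers unsure what a confidence threshold on teacher pseudo-labels does, here is a minimal sketch. It is not the repository's implementation: the detection format, the function names, and `iou_threshold` are hypothetical, and only `conf_threshold` comes from this thread. The sketch drops low-confidence boxes and then removes lower-scored boxes that heavily overlap one already kept, which addresses the duplicate boxes on the same object mentioned earlier.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_pseudo_labels(detections, conf_threshold=0.5, iou_threshold=0.5):
    """Keep confident boxes, then drop boxes overlapping a higher-scored one."""
    confident = [d for d in detections if d["score"] >= conf_threshold]
    confident.sort(key=lambda d: d["score"], reverse=True)
    kept = []
    for det in confident:
        if all(iou(det["box"], k["box"]) < iou_threshold for k in kept):
            kept.append(det)
    return kept


# Hypothetical teacher detections for a single image.
detections = [
    {"box": (10, 10, 50, 50), "score": 0.9},
    {"box": (12, 11, 52, 49), "score": 0.7},      # near-duplicate of the first
    {"box": (100, 100, 140, 140), "score": 0.2},  # low confidence
    {"box": (200, 60, 260, 120), "score": 0.6},
]

kept = filter_pseudo_labels(detections)
print([d["score"] for d in kept])
```

If the threshold is applied only when the teachers generate pseudo-labels during training (an assumption about where `conf_threshold` is used in this repository), changing it would not affect an already trained student until that student is retrained.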