robodhruv / visualnav-transformer

Official code and checkpoint release for mobile robot foundation models: GNM, ViNT, and NoMaD.
http://general-navigation-models.github.io
MIT License
425 stars 56 forks

Model Fine-Tuning Related Issues #10

Closed le-wei closed 5 months ago

le-wei commented 6 months ago

Hello, thank you for the excellent work on visualnav-transformer and for making it available for everyone to learn from and use. I have successfully deployed ViNT and NoMaD, and I would now like to fine-tune them for our own robot and sensors, but I have run into two issues. First, I tried fine-tuning with the values in the vint.yaml file, but the results were not ideal, possibly due to improper parameter settings. Could you provide the fine-tuning configuration mentioned in the ViNT paper for reference? Second, the NoMaD weights you released do not allow loading and retrieving the training parameters the way the ViNT weights do, which may be due to a different method of saving the checkpoint. Could you provide a set of weights from which the training parameters can be recovered? Thank you very much for your work.

wmh02240 commented 5 months ago

Hi, have you deployed this model on your robot? How well does it work? I deployed the NoMaD model, but the results are not good; collisions occur often. Looking forward to your reply. @le-wei

le-wei commented 5 months ago

Yes, I followed the instructions to deploy the model on the robot and ran it successfully, but the results are somewhat different from what is reported in the paper. I asked for the exact fine-tuning parameters, but so far there has been no response. If you tune a better model, I would welcome your suggestions. Thank you very much. @wmh02240

wmh02240 commented 5 months ago

My hardware setup is as follows: camera: fisheye camera; robot: X3PI; model: NoMaD; data: some data I collected myself for training. Result: the robot shows some obstacle-avoidance ability during navigation, but the performance is far from what the authors report in the paper. Looking forward to your reply and to further discussion. @le-wei

le-wei commented 5 months ago

I ran it on an AgileX (松灵) chassis, also with a fisheye camera. The robot oscillates left and right while running, and the results are not as good as I expected. It seems fine-tuning is needed.

wmh02240 commented 5 months ago

By "oscillating left and right", do you mean it fails to avoid obstacles and collides with them, or can it still avoid obstacles? @le-wei

le-wei commented 5 months ago

All I can say is that sometimes it avoids obstacles and sometimes it doesn't, and I haven't found any pattern to it.

Hwh0865 commented 5 months ago

Hello, I am planning to fine-tune ViNT. The paper mentions that for fine-tuning the image-goal model, they used the same training pipeline as ViNT with the AdamW optimizer at a learning rate of 0.0001, but without warmup or a cosine scheduler, and without mixing in any prior data during fine-tuning. Does this mean I just collect new data, comment out the warmup- and cosine-scheduler-related code, and train directly? @le-wei
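If the paper's recipe maps onto the released training configs, the overrides would look roughly like the sketch below. This is a hedged guess, not the authors' config: the exact key names (`optimizer`, `lr`, `warmup`, `scheduler`) may differ from what vint.yaml actually uses, so check them against your copy of the file.

```yaml
# Hypothetical fine-tuning overrides for vint.yaml; key names may differ.
optimizer: adamw
lr: 1e-4          # AdamW learning rate reported in the ViNT paper for fine-tuning
warmup: False     # no warmup during fine-tuning
scheduler: null   # no cosine scheduler
# Train only on the newly collected data; do not mix in prior datasets.
```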

wmh02240 commented 5 months ago

Have you tested NoMaD? Is the obstacle avoidance good? @Hwh0865

le-wei commented 5 months ago

@wmh02240 Set warmup to False; yes, you can train that way. The results still need tuning.

Hwh0865 commented 5 months ago

Could we exchange contact information? @le-wei @wmh02240 wx:15672710865

Hwh0865 commented 5 months ago

Have you tested NoMaD? Is the obstacle avoidance good? @Hwh0865

No, I haven't tested it; I have only been testing ViNT.

Hwh0865 commented 5 months ago

@le-wei How did you collect new data for fine-tuning into the format the code expects [0.jpg, ..., x.jpg, traj_data.pkl: {'position': ..., 'yaw': ...}]? Could you give me some guidance? Thank you very much.

le-wei commented 5 months ago

@Hwh0865 Just subscribe to the image topic and the odom topic, then modify the provided bag-processing code.
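As a rough illustration of the suggestion above, here is a minimal stand-in for the bag-processing step, assuming the image frames and odometry poses have already been extracted from the bag along with their timestamps. The function name and input shapes are hypothetical; the repo's own bag-processing script is the authoritative version.

```python
import os
import pickle

def build_trajectory(out_dir, image_stamps, odom_samples):
    """Pair each extracted image with the nearest-in-time odometry sample
    and write traj_data.pkl in the {'position': ..., 'yaw': ...} format.

    image_stamps: list of (timestamp, saved_jpeg_path)   # hypothetical inputs
    odom_samples: list of (timestamp, x, y, yaw)
    """
    os.makedirs(out_dir, exist_ok=True)
    positions, yaws = [], []
    for img_t, _path in image_stamps:
        # pick the odometry sample closest in time to this frame
        _t, x, y, yaw = min(odom_samples, key=lambda s: abs(s[0] - img_t))
        positions.append((x, y))
        yaws.append(yaw)
    with open(os.path.join(out_dir, "traj_data.pkl"), "wb") as f:
        pickle.dump({"position": positions, "yaw": yaws}, f)
    return positions, yaws
```

In practice the frames themselves would also be saved as 0.jpg, ..., x.jpg alongside traj_data.pkl in the same trajectory folder.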

wmh02240 commented 5 months ago

When training with my own collected data, how is the value of the config parameter metric_waypoint_spacing calculated? The value I computed following the README is very small. @le-wei @Hwh0865 https://github.com/robodhruv/visualnav-transformer/blob/bd3dbda18e0ade852bdedda98866bdb3e599ba72/train/vint_train/data/data_config.yaml#L61

le-wei commented 5 months ago

@wmh02240 I sum the distances between consecutive points and take the average.

wmh02240 commented 5 months ago

Could you check whether there is a problem with my calculation?

import os
import math
import pickle

total_distance = []
data_path = "/data2/my_office_0122"
for d in os.listdir(data_path):
    with open(os.path.join(data_path, d, "traj_data.pkl"), "rb") as f:
        traj_data = pickle.load(f)
    waypoints = [tuple(i) for i in list(traj_data['position'])]  # list of waypoint coordinates
    distance_all = 0
    for i in range(len(waypoints) - 1):
        x1, y1 = waypoints[i]
        x2, y2 = waypoints[i + 1]
        distance_all += math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)
    # average over the number of segments (len(waypoints) - 1), not the number of waypoints
    average_spacing = distance_all / (len(waypoints) - 1)
    # skip near-stationary trajectories
    if average_spacing > 0.02:
        total_distance.append(average_spacing)

print("average spacing:", sum(total_distance) / len(total_distance), "meters")

@le-wei

le-wei commented 5 months ago

@wmh02240 Yes, that's roughly it.

wmh02240 commented 5 months ago

@wmh02240 Yes, that's roughly it.

OK, thank you. One more question: what camera frame rate did you set when collecting data? Are there any details to watch out for? @le-wei

le-wei commented 5 months ago

@wmh02240 I didn't set a specific camera frame rate; I just used the recommended parameters.

robodhruv commented 5 months ago

Hi everyone, thanks a lot for your interest! Unfortunately, my email notification settings were set incorrectly and I did not realize there was an active ongoing discussion for over a month :(

This thread has a few different issues, so let me try streamlining the discussion and help as best as I can.

  1. Poor collision avoidance behavior out of the box @le-wei @wmh02240 -> Let's move the discussion to this issue instead #13
  2. Clarifications regarding fine-tuning config etc. @le-wei @Hwh0865 -> Let's continue the discussion in this thread.

Big thanks to @wmh02240 and @le-wei for clarifying the training and fine-tuning details and helping out, these all seem correct.

First, I attempted to fine-tune using the values in the vint.yaml file, but the results were not ideal, possibly due to improper parameter adjustment.

What type of data are you fine-tuning the model on, and how are you evaluating the fine-tuned model? i.e., is it an offline metric or a closed-loop test that suggests the model is not working well? @ajaysridhar0 and I will share a new, tested config for fine-tuning based on your needs; thanks for flagging this.

Second, the Nomad weights you provided do not allow for the loading and obtaining of training parameters like the Vint weights

This is unfortunate and should be easy to fix. I will get back shortly with an updated checkpoint.

ajaysridhar0 commented 5 months ago

Second, the Nomad weights you provided do not allow for the loading and obtaining of training parameters like the Vint weights

Thank you for catching this. I fixed the weight loading for NoMaD, but please let me know if you run into any issues. https://github.com/robodhruv/visualnav-transformer/commit/b3957458a0bcc9a3501e7066e3593292ebf13321