About the test metrics - Githubissues

huamo555 commented 7 months ago

Thank you for your outstanding work. After running your test.py file, I got the test results in the results folder under logs, which contains 10 text files: text0, text1, ........ text10. What do I need to do to get the Task Success reported in your paper? Should I average the success rates from the ten files? Also, I noticed that max_episode_step for testing in test.py is 15; should I change it to 8?

huamo555 commented 7 months ago

I averaged the success rates of case0, case1, ..., case9, and the final result is about 0.6, which is very different from your paper. Can you tell me the correct way to calculate the metric?
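A minimal sketch of the averaging described above, assuming the per-case success rates have already been read out of the result files. The thread does not show the log format or confirm how the paper's number is computed, so the function names and the example values below are hypothetical placeholders. The second function shows an episode-weighted average, which differs from the plain mean only when the cases contain different numbers of episodes.

```python
from statistics import mean


def macro_average(case_rates):
    """Plain mean of per-case success rates (every case counts equally)."""
    if not case_rates:
        raise ValueError("no per-case success rates given")
    return mean(case_rates)


def weighted_average(case_rates, episodes_per_case):
    """Mean of per-case success rates weighted by episodes run per case."""
    if len(case_rates) != len(episodes_per_case):
        raise ValueError("case_rates and episodes_per_case differ in length")
    total_episodes = sum(episodes_per_case)
    if total_episodes == 0:
        raise ValueError("total number of episodes is zero")
    weighted_sum = sum(r * n for r, n in zip(case_rates, episodes_per_case))
    return weighted_sum / total_episodes


# Hypothetical placeholder values for case0 .. case9 (not results from this thread).
example_rates = [0.5, 0.75, 0.5, 0.75, 0.5, 0.75, 0.5, 0.75, 0.5, 0.75]
# Hypothetical: the same number of episodes in every case.
example_episodes = [4] * 10

print("macro average:   ", macro_average(example_rates))
print("weighted average:", weighted_average(example_rates, example_episodes))
```

With equal episode counts per case the two averages coincide, so a large gap from the paper would have to come from something other than the choice between them.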

yaoyao-sourse commented 7 months ago

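I got results similar to his. I suspect this is not caused by how the metric is calculated; could some configuration have been set incorrectly at test time, lowering the metric? Also, I turned off the visualization GUI during both training and testing; could that be related? @xukechun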

xukechun commented 4 months ago

Sorry for missing the notifications for these issues because of my settings. Yes, the max episode step is 8, and I have reorganized our code with some problems fixed. BTW, headless mode may influence the results because it renders differently.

ubless607 commented 4 months ago

@xukechun Is there any way to turn off the PyBullet visualization (so that it only runs in the background) for headless-mode users?

ubless607 commented 3 months ago

@xukechun, I noticed that the test_cases provided only include arrangements with seen objects in 10 cases. Do we have a test file that incorporates the 5 test cases with unseen objects?

hanyueling commented 1 week ago

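I also got similar results: the average test success rate was only 0.52. I averaged the success rates of case0 through case9. In my case the visualization GUI was turned on during both testing and training. What could cause this gap of about 20% from the paper?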

xukechun commented 1 week ago

Are you using the latest code?

hanyueling commented 1 week ago

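Yes, I have recently been reproducing the code you provided. However, I found that your training code trains for 5000 episodes, while your paper says 2000 episodes were used, so I changed the code to 2000 episodes. I also found that the training code does not distinguish between the maximum number of grasp attempts in the first stage and the second stage, so I added that to the code before training. I tested the resulting model twice, and the results were 0.46329 and 0.52662. I then tested the checkpoint you provided, and the results were only 0.42998 and 0.5333. (At the very beginning I trained with your training code unchanged, and the test success rate was only 0.40665.) So I am not sure what went wrong.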

xukechun commented 1 week ago

I retested our provided checkpoint, and the log files are attached. The retested results are 72.0/4.12 on average, so I have not been able to reproduce your problem yet. Could you share your log files and configs? That may help me figure out the problem.
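To put the numbers reported in this thread side by side, here is a small sketch that summarizes the spread between repeated test runs of the provided checkpoint and the gap to the retested result. It assumes the retested "72.0" is a task success percentage (0.72 as a fraction), which the thread does not state explicitly; the helper name `summarize_runs` is made up for illustration.

```python
from statistics import mean


def summarize_runs(rates, reference):
    """Return the mean, the max-min spread, and the gap to a reference rate."""
    if not rates:
        raise ValueError("no success rates given")
    average = mean(rates)
    return {
        "mean": average,
        "spread": max(rates) - min(rates),
        "gap_to_reference": reference - average,
    }


# Success rates reported in this thread for the provided checkpoint.
checkpoint_runs = [0.42998, 0.5333]
# Assumption: the retested "72.0" is a percentage, i.e. 0.72 as a fraction.
retested = 0.72

summary = summarize_runs(checkpoint_runs, retested)
for name, value in summary.items():
    print(f"{name}: {value:.5f}")
```

If the spread between repeated runs is clearly smaller than the gap to the reference, run-to-run noise alone is unlikely to explain the difference, which points toward environment or configuration differences instead.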

hanyueling commented 1 week ago

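Sorry, I was held up by some other things. I trained on Ubuntu 22.04 with an RTX 3090 GPU and 12GB of memory. I checked that the packages I installed match those in your requirements file. However, while setting up the environment, pycocotools failed to install; following a method I found online, I installed cpython==0.29.36, after which pycocotools 2.0.2 downloaded and installed successfully. I tried to attach my log file here, but it is too large to upload. Could I send it to you by email?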

hanyueling commented 6 days ago

I would also like to know how to use test.py to test unseen objects.

xukechun / Vision-Language-Grasping

About the test metrics #13