real-stanford / flingbot

[CoRL 2021 Best System Paper] This repository contains code for training and evaluating FlingBot in both simulation and real-world settings on a dual-UR5 robot arm setup for Ubuntu 18.04
https://flingbot.cs.columbia.edu/

Readme Error! 'tasks_path' is a unrecognized arguments!! #10

Open mamimaka opened 1 year ago

mamimaka commented 1 year ago

Hi @huy-ha!

When I use the command:

python run_sim.py --tasks_path flingbot-rect-train.hdf5 --num_processes 16 --log flingbot-train-from-scratch --action_primitives fling

to train FlingBot, an error occurred: Dynamic Cloth Manipulation: error: unrecognized arguments: --tasks_path flingbot-rect-train.hdf5

I also tried renaming the tasks_path argument, for example to tasks and datasets_path, but none of these changes helped.

I need your help, dear huy-ha.

adakan4 commented 1 year ago

Having this issue as well.

adakan4 commented 1 year ago

@mamimaka Replacing --tasks_path with --tasks seemed to work for me. I don't know why the README is incorrect. In the config parser at the beginning of utils.py, lines 28-30, you can see that the argument containing the path to the tasks is called --tasks, unlike what the README says.

    parser.add_argument('--tasks', type=str,
                        default='configs_2500_train.pkl',
                        help='path to tasks pickle')
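
To see why the README command is rejected, here is a minimal sketch that rebuilds only the one argument quoted above. The program name is an assumption inferred from the error message, and the rest of the real parser in utils.py is omitted. It uses parse_known_args so that unrecognized flags are collected instead of aborting the program.

```python
import argparse

# Stand-in parser: only the --tasks argument quoted above is reproduced.
# The prog name is assumed from the "Dynamic Cloth Manipulation: error:" message.
parser = argparse.ArgumentParser('Dynamic Cloth Manipulation')
parser.add_argument('--tasks', type=str,
                    default='configs_2500_train.pkl',
                    help='path to tasks pickle')

# README spelling: --tasks_path is not defined, so it ends up in `unknown_bad`
# and --tasks silently keeps its default value.
args_bad, unknown_bad = parser.parse_known_args(
    ['--tasks_path', 'flingbot-rect-train.hdf5'])
print(args_bad.tasks, unknown_bad)

# Corrected spelling: the path is picked up and nothing is left over.
args_ok, unknown_ok = parser.parse_known_args(
    ['--tasks', 'flingbot-rect-train.hdf5'])
print(args_ok.tasks, unknown_ok)
```

With plain parse_args, the first call would instead exit with the "unrecognized arguments" error reported in this issue.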

zcswdt commented 1 year ago

Have you successfully run the code in this repository?

adakan4 commented 1 year ago

Yes

zcswdt commented 1 year ago

I have set up the environment, but running the evaluation code causes the same error as in issue #7 (https://github.com/columbia-ai-robotics/flingbot/issues/7). Can you help me? Thank you very much.

adakan4 commented 1 year ago

@zcswdt I am happy to help. I have been working with this codebase for a little over a month now, but be warned that my knowledge isn't perfect.

I also ran into issue #7, which I should note is similar to issue #2.

FlingBot uses ray to run environments in parallel; each ray worker is a separate process rather than a thread. Essentially, for each pass of the main loop in run_sim.py, n SimEnv instances are run, where n is num_processes. Using ray makes debugging slightly more difficult, but it is still very possible.

If you are indeed getting an error on the same line as issue #7, then it is occurring before run_sim enters the main loop, while it is still initializing these SimEnv objects. Following the function rabbit hole, the error is occurring in setup_envs in utils.py, on a line where ray is called to initialize the SimEnv instances.

156  ray.get([e.setup_ray.remote(e) for e in envs])

It isn't immediately obvious, but this line creates n ray workers, each of which calls the init function of the SimEnv class, found in simEnv.py. The error essentially means that when one of the ray workers (probably the first one to try) tried to initialize a SimEnv instance, it errored out and "died" before it finished its task (initializing the instance). This is not an issue with ray, but rather with the code that the ray worker was trying to run.

The ray workers do output a more specific error, but right now it is only written to a folder of ray logs, which isn't very convenient. On line 39 of run_sim.py, where ray is initialized, you can set log_to_driver to True, like this:

39  ray.init(log_to_driver=True)

Now, the ray workers will print their error messages to the same terminal as the other error messages. Reading that message should be able to give you more insight as to what the problem is. The console logs coming from the ray workers should have a prefix that looks something like this:

(SimEnv pid=3614500)
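
Once worker output is mixed into the same terminal, it can help to pull out only the worker lines. The following is a minimal sketch, assuming the prefix always has the form shown above; worker_lines is a hypothetical helper, not part of FlingBot or ray, and the sample lines are placeholders rather than real log output.

```python
import re
from typing import Iterable, List, Tuple

# Matches a prefix of the form "(SimEnv pid=3614500)" followed by the message.
WORKER_PREFIX = re.compile(r"^\((?P<name>\w+) pid=(?P<pid>\d+)\)\s?(?P<msg>.*)$")


def worker_lines(lines: Iterable[str], name: str = "SimEnv") -> List[Tuple[int, str]]:
    """Return (pid, message) pairs for lines printed by workers called `name`."""
    found = []
    for line in lines:
        match = WORKER_PREFIX.match(line.strip())
        if match and match.group("name") == name:
            found.append((int(match.group("pid")), match.group("msg")))
    return found


# Placeholder input: one driver line and one worker line (not real logs).
sample = [
    "placeholder driver line",
    "(SimEnv pid=3614500) placeholder worker message",
]
print(worker_lines(sample))
```

If the terminal output contains color codes or a different prefix layout, the pattern would need adjusting.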

adakan4 commented 1 year ago

When I had this issue, the ray logs pointed me to an error on lines 125-128 of simEnv.py, in the function setup_env. This function is called whenever a simEnv object is initialized.

125        pyflex.init(
126            not self.gui,  # headless: bool.
127            True,      # render: bool
128            720, 720)  # camera dimensions: int x int

This error can be caused by a bunch of things, and because pyflex is mostly a black box, it isn't very easy to debug. If this is your issue, I can go over some of the possible solutions. The issue is also covered by an Issue in the PyFlex repo.

If you have any other questions about the code base, please don't hesitate to ask.

zcswdt commented 12 months ago

Thank you very much. After two weeks of setting up the environment, I have finally made it work. Thank you for your help.

zcswdt commented 12 months ago

There is something I don't understand about how the grasp points for the two arms are found when flinging. The code finds a maximum point, then subtracts the grasp distance from that point's x-coordinate to get one arm's grasp point and adds the grasp distance to get the other's. What if these two points are not on the fabric? See def get_action_params(self, action_primitive, max_indices): https://github.com/columbia-ai-robotics/flingbot/blob/c5f35c785c6bc5d49282fb10117aa66ca04bf144/environment/simEnv.py#L517C64-L517C64

adakan4 commented 12 months ago

If the grasp points returned by get_action_params aren't on the cloth, the simulation catches that during or after the fling. It is implemented this way because FlingBot works without a cloth mask, so it can't know whether the action will grasp the cloth until it is attempted. A grasp where neither end touches the cloth is detected as follows:

The first check happens immediately after the robot grippers are moved into position and lifted, on line 305 of simEnv.py, where is_cloth_grasped is called to check whether at least one arm is grasping the cloth. The second check is on line 475 of simEnv.py: after the cloth is flung, the code checks whether the cloth moved, and if it didn't move noticeably, the interaction is ended.
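
The two checks can be sketched roughly as follows. This is not the code in simEnv.py: grasp_points_from_peak and fling_outcome are hypothetical names, the booleans stand in for whatever is_cloth_grasped and the cloth-movement check compute in the simulator, and the symmetric offset along x follows the description in the question rather than the actual get_action_params.

```python
from typing import Tuple

Pixel = Tuple[int, int]


def grasp_points_from_peak(peak: Pixel, grasp_dist: int) -> Tuple[Pixel, Pixel]:
    """Place the two grasp points symmetrically around the peak along x."""
    x, y = peak
    return (x - grasp_dist, y), (x + grasp_dist, y)


def fling_outcome(left_grasped: bool, right_grasped: bool, cloth_moved: bool) -> str:
    """Mimic the order of the two checks with plain booleans."""
    # Check 1 (after lifting): at least one gripper must be holding cloth.
    if not (left_grasped or right_grasped):
        return "terminated: nothing grasped"
    # Check 2 (after the fling): the cloth must have moved noticeably.
    if not cloth_moved:
        return "terminated: cloth did not move"
    return "ok"


left, right = grasp_points_from_peak((50, 40), 10)
print(left, right)
print(fling_outcome(False, False, False))
print(fling_outcome(True, False, True))
```

The point of the sketch is only the ordering: a bad pair of grasp points is not rejected up front, it is attempted and then caught by one of the two checks.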

I hope that answers your question! In the future please open a new issue to keep things a little more organized.

zcswdt commented 12 months ago

The second place is on line 475 of simEnv.py

Thank you very much for your reply. I noticed this place when I read the code for the first time, but I didn't understand why this check was added: if not self.is_cloth_grasped(): self.terminate = True return. I will go back and read the code again to understand the logic. Thank you for your patient guidance, it's great!

zcswdt commented 12 months ago

Do you have an email? I would like to ask you a question about how to visualize the network structure of Cloth Funnels. Thank you

zcswdt commented 9 months ago

Hello, when I ran training, I noticed that the author does not seem to specify after how many steps training should end. I also don't know how long it takes to get a good model. Can you explain? Thank you very much.

huy-ha commented 9 months ago

@zcswdt you can refer to my answer to your question in #7

zcswdt commented 8 months ago

I successfully ran a training process, but I noticed that as the number of training iterations increased, memory usage kept growing until all of it was used up and the program crashed. I have 64GB of memory. I'm not sure what's causing this. My driver reports CUDA 11.4, but 'nvcc -V' shows CUDA 10.0.