SJTU-ViSYS-team / VisFly

This is a fast and versatile simulator for training vision-based flight of drones.
MIT License

How to incorporate other datasets like Gibson #1

Open gaojunjie1999 opened 1 week ago

gaojunjie1999 commented 1 week ago

Thanks for your work. Could you please tell me where the Habitat rendering engine is used? I don't see a "make_sim" function in the code, so I don't know how to use other datasets like Gibson.

Fanxing-LI commented 1 week ago

The rendering engine is used in utils.scenedatasets.py. You need to download other datasets into datasets/, put the folder path of the scenes you want to use into "scene_kwargs", and pass it to xxxEnv. You can find an example in Complete Environment Definition.
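Roughly like this (a hypothetical sketch: the env class name, import path, and dataset path below are placeholders, not the exact VisFly API; the Complete Environment Definition example shows the actual arguments):

```python
# Hypothetical sketch: the env class, import path and dataset path are placeholders.
from VisFly.envs.NavigationEnv import NavigationEnv  # placeholder env class

scene_kwargs = {
    # folder containing the scene configs of the dataset you downloaded (e.g. Gibson)
    "path": "datasets/configs/scenes/gibson",  # placeholder path
}

env = NavigationEnv(
    num_agent_per_scene=4,
    scene_kwargs=scene_kwargs,
)
```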

gaojunjie1999 commented 1 week ago

Thanks. I also wonder whether it is possible to train a UAV to perform tasks like image-goal navigation / object-goal navigation, as in habitat-lab?

Fanxing-LI commented 1 week ago

Of course.

You need to define another Env: use an RGB camera sensor, use the drone's onboard or an external estimator (such as the distance to the object) to give an appropriate reward, and provide a success signal when the target is achieved (the object is caught in sight). That is just my initial, rough idea; you should carefully design and enrich the details.
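A very rough, hypothetical sketch of the reward/success idea (the class, attributes, and thresholds below are placeholders, not the actual VisFly API):

```python
import torch


class ObjectGoalEnv:
    """Hypothetical sketch; a real implementation would subclass the VisFly base env."""

    def __init__(self, num_agents: int):
        self.position = torch.zeros(num_agents, 3)                       # drone positions
        self.target_position = torch.tensor([5.0, 0.0, 1.0])             # object position (placeholder)
        self.object_in_view = torch.zeros(num_agents, dtype=torch.bool)  # e.g. from RGB detection

    def get_reward(self) -> torch.Tensor:
        # dense reward: negative distance to the target, from a local or external estimator
        distance = (self.position - self.target_position).norm(dim=1)
        return -0.1 * distance

    def get_success(self) -> torch.Tensor:
        # success: close enough to the target and the object is caught in sight
        distance = (self.position - self.target_position).norm(dim=1)
        return (distance < 0.5) & self.object_in_view
```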

gaojunjie1999 commented 1 week ago

Many thanks for your advice.

gaojunjie1999 commented 1 week ago

Please correct me if I'm wrong. I noticed that although VecEnv is the base class, it is not used for parallel scene training with multi-thread communication. So I wonder whether the parallelization of this env is achieved by training multiple agents in one scene. If that is the case, is the training speed here as fast as architectures that use multi-thread parallel scene training?

Fanxing-LI commented 1 week ago

Yes, multiple agents can be trained in one scene. This simulator just inherits VecEnv's interfaces. The speed bottleneck is rendering the vision. In habitat-sim, the graphical computation is deployed on GPUs. I am not sure which one would be faster, because I don't know how much CPU resource would be used.

gaojunjie1999 commented 5 days ago

Thanks for your patient reply. Also, what happens if a collision occurs in the simulation: will the drone be bumped back, will it just slide to a position outside the obstacle, or will it stop?

[screenshot attached]

As you can see in the snapshot, the drone keeps hitting the obstacle and seems to fly back and forth.

Fanxing-LI commented 4 days ago

In a Gym-wrapped Env (requires_grad=False), the drone will be reset back to its spawn position with random states (depending on the settings) if it crashes or hits a wall.

We have successfully tested the auto-reset function on these datasets, which are scanned real-world scenes. Maybe you can plot the distance to the closest obstacle to check whether it is a bug or the drone is just wandering near the wall (not crashing), as in the sketch below.
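Something along these lines, for example (just a plotting helper; how you record the closest-obstacle distance during the rollout depends on your env, and the threshold here is an arbitrary placeholder):

```python
import matplotlib.pyplot as plt


def plot_obstacle_distance(distances, collision_threshold=0.2):
    """Plot the per-step distance to the closest obstacle.

    `distances` is a list you record during the rollout (how it is queried
    depends on your env); the 0.2 m threshold is an arbitrary placeholder.
    """
    plt.plot(distances, label="distance to closest obstacle")
    plt.axhline(collision_threshold, color="r", linestyle="--", label="collision threshold")
    plt.xlabel("simulation step")
    plt.ylabel("distance [m]")
    plt.legend()
    plt.show()
```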

But indeed, this simulator is primarily validated on the ReplicaCAD datasets, because the coordinate frames of 3D-reconstruction datasets (HM3D etc.) are not well unified, which makes it troublesome to specify an initial position for each scene.

gaojunjie1999 commented 4 days ago

Thanks. Again, could you tell me how the parallel computation is achieved? I have not found code related to multi-process or multi-thread computation. However, I noticed that increasing "num_agent_per_scene" decreases training time, so I wonder if this is due to some parallel computation mechanism you designed.

Fanxing-LI commented 2 days ago

The parallel computation is standard batch computation in torch; you can find it in dynamics.py.

You can consider it as:

scene(1):
  agent(1)
  ...
  agent(num_agent_per_scene)
scene(2):
  ...
scene(num_scene):
  ...

The states of all agents (num_scene * num_agent_per_scene) are packed into one batch and used to simulate the physical process.
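A minimal illustration of the idea (simplified Euler integration as a placeholder, not the actual dynamics.py implementation):

```python
import torch

num_scene, num_agent_per_scene, dt = 4, 8, 0.02
batch = num_scene * num_agent_per_scene

# states of all agents from all scenes stacked along the batch dimension
position = torch.zeros(batch, 3)
velocity = torch.zeros(batch, 3)
acceleration = torch.randn(batch, 3)  # would come from the controller / thrust model

# one integration step advances every agent in every scene at once
velocity = velocity + acceleration * dt
position = position + velocity * dt

print(position.shape)  # torch.Size([32, 3])
```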