Meng Li
·
Qi Zhao
·
Shuchang Lyu
·
Chunlei Wang
·
Yujing Ma
·
Guangliang Cheng
·
Chenguang Yang
## Highlight!!!!

This repo is the implementation of "OVGNet: A Unified Visual-Linguistic Framework for Open-Vocabulary Robotic Grasping". We refer to [Vision-Language-Grasping](https://github.com/xukechun/Vision-Language-Grasping), [GroundingDINO](https://github.com/IDEA-Research/GroundingDINO), and [VL-Grasp](https://github.com/luyh20/VL-Grasp). Many thanks to these excellent repos.

## Demo Setting

* **Novel** indicates objects **unseen** during training.
* **Base** denotes objects **seen** during training.
* Battery and power drill are novel classes, which belong to the hard task.
* Apple and pear are base classes, which belong to the simple task.

## Demo Video

[Demo](https://github.com/cv516Buaa/OVGNet/assets/94512783/6a4a1f64-6c7f-4012-8774-60babf933290)

## Dataset

* [OVGrasping](https://pan.baidu.com/s/113wBIJ-hWnSJNkWlngPqAg?pwd=8667) follows the GroundingDINO data format (a loading sketch appears at the end of this README).
* The OVGrasping dataset comprises 117 categories and 63,385 instances.
* Instances are sourced from three distinct origins: RoboRefIt, GraspNet, and a simulated environment.
* The dataset is divided into two splits: the base split consists of 51,857 instances, and the novel split comprises 11,528 instances.

## Installation

* Ubuntu==18.04
* Python==3.9
* Torch==1.11, Torchvision==0.12.0
* CUDA==11.3
* checkpoint==[OVGNet](https://pan.baidu.com/s/13j4XBza1LNzsh-5RSfdFiQ?pwd=f3md)
* assets==[assets](https://pan.baidu.com/s/1vUestnCMZKZU5Kb2lC1LMA?pwd=uov1) **please add the assets into the OVGNet folder**
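
After installing, a quick sanity check can confirm that the environment matches the versions listed above. This is a minimal sketch, not part of the original repo; adjust it if your setup differs:

```python
import torch
import torchvision

# Expected versions from the Installation list above.
print(f"torch:       {torch.__version__}")         # expect 1.11.x
print(f"torchvision: {torchvision.__version__}")   # expect 0.12.0
print(f"CUDA available: {torch.cuda.is_available()}")
print(f"CUDA version:   {torch.version.cuda}")     # expect 11.3
```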
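
Since the OVGrasping dataset follows the GroundingDINO data format, inspecting it might look like the sketch below, assuming COCO-style JSON annotations as commonly used with GroundingDINO pipelines. The file path and JSON keys here are assumptions for illustration; check the downloaded archive for the actual layout:

```python
import json
from collections import Counter

# Hypothetical path -- check the downloaded archive for the actual file layout.
ANN_FILE = "OVGrasping/annotations.json"

with open(ANN_FILE) as f:
    data = json.load(f)

# COCO-style files map category ids to names and list one record per instance.
cat_names = {c["id"]: c["name"] for c in data["categories"]}
per_cat = Counter(cat_names[a["category_id"]] for a in data["annotations"])

# The dataset as a whole should report 117 categories and 63,385 instances.
print(f"{len(cat_names)} categories, {sum(per_cat.values())} instances")
for name, n in per_cat.most_common(5):
    print(f"{name}: {n}")
```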