Hi @hzdzkjdxyjs,
Thank you for your interest in our work. The GranD creation pipeline has been released, and all the source code is available at mbzuai-oryx/groundingLMM/tree/main/GranD. Please refer to docs/GranD.md for more details.
Please let me know if you have any questions. Thank You.
Thank you very much for answering my question so quickly. The document you provided on how to run this code is very detailed and makes it easy to apply your code quickly. Could you also discuss the parameter settings and other related details, just as you have done on this page?
Thank You @hzdzkjdxyjs,
If I understood your question correctly, you are interested in running the GranD Automated Annotation pipeline from scratch.
Note that the annotation pipeline consists of 4 levels and 23 steps in total. At each level, we run multiple SoTA vision-language models and pipeline scripts to build image scene graphs from the raw predictions.
The process is detailed in our paper, and the run_pipeline.sh script provides a step-by-step guide to implementing/running the pipeline. The corresponding environments can be found at environments.
Please go through the run_pipeline.sh script thoroughly and let me know if you have any questions. I hope it helps.
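At a high level, every step in run_pipeline.sh follows the same pattern: activate the conda environment for that step's model, run its inference script over the images, and write predictions for later levels to consume. Schematically (the script name and flags below are illustrative placeholders, not the actual ones; run_pipeline.sh has the exact commands):
# Illustrative pattern only; see run_pipeline.sh for the real script names and flags
conda activate grand_env_1            # environment for this step's model
python <step_script>.py --image_dir "$IMG_DIR" --output_dir "$PRED_DIR"
conda deactivate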
I'm very sorry to take up your valuable academic time. I would like to ask how I should set the parameters here, and what kind of command I should use to run the code.
Hi @hzdzkjdxyjs,
That's a bash script that takes a few command-line arguments, as detailed below:
- IMG_DIR -> path to the directory containing the images on which you want to run the pipeline
- PRED_DIR -> path to the directory where the predictions will be saved
- CKPT_DIR -> path to the directory containing all the checkpoints. For downloading the checkpoints, you have to consult the README of each respective model.
- SAM_ANNOTATIONS_DIR -> path to the directory containing the SAM annotations (.json files)
First, you have to create all the environments listed in environments. For example:
conda create --name grand_env_1 --file requirements_grand_env_1.txt
conda create --name grand_env_2 --file requirements_grand_env_2.txt
...
...
...
conda create --name grand_env_9 --file requirements_grand_env_9.txt
conda create --name grand_env_utils --file requirements_grand_env_utils.txt
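Once created, activate the environment for a given step before running it, e.g.:
conda activate grand_env_1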
Second, you have to download all the checkpoints into your CKPT_DIR directory.
# For Landmark detection
git lfs install
git clone https://huggingface.co/liuhaotian/llava-v1-0719-336px-lora-merge-vicuna-13b-v1.3
# For Depth Estimation
wget https://github.com/isl-org/MiDaS/releases/download/v3_1/dpt_beit_large_512.pt
# For Image Tagging
Download them from [recognize-anything/tag2text_swin_14m.pth](https://huggingface.co/spaces/xinyu1205/recognize-anything/blob/main/tag2text_swin_14m.pth) & [recognize-anything/ram_swin_large_14m.pth](https://huggingface.co/spaces/xinyu1205/recognize-anything/blob/main/ram_swin_large_14m.pth)
# For Co-DETR Detector
Please use the [google drive link](https://drive.google.com/drive/folders/1asWoZ3SuM6APTL9D-QUF_YW9mjULNdh9?usp=sharing) to download the `co_deformable_detr_swin_large_900q_3x_coco.pth` checkpoint.
# For EVA-02 Detector
Download them from [eva02_L_lvis_sys.pth](https://huggingface.co/Yuxin-CV/EVA-02/blob/main/eva02/det/eva02_L_lvis_sys.pth) & [eva02_L_lvis_sys_o365.pth](https://huggingface.co/Yuxin-CV/EVA-02/blob/main/eva02/det/eva02_L_lvis_sys_o365.pth)
# For POMP
Download them from [vit_b16_ep20_randaug2_unc1000_16shots_nctx16_cscFalse_ctpend_seed42.pth.tar](https://drive.google.com/file/d/1C8oU6cWkJdU3Q3IHaqTcbIToRLo9bMnu/view?usp=sharing) & [Detic_LI_CLIP_R5021k_640b64_4x_ft4x_max-size_pomp.pth](https://drive.google.com/file/d/1TwrjcUYimkI_f9z9UZXCmLztdgv31Peu/view?usp=sharing)
# For GRiT
wget -c https://datarelease.blob.core.windows.net/grit/models/grit_b_densecap_objectdet.pth
# For OV-SAM
Download it from [HarborYuan/ovsam_models/blob/main/sam2clip_vith_rn50x16.pth](https://huggingface.co/HarborYuan/ovsam_models/blob/main/sam2clip_vith_rn50x16.pth)
# For GPT4RoI
Follow the instructions at [GPT4RoI/Weights](https://github.com/jshilong/GPT4RoI?tab=readme-ov-file#weights) to get GPT4RoI weights.
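After downloading, it is worth a quick sanity check that all files ended up in the checkpoints directory (assuming here that the pipeline expects them directly under CKPT_DIR; adjust if your layout nests them in per-model subfolders):
# Hypothetical check: list the downloaded checkpoints
ls "$CKPT_DIR"
# should include, e.g., dpt_beit_large_512.pt, tag2text_swin_14m.pth, ram_swin_large_14m.pth, ...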
Third, you need to have some images in the IMG_DIR. Fourth, if you are running on SAM images, you have to prepare the SAM_ANNOTATIONS_DIR containing the SAM .json files; otherwise, you may skip it, remove ov-sam from the pipeline, and adjust the add_masks_to_annotations.py script accordingly. Finally, you can run the run_pipeline.sh script using the following command:
bash run_pipeline.sh <path to the directory containing images> <path to the directory for storing predictions> <checkpoints directory path> <path to the directory containing SAM annotations>
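For instance, a concrete invocation might look like this (the paths are hypothetical placeholders):
bash run_pipeline.sh /data/grand/images /data/grand/predictions /data/grand/checkpoints /data/grand/sam_annotations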
I agree that the pipeline is not straightforward; this is because it involves running many off-the-shelf models that have different dependencies. We welcome any pull requests improving the pipeline.
Thanks and Good Luck :)
Thank you very much for taking the time out of your busy schedule to reply to my question. I find this project a very interesting and meaningful endeavor, and I am amazed at the excellent results produced by the demo. I will continue to follow your latest progress and actively try your model. Once again, my respects to you and your team.
Hi @mmaaz60, I am unable to build the requirements for the dataset via the command you suggested: conda create --name grand_env_1 --file requirements_grand_env_1.txt. I get the error:
Solving environment: failed
PackagesNotFoundError: The following packages are not available from current channels:
- async-timeout==4.0.2=pypi_0
- terminaltables==3.1.10=pypi_0
- ipython==8.14.0=pypi_0
- pytz==2023.3=pypi_0
- groundingdino==0.1.0=dev_0
- openai-whisper==20230314=pypi_0
- async-lru==2.0.3=pypi_0
- jupyter-events==0.6.3=pypi_0
- chardet==5.2.0=pypi_0
- codecov==2.1.13=pypi_0
- aiosignal==1.3.1=pypi_0
- numpy==1.24.3=pypi_0
- peft==0.3.0=pypi_0
- fastapi==0.100.0=pypi_0
- aliyun-python-sdk-kms==2.16.1=pypi_0
- awq==0.1.0=pypi_0
- mmcv-full==1.5.0=dev_0
- multiscaledeformableattention==1.0=pypi_0
- pycocotools==2.0.6=pypi_0
- multiprocess==0.70.15=pypi_0
- importlib-resources==6.0.0=pypi_0
- pybind11==2.11.1=pypi_0
- scipy==1.11.1=pypi_0
- typepy==1.3.1=pypi_0
- isort==4.3.21=pypi_0
- mmdet==2.25.3=dev_0
- onnxruntime==1.15.1=pypi_0
- exceptiongroup==1.1.2=pypi_0
- torchvision==0.15.2+cu117=pypi_0
- supervision==0.11.1=pypi_0
- nbconvert==7.7.2=pypi_0
- httpcore==0.17.3=pypi_0
- jupyter-console==6.6.3=pypi_0
- jupyter-server-terminals==0.4.4=pypi_0
- cupy-cuda117==10.6.0=pypi_0
- qtconsole==5.4.3=pypi_0
- quant-cuda==0.0.0=pypi_0
- contourpy==1.1.0=pypi_0
- yarl==1.9.2=pypi_0
- setproctitle==1.3.2=pypi_0
- pathtools==0.1.2=pypi_0
- oss2==2.17.0=pypi_0
- deepdiff==6.3.1=pypi_0
- comm==0.1.3=pypi_0
- coverage==7.3.0=pypi_0
- imageio==2.31.1=pypi_0
- cymem==2.0.7=pypi_0
- json5==0.9.14=pypi_0
- jupyter-client==8.3.0=pypi_0
- keras==2.13.1=pypi_0
- markdown-it-py==2.2.0=pypi_0
- einops-exts==0.0.4=pypi_0
- outdated==0.2.2=pypi_0
- markupsafe==2.1.3=pypi_0
- widgetsnbextension==4.0.8=pypi_0
- pyarrow==12.0.1=pypi_0
- addict==2.4.0=pypi_0
- flatbuffers==23.5.26=pypi_0
- platformdirs==3.10.0=pypi_0
- prompt-toolkit==3.0.39=pypi_0
- shortuuid==1.0.11=pypi_0
- openxlab==0.0.15=pypi_0
- bleach==6.0.0=pypi_0
- pyproject-api==1.5.4=pypi_0
- smmap==5.0.0=pypi_0
- munkres==1.1.4=pypi_0
- pyflakes==2.1.1=pypi_0
- etils==1.3.0=pypi_0
- anyio==3.7.1=pypi_0
- dassl==0.6.3=dev_0
- huggingface-hub==0.16.4=pypi_0
- thinc==8.1.10=pypi_0
- typer==0.9.0=pypi_0
- httpx==0.24.0=pypi_0
- zstandard==0.21.0=pypi_0
- nh3==0.2.14=pypi_0
- jupyterlab-widgets==3.0.8=pypi_0
- timm==0.5.4=pypi_0
- accelerate==0.21.0=pypi_0
- tensorflow-metadata==1.13.1=pypi_0
- nltk==3.8.1=pypi_0
- pyparsing==3.0.9=pypi_0
- texttable==1.6.7=pypi_0
- openmim==0.3.9=pypi_0
- opencv-python==4.8.0.74=pypi_0
- six==1.16.0=pypi_0
- spacy-alignments==0.9.0=pypi_0
- spacy==3.6.0=pypi_0
- spacy-loggers==1.0.4=pypi_0
- langcodes==3.3.0=pypi_0
- safetensors==0.3.1=pypi_0
- wavedrom==2.0.3.post3=pypi_0
- terminado==0.17.1=pypi_0
- pure-eval==0.2.2=pypi_0
- argon2-cffi==21.3.0=pypi_0
- ninja==1.11.1=pypi_0
- pycountry==22.3.5=pypi_0
- overrides==7.3.1=pypi_0
- hjson==3.1.0=pypi_0
- nvidia-cuda-cupti-cu11==11.7.101=pypi_0
- uvicorn==0.23.1=pypi_0
- virtualenv==20.24.3=pypi_0
- python-multipart==0.0.6=pypi_0
- arrow==1.2.3=pypi_0
- wcwidth==0.2.6=pypi_0
- typing-inspect==0.9.0=pypi_0
- trax==1.4.1=pypi_0
- gdown==4.7.1=pypi_0
- websockets==11.0.3=pypi_0
- nbformat==5.9.1=pypi_0
- onnx==1.14.0=pypi_0
- astunparse==1.6.3=pypi_0
- datasets==2.14.4=pypi_0
- en-core-web-md==3.6.0=pypi_0
- decorator==5.1.1=pypi_0
- llava==1.0.0=pypi_0
- tensorflow==2.13.0=pypi_0
- pyre-extensions==0.0.29=pypi_0
- tensorflow-hub==0.14.0=pypi_0
- xtcocotools==1.13=pypi_0
- nvidia-cuda-nvrtc-cu11==11.7.99=pypi_0
- networkx==3.1=pypi_0
- absl-py==1.4.0=pypi_0
- kornia==0.6.4=pypi_0
- gradio-client==0.2.10=pypi_0
- pycryptodome==3.18.0=pypi_0
- crcmod==1.7=pypi_0
- scikit-learn==1.2.2=pypi_0
- beautifulsoup4==4.12.2=pypi_0
- toolz==0.12.0=pypi_0
- dm-tree==0.1.8=pypi_0
- pluggy==1.2.0=pypi_0
- starlette==0.27.0=pypi_0
- lit==16.0.6=pypi_0
- debugpy==1.6.7=pypi_0
- srsly==2.4.7=pypi_0
- tcolorpy==0.1.3=pypi_0
- en-core-web-trf==3.6.1=pypi_0
- fsspec==2023.6.0=pypi_0
- mmpose==0.24.0=dev_0
- nvidia-nccl-cu11==2.14.3=pypi_0
- flake8==3.7.9=pypi_0
- jupyter==1.0.0=pypi_0
- pycocoevalcap==1.2=pypi_0
- torch==2.0.1+cu117=pypi_0
- appdirs==1.4.4=pypi_0
- click==8.1.6=pypi_0
- libclang==16.0.6=pypi_0
- attributedict==0.3.0=pypi_0
- kiwisolver==1.4.4=pypi_0
- pycodestyle==2.5.0=pypi_0
- fschat==0.2.24=pypi_0
- ipywidgets==8.0.7=pypi_0
- requests==2.28.2=pypi_0
- vllm==0.1.3=pypi_0
- rouge-score==0.1.2=pypi_0
- opencv-python-headless==4.8.0.74=pypi_0
- jupyter-server==2.7.0=pypi_0
- chumpy==0.70=pypi_0
- littleutils==0.2.2=pypi_0
- fastrlock==0.8.2=pypi_0
- argon2-cffi-bindings==21.2.0=pypi_0
- rfc3986-validator==0.1.1=pypi_0
- ffmpy==0.3.1=pypi_0
- numexpr==2.8.5=pypi_0
- protobuf==4.23.4=pypi_0
- defusedxml==0.7.1=pypi_0
- preshed==3.0.8=pypi_0
- blessings==1.7=pypi_0
- pydantic==1.10.11=pypi_0
- nvidia-curand-cu11==10.2.10.91=pypi_0
- tqdm-multiprocess==0.0.11=pypi_0
- triton==2.0.0=pypi_0
- ml-dtypes==0.2.0=pypi_0
- orjson==3.9.2=pypi_0
- threadpoolctl==3.2.0=pypi_0
- nvidia-nvtx-cu11==11.7.91=pypi_0
- wandb==0.15.5=pypi_0
- rouge==1.0.1=pypi_0
- markdown2==2.4.9=pypi_0
- pyyaml==6.0=pypi_0
- jsonschema==4.18.4=pypi_0
- certifi==2023.5.7=pypi_0
- google-pasta==0.2.0=pypi_0
- matplotlib-inline==0.1.6=pypi_0
- detectron2==0.6=dev_0
- h11==0.14.0=pypi_0
- pandocfilters==1.5.0=pypi_0
- gast==0.4.0=pypi_0
- webencodings==0.5.1=pypi_0
- matplotlib==3.7.2=pypi_0
- nvidia-cufft-cu11==10.9.0.58=pypi_0
- sentencepiece==0.1.99=pypi_0
- sacrebleu==1.5.0=pypi_0
- funcsigs==1.0.2=pypi_0
- backcall==0.2.0=pypi_0
- nvidia-cudnn-cu11==8.5.0.96=pypi_0
- spacy-transformers==1.2.5=pypi_0
- sqlitedict==2.1.0=pypi_0
- googleapis-common-protos==1.59.1=pypi_0
- jinja2==3.1.2=pypi_0
- jax==0.4.13=pypi_0
- docker-pycreds==0.4.0=pypi_0
- python-json-logger==2.0.7=pypi_0
- fire==0.5.0=pypi_0
- nvidia-cuda-runtime-cu11==11.7.99=pypi_0
- semantic-version==2.10.0=pypi_0
- promise==2.3=pypi_0
- referencing==0.30.0=pypi_0
- uri-template==1.3.0=pypi_0
- asttokens==2.2.1=pypi_0
- importlib-metadata==6.8.0=pypi_0
- gitpython==3.1.32=pypi_0
- fonttools==4.41.0=pypi_0
- ipython-genutils==0.2.0=pypi_0
- tifffile==2023.8.12=pypi_0
- aiohttp==3.8.4=pypi_0
- sentry-sdk==1.28.1=pypi_0
- uc-micro-py==1.0.2=pypi_0
- stack-data==0.6.2=pypi_0
- transformers==4.33.2=pypi_0
- nvidia-cusolver-cu11==11.4.0.1=pypi_0
- cmake==3.26.4=pypi_0
- regex==2023.6.3=pypi_0
- enchant==0.0.1=pypi_0
- nvidia-cusparse-cu11==11.7.4.91=pypi_0
- tokenizers==0.13.3=pypi_0
- gym==0.26.2=pypi_0
- tzdata==2023.3=pypi_0
- fairscale==0.4.4=pypi_0
- mistune==3.0.1=pypi_0
- cryptography==41.0.3=pypi_0
- parso==0.8.3=pypi_0
- gitdb==4.0.10=pypi_0
- pillow==9.5.0=pypi_0
- wrapt==1.15.0=pypi_0
- rfc3339-validator==0.1.4=pypi_0
- humanfriendly==10.0=pypi_0
- prometheus-client==0.17.1=pypi_0
- frozenlist==1.4.0=pypi_0
- opt-einsum==3.3.0=pypi_0
- pytablewriter==1.0.0=pypi_0
- fastjsonschema==2.18.0=pypi_0
- confection==0.1.0=pypi_0
- dill==0.3.7=pypi_0
- nbclient==0.8.0=pypi_0
- pathy==0.10.2=pypi_0
- mpmath==1.3.0=pypi_0
- isoduration==20.11.0=pypi_0
- psutil==5.9.5=pypi_0
- en-core-web-sm==3.6.0=pypi_0
- entrypoints==0.3=pypi_0
- aliyun-python-sdk-core==2.13.36=pypi_0
- jupyter-core==5.3.1=pypi_0
- pyzmq==25.1.0=pypi_0
- annotated-types==0.5.0=pypi_0
- colour-runner==0.1.1=pypi_0
- tiktoken==0.3.3=pypi_0
- flash-attn==1.0.7=pypi_0
- altair==5.0.1=pypi_0
- ipykernel==6.24.0=pypi_0
- segment-anything==1.0=dev_0
- ray==2.6.3=pypi_0
- ordered-set==4.1.0=pypi_0
- scikit-image==0.21.0=pypi_0
- yapf==0.40.1=pypi_0
- sympy==1.12=pypi_0
- notebook==7.0.0=pypi_0
- tinycss2==1.2.1=pypi_0
- cycler==0.11.0=pypi_0
- lm-eval==0.3.0=pypi_0
- jupyterlab==4.0.3=pypi_0
- idna==3.4=pypi_0
- lazy-loader==0.3=pypi_0
- inspecta==0.1.3=pypi_0
- lmdb==1.4.1=pypi_0
- openai==0.27.8=pypi_0
- send2trash==1.8.2=pypi_0
- colorama==0.4.6=pypi_0
- jedi==0.18.2=pypi_0
- jaxlib==0.4.13=pypi_0
- wilds==1.2.2=pypi_0
- numba==0.57.1=pypi_0
- py-cpuinfo==9.0.0=pypi_0
- auto-gptq==0.4.1+cu117=pypi_0
- catalogue==2.0.9=pypi_0
- rpds-py==0.9.2=pypi_0
- python-dateutil==2.8.2=pypi_0
- multidict==6.0.4=pypi_0
- tabledata==1.3.1=pypi_0
- notebook-shim==0.2.3=pypi_0
- pandas==2.0.3=pypi_0
- webcolors==1.13=pypi_0
- smart-open==6.3.0=pypi_0
- pydub==0.25.1=pypi_0
- pickleshare==0.7.5=pypi_0
- coloredlogs==15.0.1=pypi_0
- h5py==3.9.0=pypi_0
- traitlets==5.9.0=pypi_0
- mccabe==0.6.1=pypi_0
- nvidia-cublas-cu11==11.10.3.66=pypi_0
- shapely==2.0.1=pypi_0
- linkify-it-py==2.0.2=pypi_0
- xxhash==3.3.0=pypi_0
- blis==0.7.10=pypi_0
- opendatalab==0.0.10=pypi_0
- jsonlines==3.1.0=pypi_0
- json-tricks==3.17.2=pypi_0
- qtpy==2.3.1=pypi_0
- murmurhash==1.0.9=pypi_0
- grpcio==1.56.0=pypi_0
- svgwrite==1.4.3=pypi_0
- zipp==3.16.2=pypi_0
- aiofiles==23.1.0=pypi_0
- pathvalidate==3.1.0=pypi_0
- spacy-legacy==3.0.12=pypi_0
- tensorflow-io-gcs-filesystem==0.32.0=pypi_0
- gin-config==0.5.0=pypi_0
- msgpack==1.0.5=pypi_0
- ogb==1.3.6=pypi_0
- awq-inference-engine==0.0.0=pypi_0
- nest-asyncio==1.5.6=pypi_0
- tensorflow-datasets==4.9.2=pypi_0
- tomli==2.0.1=pypi_0
- deepspeed==0.9.5=pypi_0
- tb-nightly==2.15.0a20230816=pypi_0
- jupyterlab-server==2.24.0=pypi_0
- sacremoses==0.0.53=pypi_0
- tensorflow-estimator==2.13.0=pypi_0
- dataproperty==1.0.1=pypi_0
- filelock==3.12.2=pypi_0
- rootpath==0.1.1=pypi_0
- jmespath==0.10.0=pypi_0
- tensorflow-text==2.13.0=pypi_0
- jupyterlab-pygments==0.2.2=pypi_0
- pygments==2.15.1=pypi_0
- soupsieve==2.4.1=pypi_0
- gradio==3.35.2=pypi_0
- pywavelets==1.4.1=pypi_0
- termcolor==2.3.0=pypi_0
- ftfy==6.1.1=pypi_0
- charset-normalizer==3.2.0=pypi_0
- llvmlite==0.40.1=pypi_0
- gym-notices==0.0.8=pypi_0
- pexpect==4.8.0=pypi_0
- bitsandbytes==0.42.0=pypi_0
- cython==0.29.36=pypi_0
- mbstrdecoder==1.1.3=pypi_0
- model-index==0.1.11=pypi_0
- einops==0.6.1=pypi_0
- jsonschema-specifications==2023.7.1=pypi_0
- mdurl==0.1.2=pypi_0
- xformers==0.0.20=pypi_0
- tornado==6.3.2=pypi_0
- babel==2.12.1=pypi_0
- ptyprocess==0.7.0=pypi_0
- pydantic-core==2.3.0=pypi_0
- rich==13.4.2=pypi_0
- packaging==23.1=pypi_0
- mmengine==0.8.2=pypi_0
- setuptools==60.2.0=pypi_0
- tqdm==4.66.1=pypi_0
- joblib==1.3.1=pypi_0
- tox==4.9.0=pypi_0
- distlib==0.3.7=pypi_0
- executing==1.2.0=pypi_0
- attrs==23.1.0=pypi_0
- mdit-py-plugins==0.3.3=pypi_0
- wasabi==1.1.2=pypi_0
- sniffio==1.3.0=pypi_0
- black==22.3.0=pypi_0
- fqdn==1.5.1=pypi_0
- more-itertools==9.1.0=pypi_0
- typing-extensions==4.7.1=pypi_0
- array-record==0.4.0=pypi_0
- urllib3==2.0.3=pypi_0
- jupyter-lsp==2.2.0=pypi_0
Current channels:
- https://repo.anaconda.com/pkgs/main/linux-64
- https://repo.anaconda.com/pkgs/main/noarch
- https://repo.anaconda.com/pkgs/r/linux-64
- https://repo.anaconda.com/pkgs/r/noarch
To search for alternate channels that may provide the conda package you're
looking for, navigate to
https://anaconda.org
and use the search bar at the top of the page.
I think it's a channels issue, as a normal conda environment .yml file has a section defining the channels for the packages. Your help in reproducing the environments for the dataset is much appreciated.
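For reference, a possible workaround (a sketch only, assuming the pypi_0 entries are simply pip-installed packages that conda cannot resolve from its channels) would be to split the requirements file and hand the pip-managed part to pip after creating the environment:
# Sketch: keep conda-managed lines for conda, give pypi_0 lines to pip
grep -vE "pypi_0|dev_0" requirements_grand_env_1.txt > conda_only.txt
conda create --name grand_env_1 --file conda_only.txt
conda activate grand_env_1
# pypi_0 entries look like name==version=pypi_0; strip the build tag for pip
grep "pypi_0" requirements_grand_env_1.txt | sed 's/=pypi_0$//' | xargs -n 1 pip install
# dev_0 entries (groundingdino, mmcv-full, detectron2, ...) are editable/local installs
# and would still need to be built from their respective source repositories.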
Your work is of great academic value and significance, and I am very grateful for the contributions you have made. I would like to ask about the specific operational steps for running the GranD Automated Annotation Pipeline. Thank you very much for taking the time out of your busy schedule to look at my question.