-
The idea is to have generative fill with open source models!
The pipeline would contain the following moving parts:
1. Input image (e.g. an image of a dog running through a field)
2. Input edit …
-
Just a few things I thought of:
1. Create other chat handlers (Telegram? FB Messenger? Text message? Web interface?)
2. Allow external links - if I send a URL, it will fetch the content, and add it to t…
-
Hello, @ch3cook-fdu!
Thanks for sharing your work on indoor 3D dense captioning. Recently I tried training Vote2Cap-DETR(++) with different configs. I noticed that there is a slightly …
-
Thank you so much for sharing your brilliant code with us.
Could you also share the code used to generate a caption for an image, like the one in Figure 1 of your article?
I would appreciate it if you could …
-
Hi, thanks again for contributing such good work. I was wondering whether you have released the prompts (i.e., instructions) for the multi-modality tasks used in OFA-CN, especially for the visual grounding task? th…
-
I get an error when running `bash eval_flickr.sh`:
![616](https://user-images.githubusercontent.com/23289919/122150174-d1c62100-ce8f-11eb-8440-4565df225234.png)
-
Hello, I ran into the following problem when running transformer_nsc.yml. How can I solve it?
Hugginface transformers not installed; please visit https://github.com/huggingface/transformers
meshed-memory-transformer not installed; please run `pip in…
-
**Describe**
Model I am using (BEIT3):
**The command I used:**
python -m torch.distributed.launch --nproc_per_node=1 run_beit3_finetuning.py --model beit3_large_patch16_224 --input_size 224 -…
-
Hi,
I am trying to follow [COOT's](https://github.com/gingsi/coot-videotext) implementation using a different dataset, ActivityNet-Entities. They used your model to extract the features. What steps s…
-
Hi,
Thanks for your great work.
I have a question: what is the difference between FC and Att2all?
Thanks.