-
Hi~
Link: https://github.com/microsoft/Oscar/blob/master/VinVL_DOWNLOAD.md
Excuse me, why is the Pre-trained Models entry for coco_caption empty in the "VinVL_DOWNLOAD.md" file?
The link to the file is sho…
-
I am looking to create a video processing pipeline in the browser, and I need to pass pre- and post-processing metadata along with the video stream. WebVTT mentions [metadata text](https://w3c.github.io/we…
-
Since English and Chinese grammar differ, would neuraltalk2 work well if I describe my JPG files in Chinese?
-
Hi,
Can you explain how to regenerate sentences using entities and relations?
-
## Description
I followed https://github.com/NVIDIA/TensorRT/blob/release/9.1/demo/HuggingFace/notebooks/blip.ipynb to convert a blip-large model into a TensorRT engine.
Points to note about blip-la…
-
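## Title: The Missing Piece in Vision and Language: A Survey on Comics Understanding
## Link: https://arxiv.org/abs/2409.09502
## Abstract:
In recent years, vision-language models have evolved into versatile systems that can achieve high performance across a wide range of tasks, such as document understanding, visual question answering, and grounding, often in zero-shot settings. As for comics understanding, a complex and multifaceted field, these…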
-
### Environment
🪟 Windows
### System
Windows 10
### Version
???
### Desktop Information
SillyTavern
### Describe the problem
I'm getting the image error when I try to create images of the char…
-
Hey!
I just finished reading your paper -- amazing work and the results look awesome!
I had one query regarding your model capabilities -- As I understand, at inference time, you do take in the …
-
**Describe the bug**
A clear and concise description of what the bug is.
**To Reproduce**
Steps to reproduce the behavior:
1. Go to '.DEPLOYMNET'
2. Click on [https://sap-photo-cap.streamlit.ap…