-
You will see the problem in the text below. This is with gpt-4o and version 0.5 of Agent Zero, but I have similar issues with other models.
User message ('e' to leave):
> Write a college level …
-
# Papers
- Sapiens: Foundation for Human Vision Models
  - A human foundation model from Meta, impressive
  - 2D pose estimation, body-part segmentation, depth prediction, and normal prediction in a single model …
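The "one model, four tasks" idea can be pictured as a shared backbone feeding separate task heads. A minimal NumPy sketch of that pattern, where every layer shape, head name, and output dimension is hypothetical and not taken from the actual Sapiens architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shared backbone: one linear layer standing in for a large encoder.
W_backbone = rng.standard_normal((16, 64))

# One small head per task; every head reads the SAME shared features.
heads = {
    "pose":         rng.standard_normal((64, 17 * 2)),  # 17 keypoints, (x, y) each
    "segmentation": rng.standard_normal((64, 20)),      # 20 body-part classes
    "depth":        rng.standard_normal((64, 1)),       # one depth value
    "normal":       rng.standard_normal((64, 3)),       # surface normal (nx, ny, nz)
}

def forward(x):
    """Run the shared backbone once, then every task head on the same features."""
    features = np.tanh(x @ W_backbone)  # shared representation, computed once
    return {task: features @ W for task, W in heads.items()}

x = rng.standard_normal((1, 16))        # one dummy input vector
outputs = forward(x)
for task, out in outputs.items():
    print(task, out.shape)
```

The point of the structure is that the expensive backbone pass is shared, and adding a task only costs one extra head.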
-
Hi!
This is only a draft summary of papers and implementations of Mamba.
I will post my feedback here, from an Orin AGX 64GB.
Original paper:
(arXiv 2024.01) Vision Mamba: Efficient Visual…
-
```meta
Time: 2024-09-09 6:00PM Eastern
UTCTime: 2024-09-09 22:00 UTC
Duration: 2h
Location: ATL BitLab, 684 John Wesley Dobbs Ave NE, Unit A1, Atlanta, GA 30312
```
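The Eastern/UTC pair above can be checked with a few lines of stdlib Python; the date and times are the ones from the block, and `zoneinfo` resolves the daylight-saving offset (EDT, UTC-4, on that date):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# 6:00 PM Eastern on 2024-09-09
local = datetime(2024, 9, 9, 18, 0, tzinfo=ZoneInfo("America/New_York"))
utc = local.astimezone(ZoneInfo("UTC"))
print(utc.strftime("%Y-%m-%d %H:%M UTC"))  # 2024-09-09 22:00 UTC
```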
![aitl-ai-builders-septemb…
-
### Feature Name
Research about Stability.ai
### Feature Description
This is research about Stability.ai: learning more about its supported models, how it is used, and more.
### Motivati…
-
### System Info
CogVideoX-5B
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Reproduction
Thanks for the open sou…
-
Hello everyone,
I have been working on replicating benchmarks related to video-class Large Language Models (LLMs), and I've noticed that most of these benchmarks rely on the GPT-assistant framework…
-
Hi,
Thank you for your outstanding work! Without a doubt, your recently published VILA v1.5 series pushes the boundaries of multimodal large language models. It is arguably the most powerful and us…
-
I'm getting poor transcription results using whisperx. Specifically, I sometimes get no transcription at all from some short videos, whereas OpenAI's official whisper model transcribes them corr…
-
Thanks for the repo and models! When trying to run demo.sh with the 34b model (after commenting and uncommenting the relevant lines), I am getting nonsense output with the example video and prompt:
```
##…