-
### Description
I have a test case that broke somewhere between jax versions 0.4.19 and 0.4.28. In particular, I am using `jax.dlpack.from_dlpack` on some PyTorch Tensors and then after hitting them …
-
I tried to use RWKV (e.g., Vision-RWKV) in CV tasks, but I found that RWKV shows GPU memory occupancy similar to a full-attention Transformer (like ViT) during training. I found both RWKV and Vision-RWKV only p…
-
Sure, it can do basic stuff, but a deeply nested one is needed.
Also, show overrides not getting stopped on in user files.
Then add DSL examples for other xml based for st for generating image, pdf e…
-
#### Issue
I'm not sure about the proper workflow to use with interpreter vision after reading [this](https://changes.openinterpreter.com/log/local-iii). For the record, I separately installed moondr…
-
https://github.com/user-attachments/assets/8d02dc13-42d0-469e-b86c-46ccd24a6b5a
https://github.com/user-attachments/assets/9de83f0d-a301-4aa0-90d4-fd8d6337ca07
Hello, here is what happened.
At the time I was testing how to upscale videos, and generated these two…
-
[Qwen2Audio huggingface docs](https://huggingface.co/docs/transformers/main/en/model_doc/qwen2_audio)
I see there have been a couple of requests for vision-language model support like LLaVa:
https:…
-
```
model, image_processor, tokenizer = create_model_and_transforms(
    clip_vision_encoder_path="ViT-L-14",
    clip_vision_encoder_pretrained="openai",
    lang_encoder_path=model_p…
```
-
### System Info
- `transformers` version: 4.47.0.dev0
- Platform: Linux-5.15.0-94-generic-x86_64-with-glibc2.35
- Python version: 3.10.15
- Huggingface_hub version: 0.26.2
- Safetensors version: …
-
### System Info
CUDA Version: 12.4
GPU: A6000
### Information
- [ ] The official example scripts
- [ ] My own modified scripts
### 🐛 Describe the bug
After finetuning Llama3.2 visio…
-
### Checklist
- [ ] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
- [ ] 3. Please note that if the bug-related issue y…