-
Great work!
I hope to see the full code of this project released soon.
I have a question about the Refinement Transformer.
You mentioned that BAMM uses an RVQ architecture and also uses a Refinement Transformer for g…
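To make the question concrete, here is a minimal, generic sketch of residual vector quantization (RVQ); it is not BAMM's implementation (the official code is not released yet), and all names (`SimpleRVQ`, `num_quantizers`, `codebook_size`, `dim`) are placeholders of my own:

```python
# Generic residual vector quantization sketch, NOT BAMM's implementation.
# All names (SimpleRVQ, num_quantizers, codebook_size, dim) are placeholders.
import torch
import torch.nn as nn

class SimpleRVQ(nn.Module):
    def __init__(self, num_quantizers=6, codebook_size=512, dim=256):
        super().__init__()
        self.codebooks = nn.ParameterList(
            [nn.Parameter(torch.randn(codebook_size, dim)) for _ in range(num_quantizers)]
        )

    def forward(self, x):
        # x: (batch, seq, dim); each stage quantizes the residual left by the previous stage
        residual, quantized, indices = x, torch.zeros_like(x), []
        for codebook in self.codebooks:
            flat = residual.reshape(-1, residual.size(-1))     # (batch*seq, dim)
            idx = torch.cdist(flat, codebook).argmin(dim=-1)   # nearest code id per token
            codes = codebook[idx].view_as(residual)            # (batch, seq, dim)
            quantized = quantized + codes
            residual = residual - codes
            indices.append(idx.view(x.shape[:-1]))
        # indices[0] is the coarse (base) layer; the rest are residual/refinement layers
        return quantized, torch.stack(indices)
```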
-
Is there a strict requirement for GPUs that support flash_attention? I tried to test on a V100, but this GPU does not support flash_attention, which results in a RuntimeError: No available …
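For anyone hitting this on older GPUs, a minimal sketch of a workaround, assuming the model is loaded through transformers (the model id below is a placeholder): flash-attn kernels require Ampere (sm80) or newer, and a V100 (Volta) reports sm70, so the code falls back to PyTorch's SDPA there.

```python
# Hedged sketch: pick the attention backend from the GPU's compute capability.
# flash-attn kernels require Ampere (sm80) or newer; a V100 (Volta) reports sm70.
import torch
from transformers import AutoModelForCausalLM

major, minor = torch.cuda.get_device_capability()
attn_impl = "flash_attention_2" if major >= 8 else "sdpa"   # SDPA works on V100

model = AutoModelForCausalLM.from_pretrained(
    "your-model-id",                    # placeholder, not a real checkpoint
    torch_dtype=torch.float16,
    attn_implementation=attn_impl,
)
```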
-
Following a "tech architecture" presentation with @maxime-siret @AntoineAugusti and @Brewennn this morning, where we covered the architecture but also the stakes behind it, I am creating this ticket about the possibilities …
-
### 🚀 The feature, motivation and pitch
Flash Attention 3 (https://github.com/Dao-AILab/flash-attention) has been in beta for some time. I tested it on H100 GPUs with CUDA 12.3 and also attempted a…
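For reference, a minimal way to exercise the kernel on such a machine is to call it directly. The snippet below uses the stable flash-attn 2.x Python API; the FA3 beta exposes a similar `flash_attn_func` but ships as a separate package, so the import is an assumption about which wheel is installed.

```python
# Direct kernel smoke test using the flash-attn 2.x API; the FA3 beta is packaged
# separately, so this import is an assumption about the installed wheel.
import torch
from flash_attn import flash_attn_func

batch, seqlen, nheads, headdim = 2, 2048, 16, 64
q = torch.randn(batch, seqlen, nheads, headdim, dtype=torch.bfloat16, device="cuda")
k, v = torch.randn_like(q), torch.randn_like(q)

out = flash_attn_func(q, k, v, causal=True)   # (batch, seqlen, nheads, headdim)
print(out.shape)
```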
-
### Feature request
I want to add the ability to use GGUF BERT models in transformers.
Currently, the library does not support this architecture. When I try to load such a model, I get a TypeError: Ar…
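For context, this is roughly how GGUF loading already works for the architectures that are supported; the repo and file names below are placeholders, not real checkpoints. The request is to make the same path work for BERT.

```python
# Sketch of the existing GGUF loading path in transformers for supported
# architectures; repo_id and gguf_file are placeholders, not real artifacts.
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "some-org/some-model-GGUF"     # placeholder GGUF repository
gguf_file = "model.Q4_K_M.gguf"          # placeholder quantized file

tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)
# The GGUF tensors are dequantized into regular PyTorch weights on load.
```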
-
## Overview
The focus for this code review will be centered around GraphExamples.jsx and ImageCarousel.jsx.
Please pay attention to:
* JavaScript issues
* React components
## Review Br…
-
# URL
- https://arxiv.org/abs/2406.15786
# Affiliations
- Shwai He, N/A
- Guoheng Sun, N/A
- Zheyu Shen, N/A
- Ang Li, N/A
# Abstract
- While scaling Transformer-based large language models …
-
### Feature request
Flash Attention 2 is a library that provides attention operation kernels for faster and more memory-efficient inference and training: https://github.com/Dao-AILab/flash-attentio…
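To illustrate the memory-efficiency point, here is a sketch of the packed (variable-length) attention call the library exposes, which avoids padding entirely; the per-sequence lengths below are made up for the example.

```python
# Sketch of flash-attn's packed/variable-length path; per-sequence lengths here
# are arbitrary example values.
import torch
from flash_attn import flash_attn_varlen_func

nheads, headdim = 8, 64
seqlens = [5, 9, 3]                      # three sequences packed without padding
total = sum(seqlens)

q = torch.randn(total, nheads, headdim, dtype=torch.float16, device="cuda")
k, v = torch.randn_like(q), torch.randn_like(q)

# cumulative sequence lengths, int32, shape (batch + 1,)
cu_seqlens = torch.tensor([0, 5, 14, 17], dtype=torch.int32, device="cuda")

out = flash_attn_varlen_func(
    q, k, v,
    cu_seqlens_q=cu_seqlens, cu_seqlens_k=cu_seqlens,
    max_seqlen_q=max(seqlens), max_seqlen_k=max(seqlens),
    causal=True,
)
```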
-
## Overview
The focus for this code review will be centered around the auditedBalanceCollection.js and AuditedBalanceSchemaInput.jsx.
Please pay attention to:
* JavaScript issues
* React …
-
# ComfyUI Error Report
## Error Details
- **Node Type:** ApplyPulidFlux
- **Exception Type:** NotImplementedError
- **Exception Message:** No operator found for `memory_efficient_attention_forwa…
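A quick way to reproduce the failure outside ComfyUI, assuming the error originates in xformers' `memory_efficient_attention` (tensor shapes below are arbitrary placeholders):

```python
# Standalone check for the xformers operator the node relies on; tensor shapes
# are arbitrary placeholders. `python -m xformers.info` lists available kernels.
import torch
import xformers.ops as xops

q = torch.randn(1, 128, 8, 64, dtype=torch.float16, device="cuda")
k, v = torch.randn_like(q), torch.randn_like(q)

try:
    out = xops.memory_efficient_attention(q, k, v)
    print("memory_efficient_attention OK:", out.shape)
except NotImplementedError as err:
    # Usually means the installed xformers wheel has no CUDA kernel for this
    # GPU / dtype combination.
    print("no kernel available:", err)
```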