-
![Image](https://github.com/user-attachments/assets/459f5917-ac00-449c-8e15-b4bb3d840255)
The y-axis is MFU and the x-axis is the training step.
I'm testing Qwen 72B with the Hugging Face Trainer and whenever I trai…
-
Hi, thanks for your work. I am confused by the output, specifically the confidence score.
I tested enzyme 2q8m with its substrate fbp (as shown in the figure). I got a confidence score of 0.66. However, th…
-
### Describe the feature request
Wasm Relaxed SIMD includes integer dot product instructions, which map to VNNI instructions on x86-64 platforms with AVX-VNNI (on ARM maybe SDOT, but I haven't t…
-
#### Subpoint 4.1: Limited Recognition Based on Training Patterns
**Ticket Title**: Address Data Drift from Variability in Training Patterns
**Description**: Investigate and implement strategi…
-
Hi.
I'm trying to train a Swin-L backbone-based model on HICO-DET.
```
# Training
DETR=advanced python main.py --backbone swin_large --use-checkpoint \
--drop-path…
-
I recently attempted to use the BGE-M3 model by loading it with SentenceTransformer. However, I noticed suboptimal performance. I observed that the sample code on Hugging Face uses the loading method `…
-
I'm observing performance regressions for BERT and BART model inference with JAX mainline compared to jax-v0.4.34 on both x86 and arm64 CPU platforms. The performance drop is around 50%. I have root-c…
-
### Have you completed your first issue?
- [X] I have completed my first issue
### Guidelines
- [X] I have read the guidelines
- [X] I have the link to my latest merged PR
### Latest Merged PR Lin…
-
### Feature request
I would like to request that BetterTransformer not be deprecated.
### Motivation
I have come to rely on BetterTransformer significantly for accelerating RoBERTa and BERT models.…
-
Hi! I hope you're doing well. I have a couple of questions regarding the class handling in your code.
In lesion_dataset.py and hr_idrid_2880x1920-slide.py, the classes are defined as ['bg', 'EX…