-
- https://arxiv.org/abs/2103.12731
- 2021
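Self-attention promises to improve computer vision systems through parameter-independent scaling of receptive fields and content-dependent interactions, in contrast to the parameter-dependent scaling and content-independent interactions of convolutions.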
Self-attention models are, like ResNet-50…
e4exp updated
3 years ago
-
### System Info
- `transformers` version: 4.44.2
- Platform: macOS-15.1-arm64-arm-64bit
- Python version: 3.10.14
- Huggingface_hub version: 0.23.3
- Safetensors version: 0.4.3
- Accelerate vers…
-
I want to train a Flux LoRA on text crops, which are usually small. So my multidatabackened.json looks like this:
[
{
"id": "pseudo-camera-10k-flux",
"type…
-
**Describe the bug**
`query_input` has shape [batch, pos, n_heads, d_model], and the code where the error occurred is meant to reshape `query_input` to [batch, pos, n_heads, d_head].
I found t…
-
### Description
Perhaps I am using this function incorrectly, but I get data leaks when using `key_value_seq_lengths`. It appears as though both the `xla` and `cudnn` implementations in jax nightly…
-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/axolotl-ai-cloud/axolotl/labels/bug) and didn't find any similar reports.
###…
-
pip install "unsloth[cu121-ampere-torch240] @ git+https://github.com/unslothai/unsloth.git"
pip install "unsloth[cu118-ampere-torch240] @ git+https://github.com/unslothai/unsloth.git"
pip install "u…
-
### 🐛 Describe the bug
I'm trying to finetune Llama2-7B (to reproduce the experiments in a paper) using PEFT LoRA (0.124% of trainable params). However, this results in an out-of-memory (OOM) error o…
-
Hi DeepSpeed team,
Thank you for your great work!
As the title suggests, the "01-ai/Yi-34B-Chat" model cannot run properly with DeepSpeed-MII version 0.2.3.
The encountered error message is …
-
Hello everyone,
I am trying to extract features from images for my project and keep getting this error.
I have successfully installed detectron2 but get this error when trying to…