-
python run_awq.py --model_name Qwen/Qwen1.5-7B-Chat --task quantize
Namespace(model_name='Qwen/Qwen1.5-7B-Chat', target='aie', profile_layer=False, task='quantize', precision='w4abf16', flash_attenti…
-
Running demo.py with the default parameters works correctly, but the following command fails with an error:
python .\demo.py --image_path .\demo.png --ckpt_path U4R/StructTable-InternVL2-1B --output_format latex
T…
-
```python
5: [rank5]: File "/workspace/megatron/core/transformer/transformer_block.py", line 493, in forward
5: [rank5]: hidden_states, context = layer(
5: [rank5]: File "/workspace/megatro…
```
-
- 2021
- https://arxiv.org/abs/2109.12036
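Natural language exhibits patterns of hierarchically governed dependencies, in which relations between words are sensitive to syntactic structure rather than linear order.
Recurrent network models often fail to generalize in a hierarchically sensitive way when trained on ambiguous data (McCoy et al., 2020), but the newer Transformer language models (Vaswani…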
e4exp updated
3 years ago
-
When I run the code according to README.md:
cd MobileUNETR
cd experiments/isic_2016/exp_2_dice_b8_a2/
# the default gpu device is set to cuda:0 (you can change it)
CUDA_VISIBLE_DEVICES="0" accelerate …
-
**Describe the bug (Mandatory)**
A clear and concise description of what the bug is.
Qwen2.5-coder-14B: single-node, multi-card inference raises an error
- **Hardware Environment (`Ascend`/`GPU`/`CPU`)**:
> Please delet…
-
**Describe the bug**
I followed the LLama3 example and ran into the issues below on GPT-J:
```
2024-11-05T15:58:25.267390+0000 | _check_compile_recipe | INFO - Recipe compiled and 1 modifiers cre…
```
-
## 🐛 Bug
### To Reproduce
Here's a problematic pattern where Thunder's rematerialization algorithm is not effective:
```py
from torch.nn import Linear
import torch
import thunder
bl…
```
-
Hello,
Currently the feature to merge a Flux LoRA into the base model does not function properly with LoRAs trained by Ostris' AI-Toolkit. This appears to be due to a difference in the way the keys…
CCpt5 updated
2 months ago
-
When opening the URL (http://0.0.0.0:7860) I get the "can't reach this page" message. I don't get any errors while loading, apart from the "No module named 'triton'" one, which I assume is normal on …
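
One thing that may be worth checking for the report above: `0.0.0.0` is normally a wildcard bind address ("listen on all interfaces") rather than a destination, and some browsers and operating systems refuse to open it directly. The sketch below is not taken from the project in question; `can_connect` is a hypothetical helper, and port 7860 is simply the port quoted in the report. It only tests whether something is accepting TCP connections on the loopback address.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds.

    Hypothetical helper for illustration; not part of any project's API.
    """
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and resolution failures.
        return False


if __name__ == "__main__":
    # Port 7860 is the one mentioned in the report; adjust if yours differs.
    for host in ("127.0.0.1", "localhost"):
        print(host, "reachable:", can_connect(host, 7860))
```

If the check succeeds, opening `http://127.0.0.1:7860` or `http://localhost:7860` in the browser is likely to work where `http://0.0.0.0:7860` does not. If it fails, the server probably did not start listening at all.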