-
Hi, thanks for sharing your great work!
I'm confused about the RoPE implementation in the cross-attention module of each block.
https://github.com/Tencent/HunyuanDiT/blob/cb709308d92e6c7e8d59d0d…
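For readers following along, this is the generic "rotate-half" formulation of RoPE as it is usually applied to query/key tensors. It is only an illustrative sketch and may not match the exact HunyuanDiT cross-attention code (for instance, which of q/k actually gets rotated there):

```python
# Generic rotate-half RoPE, for illustration only; the HunyuanDiT code may differ.
import torch

def rotate_half(x: torch.Tensor) -> torch.Tensor:
    # Split the last dimension in half and rotate: (x1, x2) -> (-x2, x1).
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rope(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor) -> torch.Tensor:
    # x: [batch, heads, seq_len, head_dim]; cos/sin: [seq_len, head_dim],
    # broadcast over batch and heads.
    return x * cos + rotate_half(x) * sin
```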
-
I'm sorry, but I have to ask a naive question:
If I'm testing the attention module during llama_7b inference, what arguments should I pass to this function?
For example, the input IDs have shape [1, 32],…
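To make the shapes concrete, here is a hedged sketch of the tensors such a call usually involves; the exact argument names and forward signature depend on the library version, so treat them as assumptions rather than the real API:

```python
# Illustrative shapes only; argument names below are assumptions, not the exact API.
import torch

batch, seq_len, hidden_size, vocab_size = 1, 32, 4096, 32000  # llama_7b-style sizes

input_ids = torch.randint(0, vocab_size, (batch, seq_len))   # [1, 32]
hidden_states = torch.randn(batch, seq_len, hidden_size)     # what the attention layer consumes
position_ids = torch.arange(seq_len).unsqueeze(0)            # [1, 32], used for RoPE
# Additive causal mask: 0 where attention is allowed, -inf above the diagonal.
attention_mask = torch.full((seq_len, seq_len), float("-inf")).triu(1)[None, None]  # [1, 1, 32, 32]

# A typical call then looks roughly like:
#   attn_out, *_ = attn_layer(hidden_states, attention_mask=attention_mask, position_ids=position_ids)
```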
-
### Your current environment
```text
Collecting environment information...
WARNING 07-23 19:11:42 _custom_ops.py:14] Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm.…
-
* Goal: Run the [Qwen2-7B](https://huggingface.co/Qwen/Qwen2-7B) model on the TT Wormhole device.
* Changes: Add the directory `models/demos/wormhole/qwen2_7b`.
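For validation, outputs of the TT implementation can be compared against the plain Hugging Face model; the snippet below is an illustrative reference run, not part of the proposed directory (dtype and prompt are assumptions):

```python
# Hypothetical Hugging Face reference run, useful for comparing logits against
# the TT Wormhole implementation; not part of models/demos/wormhole/qwen2_7b.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2-7B")
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2-7B", torch_dtype=torch.bfloat16)

inputs = tokenizer("Hello, Wormhole!", return_tensors="pt")
with torch.no_grad():
    reference_logits = model(**inputs).logits  # compare against device output
```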
## Approach
We will leverage the ex…
-
OS: Windows
I think my environment is ready.
I am using a Jupyter notebook locally.
When I run this:
```python
from unsloth import FastLanguageModel
import torch
max_seq_length = 8192  # Choose any! We auto sup…
```
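For context, the usual unsloth setup that this snippet is the start of looks roughly like the sketch below; the model name and quantization options are assumptions, not the reporter's actual code:

```python
# Sketch of a typical unsloth setup; model_name and options are assumptions.
from unsloth import FastLanguageModel

max_seq_length = 8192
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",  # hypothetical choice
    max_seq_length=max_seq_length,
    dtype=None,          # None = auto-detect (float16 or bfloat16)
    load_in_4bit=True,   # 4-bit quantization via bitsandbytes
)
```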
-
I get this error when `max_relative_positions: -1` or `max_relative_positions: -2`:
```
Traceback (most recent call last):
  File "/opt/conda/bin/onmt_release_model", line 33, in <module>
    sys.exit(load…
-
**Describe the bug**
I used a verified LLaMA 7B HF checkpoint and ran single-threaded inference with bmb.
But the output is just random gibberish, and I'm not sure why.
**Minimal steps to reproduce…
-
- Currently we support Llama 3.2 1B on MLX but not tinygrad
- Add support for Llama 3.2 1B
- It might just work out of the box; if not, I think the issue will be in the changes that were made to RoPE (R…
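For reference, the Llama 3.x family rescales the RoPE inverse frequencies before building the rotary tables, which is the most likely thing a tinygrad port needs to add. Below is a sketch; the default constants mirror what Llama 3.2 configs commonly report, but they are assumptions here and should really be read from the checkpoint's `rope_scaling` block in `config.json`:

```python
# Sketch of Llama 3.x RoPE frequency scaling; default values are assumptions.
import math

def apply_llama3_rope_scaling(freqs, scale_factor=32.0, low_freq_factor=1.0,
                              high_freq_factor=4.0, old_context_len=8192):
    low_freq_wavelen = old_context_len / low_freq_factor
    high_freq_wavelen = old_context_len / high_freq_factor
    scaled = []
    for freq in freqs:
        wavelen = 2 * math.pi / freq
        if wavelen < high_freq_wavelen:      # high-frequency band: left unchanged
            scaled.append(freq)
        elif wavelen > low_freq_wavelen:     # low-frequency band: fully rescaled
            scaled.append(freq / scale_factor)
        else:                                # smooth interpolation in between
            smooth = (old_context_len / wavelen - low_freq_factor) / (high_freq_factor - low_freq_factor)
            scaled.append((1 - smooth) * freq / scale_factor + smooth * freq)
    return scaled
```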
-
**Describe the bug**
I'm trying to use the Llama2 model saved with `--use-dist-ckpt` after SFT (Supervised Fine-Tuning) to train a reward model. The reward model does not require the original checkpo…
-
### System Info
Ubuntu 20.04
tensorrt 10.0.1
tensorrt-cu12 10.0.1
tensorrt-cu12-bindings 10.0.1
tensorrt-cu12-libs 10.0.1
tensorrt-llm …