-
One of the checkpoints from #8349
After turning on optimised attention for falcon7B prefill, we discovered a segmentation fault when it is run with 1k or 2k sequence lengths.
It is occurr…
-
When a change is saved, the diff is still there.
-
On a Jetson Orin Nano 8G,
when running `make chat`:
```
(TinyChatEngine) cpi@ubuntu:~/github/mit-han-lab/TinyChatEngine/llm$ make chat
CUDA is available!
src/Generate.cc src/GPTBigCodeGenerate.cc src/GP…
```
-
I have now created all tracks for next week (week 35) from Fra Kåre in the BMM backend and uploaded only Norwegian.
Then I logged in to the BMM Upload tool and uploaded the translation for English. Here th…
-
### 🐛 Describe the bug
## Description
Before https://github.com/pytorch/pytorch/pull/123732, when running with AOTI, the SDPA pattern can be hit and `torch.ops.aten._scaled_dot_product_flash_atten…
-
In the file Model.py, in the forward function of class ParserModel,
`arc_logit = self.arc_biaffine(x_arc_dep, x_arc_head)`
and in the file Layer.py, in the forward function of class Biaffine,
`biaffine …
-
Need to restructure the output a bit in
https://en.wikipedia.org/wiki/METAR
- Location information & type of measurement station
- Cloud
- Wind
- Visibility
- Temperature
- Sky / Clouds
- Humi…
-
### 🚀 The feature, motivation and pitch
When using `einsum` in performance-sensitive code (and not caring about gradients), it is not good that it allocates a new tensor for the result. Then `numpy` …
-
BevFormer training in Paddle3D fails at startup with:
OSError: (External) CUBLAS error(15).
Detailed log:
File "/root/jiajinrang/Paddle3D/paddle3d/models/detection/bevformer/bevformer.py", line 149, in obtain_history_bev
prev_bev …
jjrCN updated
8 months ago
-
wxthu updated
2 years ago