-
It would be wonderful to have some basic correlation structures available for the residuals in brms (compound symmetric, heterogeneous compound symmetric, unstructured, and auto-correlation; see gls e…
-
Hi, I've tried to reproduce the results on the even pairs task, similar to what another user asked about in issue #2.
I've noticed you've posted the hyperparameters to reproduce the results but some…
-
@lucidrains
This is an issue I've been having for a while: the cross-attention is very weak at the start of the sequence.
When the transformer starts with no tokens it will rely on the cross-attention but un…
-
Hi, I just tried to use this custom all-reduce kernel for speculative decoding. I set ENABLE_INTRA_NODE_COMM=1, but I found that the code gets stuck after several iterations. Are there any bugs in this kern…
-
I can get 23 seconds out of it, but you'd think that it would be possible.
Also are the emotional stresses automatic?
-
### Anything you want to discuss about vllm.
Hello Folks,
We are using the DeepSeek Coder model for code completions and chat completions. I tried to run the benchmark scripts for that model…
-
### Description
As discussed in #32, we have now implemented linear extrapolation outside the bounds of the Bernstein polynomial.
The feature becomes active if `linear=True`.
Here is a simple Python…
-
For arxiv papers, specifically.
By the way, Vik, if you'd like a paper with challenging tables for testing marker, here's one:
https://arxiv.org/abs/2311.15131
Here's what I get from a conversi…
-
### Environment
- **Operating System**: Ubuntu Server 20.04.6 LTS
- **Memory**: 256GB
- **GPU**: NVIDIA A100-PCIe-80GB
- **Python Version**: 3.10 (Conda Environment)
- **Storage**: 500GB SSD
- *…
-
Hi, the code runs correctly when I use a single-GPU environment (Autodl),
but when I switch to a multi-GPU environment (4*3090), this RuntimeError occurred.
My hparams:
```
alg_name: "IKE"
model_na…