-
### Search before asking
- [X] I searched the [issues](https://github.com/ray-project/kuberay/issues) and found no similar issues.
### KubeRay Component
ray-operator, apiserver
### What happened …
-
I am trying to quantize llama-2-7b-chat-hf with gpt-fast using:
`python quantize.py --mode int4 --groupsize 32`
on Kaggle with 2× T4 GPUs.
I have installed the PyTorch nightly using:…
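As background, here is a minimal NumPy sketch of what group-wise int4 quantization with group size 32 computes, assuming symmetric per-group scales; this is an illustration of the idea, not gpt-fast's actual implementation:

```python
import numpy as np

def quantize_int4_groupwise(w, group_size=32):
    """Symmetric per-group int4 quantization of a 1-D weight vector.

    Each group of `group_size` values gets its own scale so that the
    largest magnitude in the group maps to the int4 extreme (7).
    """
    w = w.reshape(-1, group_size)                      # (n_groups, group_size)
    scales = np.abs(w).max(axis=1, keepdims=True) / 7  # one scale per group
    scales[scales == 0] = 1.0                          # avoid division by zero
    q = np.clip(np.round(w / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize(q, scales):
    return (q.astype(np.float32) * scales).reshape(-1)

rng = np.random.default_rng(0)
w = rng.normal(size=128).astype(np.float32)
q, s = quantize_int4_groupwise(w, group_size=32)
w_hat = dequantize(q, s)
# Per-element reconstruction error is bounded by half a quantization step.
print(np.abs(w - w_hat).max())
```

The smaller the group size, the tighter each scale fits its values, at the cost of storing more scales alongside the packed int4 weights.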
-
With the new version of transformers there is no need to use BetterTransformer; try setting the attn implementation to sdpa.
attn impl:
Unused kwargs: ['_load_in_4bit', '_load_in_8bit', 'quant_method']. These kwargs are not used …
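For reference, the sdpa implementation computes standard scaled dot-product attention. A minimal NumPy sketch of the math it evaluates (no masking, dropout, or the fused kernels PyTorch actually dispatches to):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """softmax(q @ k^T / sqrt(d)) @ v, the core of the sdpa kernel."""
    d = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d)    # (..., seq_q, seq_k)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ v

rng = np.random.default_rng(0)
q = rng.normal(size=(2, 5, 8))  # (batch, seq, head_dim)
k = rng.normal(size=(2, 5, 8))
v = rng.normal(size=(2, 5, 8))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (2, 5, 8)
```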
-
I run a quite simple flow:
```scala
Vegas("Country Pop").
  withData(
    Seq(
      Map("country" -> "USA", "population" -> 314),
      Map("country" -> "UK", "population" -> 64),
      …
```
-
Thanks for using our 3D RNA-seq App.
Hi! Thanks for creating this great App. I have a problem with loading the gene IDs. The App only manages to get them from the .fasta file, while if I give it a .gt…
-
I fine-tuned llama3.1 8b bnb 4-bit according to your recommendations with my own train+eval dataset and saved it as a merged 16-bit model. I now want to run inference by loading the 16-bit merged model and usin…
-
The script I ran to cause this error:
`python -m examples.models.llama2.export_llama --checkpoint /Users/anthonymikinka/executorch/llama-2-7b-chat/consolidated.00.pth --params /Users/anthonymikinka/exe…
-
Fantastic work. Did you find a method for downloading arXiv files for only a specific topic, e.g. physics?
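For what it's worth, the arXiv API does support restricting a query to a subject category. A small sketch that builds such a query URL (the category `physics.optics` is just an example placeholder):

```python
from urllib.parse import urlencode

def arxiv_query_url(category, start=0, max_results=10):
    """Build an arXiv API query URL restricted to one subject category."""
    params = {
        "search_query": f"cat:{category}",  # e.g. cat:physics.optics
        "start": start,
        "max_results": max_results,
    }
    return "http://export.arxiv.org/api/query?" + urlencode(params)

url = arxiv_query_url("physics.optics", max_results=25)
print(url)
```

Fetching this URL returns an Atom feed that can be parsed with the stdlib XML modules; bulk downloading is subject to arXiv's rate limits, so space out requests.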
-
The program below outputs a rather mysterious error:
```rust
pub struct A {
inner: [usize; 4],
}
impl A {
#[pure]
#[requires(index < self.inner.len())]
pub fn is_valid(&sel…
```
-
### 🐛 Describe the bug
I'm trying to apply static quantization to a model that uses an `nn.TransformerEncoderLayer`, but when running the model I get the following error:
```
File "/envs/trans…
```
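As background, static quantization maps float activations to int8 through an affine scale/zero-point derived from observed min/max ranges. A minimal pure-NumPy sketch of that mapping, assuming a simple min-max observer and an unsigned 8-bit range (illustrative only, not PyTorch's implementation):

```python
import numpy as np

def affine_qparams(x_min, x_max, qmin=0, qmax=255):
    """Scale/zero-point a min-max observer would derive for uint8.

    Assumes x_min < x_max after widening; the range is widened to
    include 0 so that real zero is exactly representable.
    """
    x_min, x_max = min(x_min, 0.0), max(x_max, 0.0)
    scale = (x_max - x_min) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    return scale, zero_point

def quantize(x, scale, zero_point, qmin=0, qmax=255):
    return np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

x = np.linspace(-1.0, 2.0, 7, dtype=np.float32)
scale, zp = affine_qparams(float(x.min()), float(x.max()))
q = quantize(x, scale, zp)
x_hat = dequantize(q, scale, zp)
# In-range values round-trip with error at most half a quantization step.
print(np.abs(x - x_hat).max())
```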