-
I believe the following line of code in the notebook is incorrect:
`transformer_input_expanded = model.transformer[0].linear[0](transformer_input)[0]`
This is taking the hidden state of the MLP ('li…
-
### Description
This excerpt, as well as others in the article "Mamba: Linear-Time Sequence Modeling with Selective State Spaces", has rendering errors.
### (Optional:) Please add any files, screensho…
-
### 🐛 Describe the bug
Using nested tensors generated with `torch.narrow` as inputs to `torch.nn.functional.scaled_dot_product_attention` works fine in the forward pass of the model. However, both …
-
Hi MSstats team. I'm not sure whether this code is intended to confirm the existence of technical replicates in the data, but using **all** returns false for my result.
https://github.com/Vitek-Lab/MS…
-
```
from fastfit import FastFit
model = FastFit.from_pretrained("fast-fit")
model
```
gives
```
FastFit(
(encoder): MPNetModel(
(embeddings): MPNetEmbeddings(
(word_embedding…
-
### System Info
- `transformers` version: 4.46.0.dev0
- Platform: macOS-15.0-arm64-arm-64bit
- Python version: 3.11.6
- Huggingface_hub version: 0.25.1
- Safetensors version: 0.4.5
- Accelerate …
-
Bring up Llama 3.2 model family on Wormhole, T3K and TG
-
Dear author,
I am sure that all of my package versions are correct. I used CUDA 10.1 to match Torch 1.4.
However, I get the following error during evaluation:
Traceback (most…
-
Following the `README.md`, I tested the `subject_driven_generation`:
```bash
sh train_sdxl_lora_cat.sh
python3 infer.py
```
and got low-quality images from the mixer model, while the vanilla lora rem…
-
### How are you running AnythingLLM?
Docker (local)
### What happened?
Docker sees my models. When I start chatting in my workspace, I get the error "Failed to load model":
```
anythingllm |…