-
So I was working with both swinv2_tiny_window8_256 and swinv2_base_window12to16_192to256 and noticed that neither loads with torchseg.DeepLabV3Plus:
```
model = torchseg.DeepLabV3Plus(
"…
```
-
I tried to convert the model to Core ML.
First I tried it in the terminal using the code below, which generates two files, an encoder and a decoder:
```
import torch
from PIL import Image
from torchvision import tran…
```
-
I am working with [madlad400](https://huggingface.co/google/madlad400-3b-mt), which is an encoder-decoder model based on the T5 architecture. I am able to load it in TensorRT-LLM in bfloat16. I w…
-
Develop an NN model. Use input from medical jargon lookup tables as test data.
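A minimal, framework-free sketch of how a jargon lookup table could feed an NN as test data. The table entries, the `encode_term` helper, and the 32-dimension hashed-trigram encoding are all illustrative assumptions, not part of the original task:

```python
import hashlib

# Hypothetical lookup table: medical jargon -> plain-language label.
# Entries are illustrative, not from a real clinical vocabulary.
JARGON_TABLE = {
    "myocardial infarction": "heart attack",
    "cerebrovascular accident": "stroke",
    "hypertension": "high blood pressure",
}

def encode_term(term: str, dim: int = 32) -> list[float]:
    """Hash character trigrams into a fixed-length bag-of-ngrams vector,
    so every jargon term becomes a numeric input of the same size."""
    vec = [0.0] * dim
    padded = f"  {term.lower()}  "  # pad so edge characters get trigrams too
    for i in range(len(padded) - 2):
        trigram = padded[i:i + 3]
        h = int(hashlib.md5(trigram.encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    return vec

def build_test_set(table: dict[str, str]) -> list[tuple[list[float], str]]:
    """Turn the lookup table into (feature_vector, label) pairs."""
    return [(encode_term(term), label) for term, label in table.items()]

test_set = build_test_set(JARGON_TABLE)
```

The fixed-length vectors can then be fed to any NN framework as test inputs, with the labels as targets.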
-
**Description**
I am using the SageMaker Triton Inference Server containers to run a multi-model endpoint. One of the models is an MT5 model. I am trying to optimise for latency and think I am losi…
-
### System Info
- `transformers` version: 4.42.3
- Platform: Windows-10-10.0.22631-SP0
- Python version: 3.11.0
- Huggingface_hub version: 0.23.4
- Safetensors version: 0.4.2…
-
### 🐛 Describe the bug
I am using Google Colab with the T4 runtime. The issue is that torch.cuda.empty_cache() cannot clear the GPU RAM for the first instance of an nn.Module moved to CUDA. This also happ…
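For context, a sketch of the pattern that usually does release the memory. `empty_cache()` can only hand back cached blocks that no live tensor references, so the module reference has to be dropped first; `allocated_mb` is a hypothetical helper, and the layer size is arbitrary:

```python
import gc
import torch

def allocated_mb() -> float:
    """CUDA memory currently backing live tensors, in MiB (0 on CPU-only)."""
    if not torch.cuda.is_available():
        return 0.0
    return torch.cuda.memory_allocated() / 2**20

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(4096, 4096).to(device)  # ~64 MiB of parameters
before = allocated_mb()

# empty_cache() returns only *unreferenced* cached blocks to the driver;
# as long as `model` is still referenced, its parameters cannot be freed.
# So: drop every reference, collect garbage, then empty the cache.
del model
gc.collect()
if torch.cuda.is_available():
    torch.cuda.empty_cache()
after = allocated_mb()
assert after <= before  # parameters released (trivially equal on CPU)
```

If `after` is still high on GPU, something else (an optimizer state, a cached activation, an interactive-session variable like `_`) is still holding a reference to the tensors.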
-
I cannot _sheeprl-eval_ my trained model, since the keys in the world model's state_dict have different names.
Stack trace:
Error executing job with overrides: ['checkpoint_path=/home/drt/Deskto…
-
Create an encoder-decoder model:
```Python
def get_encoder(input_shape):
    input_tensor = keras.Input(input_shape, dtype='float32')
    x1 = keras.layers.Conv2D(8, 3, padding='same')(input_ten…
```
-
I noticed that you use an encoder-decoder model (T5) rather than a decoder-only model as the source LLM, since you can "easily get each input doc's hidden_states separately".
If I use a decoder-only model, get eac…
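A toy illustration (not the repo's actual model) of one way to recover per-document states from a decoder-only setup: pack the docs into one sequence, keep a per-token doc-id map, and slice the hidden states by it. A bare `torch.nn.Embedding` stands in for the LLM's hidden states; note that truly isolating packed docs additionally needs a block-diagonal attention mask, which is model-specific and omitted here:

```python
import torch

hidden = 8
embed = torch.nn.Embedding(100, hidden)  # stand-in for the LM's hidden states

# Two hypothetical tokenized docs, packed into a single sequence.
docs = [torch.tensor([1, 2, 3]), torch.tensor([4, 5])]
packed = torch.cat(docs)                                   # shape (5,)
doc_ids = torch.cat(
    [torch.full((len(d),), i) for i, d in enumerate(docs)]  # token -> doc index
)

states = embed(packed)                                     # (5, hidden)
# Slice the packed hidden states back into one tensor per doc.
per_doc = [states[doc_ids == i] for i in range(len(docs))]
```

The simpler alternative, running each doc through the model in its own forward pass, avoids the masking subtlety at the cost of less batching.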