-
transformers-cli convert --model_type albert \
--tf_checkpoint $ALBERT_BASE_DIR/model.ckpt-64000 \
--config $ALBERT_BASE_DIR/albert_config.json \
--pytorch_dump_output $ALBERT_BASE_DIR/pytorc…
-
## Description
Exception in thread "main" ai.djl.engine.EngineException: Failed to load PyTorch native library
at ai.djl.pytorch.engine.PtEngine.newInstance(PtEngine.java:83)
at ai.…
-
### Describe the bug
I tried to load a very simple .ort model (attached and also in the repo linked below) into my React Native app after converting it from .onnx but it gave the error `[Error: Can't…
-
Let’s face it. KenLM has served us well…
…but it has its limitations. It didn’t age well as a language model architecture.
The first order of business is to compute a bidirectional vector representa…
-
### System Info
- `transformers` version: 4.42.4
- Platform: Linux-6.2.0-39-generic-x86_64-with-glibc2.35
- Python version: 3.11.9
- Huggingface_hub version: 0.23.4
- Safetensors version: 0.4.3
…
-
Hi,
Without using transformers / accelerate blablabla, what are the constraints on the model to be tensor parallelizable?
Does it need to be an nn.Sequential? Do input dimensions need to be alwa…
-
Hi,
I'm not so much into the details of whisper or whisper.cpp and I don't know if it is currently even possible with the foundation, but it would be nice if speakers could be marked or speaker-cha…
-
**Describe the bug**
I am trying to convert the default `mamba.nemo` file (I converted [from Hugging Face](https://huggingface.co/nvidia/mamba2-8b-3t-4k/tree/main) .pt to .nemo) to have `tensor_parall…
-
### Bug description
See the Python code below: using `all_gather` together with `is_global_zero` makes the program hang forever, with no other error messages.
If `is_global_zero` is removed, it finishes successfully.…