-
Beam search is generally not supported by our models, though the flag exists. It appears to be supported in `lstm`; it is unclear to me whether it's supported by `feature_invariant_transformer`, `tran…
-
The current implementation of the `get_beam_search_score` method in `vllm/sequence.py` seems to incorrectly include the prompt length in the sequence length when calculating the beam score. This dev…
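For illustration, a minimal sketch (hypothetical names, not the actual vLLM code) of the length normalization the report argues for, where the prompt tokens are excluded before the length penalty is applied:
```python
# Hypothetical sketch, not the actual vLLM implementation: normalize the
# cumulative logprob by the number of *generated* tokens only, so the prompt
# length does not inflate the denominator of the beam score.
def beam_search_score(cumulative_logprob: float,
                      seq_len: int,
                      prompt_len: int,
                      length_penalty: float = 1.0) -> float:
    gen_len = seq_len - prompt_len  # generated tokens only
    return cumulative_logprob / (gen_len ** length_penalty)
```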
-
Unable to use KenLM rescore due to missing logprobs on transcribe.
**Steps/Code to reproduce the bug**
1. Cloned the repo [7916269](https://github.com/NVIDIA/NeMo/commit/79162696ea8c48734a260dd2…
-
Hi,
I am getting around 3% WER with fast-beam-search and greedy-search. However, I am getting 70% WER when I use fast-beam-search-ngram. My decode configuration is shown below. I am using pruned_tran…
-
I am trying to decode a model trained with the recipe **pruned_transducer_stateless7_streaming**. I am able to decode successfully with fast beam search (without LG); however, when I try to decode with LG …
-
Running the example code from the PDF causes a "no commandBuffer found" error
```bash
(tinygrad) ➜ tinygrad git:(master) DEBUG=2 BEAM=2 python -c "from tinygrad import Tensor; Tensor.rand(16,3,64,64).conv2d(…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Problem description
Steps to reproduce:
1. Switch to BIM workbench
2. Click on the BIM_Beam command
3. Se…
-
This only happens with `BEAM=1`; `BEAM=0`, `BEAM=2`, and `BEAM=3` all work fine.
This happens because exo runs tinygrad inference on another thread.
Example command to reproduce: `DEBUG=6 BEAM=1 python3 …
-
Investigate using beam search to do a greedy-style search of the HS Code tree, as sketched below.
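A hedged sketch of the idea, with hypothetical helpers (`children`, `score`) standing in for whatever tree access and scoring the project actually uses; with `beam_width=1` it reduces to a greedy descent of the tree.
```python
from typing import Callable, List, Tuple

def beam_search_tree(root: str,
                     children: Callable[[str], List[str]],
                     score: Callable[[List[str]], float],
                     beam_width: int = 3) -> List[Tuple[List[str], float]]:
    """Keep the beam_width highest-scoring partial paths at each tree level."""
    beam: List[Tuple[List[str], float]] = [([root], 0.0)]
    while True:
        candidates: List[Tuple[List[str], float]] = []
        expanded = False
        for path, s in beam:
            kids = children(path[-1])
            if not kids:                      # leaf: carry the finished path forward
                candidates.append((path, s))
                continue
            expanded = True
            for child in kids:
                new_path = path + [child]
                candidates.append((new_path, score(new_path)))
        if not expanded:                      # every surviving path reached a leaf
            return candidates
        candidates.sort(key=lambda c: c[1], reverse=True)
        beam = candidates[:beam_width]
```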
-
```
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers_cfg.grammar_utils import IncrementalGrammarConstraint
from transformers_cfg.generation.logits_process import Gramm…