-
We should have a way to encode bag-of-words representations of lyrics (over a time interval).
This could be encoded as tuples: `(string, count)`, or as a dictionary: `{string: count}`.
Alternatel…
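
For illustration, here is a minimal Python sketch of the two encodings mentioned above. It assumes simple lowercase whitespace tokenization, and the field names `time`, `duration`, and `value` are hypothetical placeholders, not an agreed schema.

```python
from collections import Counter


def bag_of_words(lyrics):
    """Count lowercase, whitespace-separated tokens (an assumed tokenization)."""
    return dict(Counter(lyrics.lower().split()))


# Hypothetical annotation: a time interval in seconds plus its bag of words.
annotation = {
    "time": 12.0,       # assumed field name: interval start
    "duration": 4.5,    # assumed field name: interval length
    "value": bag_of_words("la la love"),
}

# Dictionary encoding: {string: count}
as_dict = annotation["value"]

# Tuple encoding: a list of (string, count) pairs, sorted for a stable order
as_tuples = sorted(as_dict.items())
```

The two forms convert losslessly in either direction (`dict(as_tuples)` recovers the dictionary), so the choice is mostly about which serializes more conveniently.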
-
I am running the "basic" swbd script on a new corpus with vocab size 200000.
Just after the start of pruning it fails as below:
pre-arpa-to-arpa: success
format_arpa_lm.py: succeeded formatting ARPA …
-
Hi Dan,
I have a few questions on the scope of this project. I understand this is merely an LM creation tool with bells and whistles to optimize perplexity and such.
I have 2 major points that …
-
OK, prune_lm_dir.py is now working; there is an example of its use in egs/self_test/.
Testing is needed now, and comparisons with other toolkits' pruning methods.
Unfortunately, getting the perplexity…
-
```
I'm not sure if this is a feature or a bug, but when using estimate-ngram
with -v option, the words specified in the vocabulary but that are not seen
in the training data do not appear in the resu…
-
See #639.
> @kkm000: What I can do is ignore highest-order n-grams down until there is an order of non-zero cardinality. I think this is what actually happens here. This could be useful as it can pot…
-
Adding new abbreviations to libpostal involves 4 steps:
1. Edit a text file in [dictionaries](https://github.com/openvenues/libpostal/tree/master/resources/dictionaries)
2. Run `python scripts/geodata…