-
First of all, I found I had to change ./configure by removing the leading `../` and trailing `/` in the path:
```
(cd build >/dev/null 2>&1 && cmake -DEIGEN3_INCLUDE_DIR=eigen .. "$@")
```
so…
-
I am generating pretraining data for Hindi with a SentencePiece vocab and get the following error:
```
python build_pretraining_dataset.py --corpus-dir data --vocab-file spiece.voca…
-
The current export options are:
* WARC
* WARC.gz
* WARC-with-resources
* WARC-with-resources.gz
* CSV
#245 suggests adding ZIP as an option and #233 calls for ways of restricting the reso…
-
Currently, it's ~70%. We could try a bigger batch, but that also depends on the language.
[GCP console for translate-mono task](https://console.cloud.google.com/monitoring/dashboards/builder/a6c8749…
-
Traceback (most recent call last):
  File "src/run_uie.py", line 560, in <module>
    main()
  File "src/run_uie.py", line 296, in main
    raw_datasets = load_dataset(
  File "/root/miniconda3/envs/inst…
-
The metadata field names in the metadata view and the VC builder are numerous and hard for users to understand, for example "pubDate" or "corpusEditor".
There could be different solutions …
-
In lieu of an automated builder, I run the x/build tests in Travis CI. @andybons has fixed two problems with cmd/cl and with cloud.google.com dependencies, but this has revealed a number of underlying …
-
I can't figure out what causes this, but it doesn't happen all the time. A few days ago, it happened, but then on my next `archive`/`consume` cycle it didn't.
Just upgraded to 2.2.2 etc.
```
ebooks…
-
A la https://github.com/tensorflow/datasets/issues/120, it would be helpful to have an estimate of how large each dataset is before downloading. Ideally, a breakdown by feature would be nice.
Curr…
-
Hi. I've just tried to compile lmplz and ran into a segmentation fault. I also hit errors while installing KenLM with Boost version 1.65, which I was able to resolve. …