-
### Description
CloudFront limits restrict the size of data corpus files that can be downloaded to 50 GB. Thus, in the case of the `big5` workload, for example, a corpus with an uncompressed size of…
-
Hi dear author,
I only want to train a small acoustic model using Gigaspeech, but I encountered some problems when running the Gigaspeech recipe in Kaldi.
if [ $stage -le 2 ]; then
echo "======Tr…
-
##### Educational version
0.8.0
##### Orange version
3.37.0
##### Expected behavior
Doing text mining with text entered in Create Table should be possible by Editing the Domain of the…
-
Thank you very much for sharing your work. I encountered the following issue while building DGS3-T: “tensorflow_datasets.core.download.downloader.DownloadError: Failed to get url https://nlp.biu.ac.il…
-
I looked in my data folder and the file is not there. How do I get this file?
-
Hi,
Can you please tell me how I can download the data for the book corpus folder? The download.sh file is empty, and I don't know where to get that data from.
Thanks ;)
-
## Description
OSCAR Corpus: https://oscar-corpus.com/
```
@inproceedings{ortiz-suarez-etal-2020-monolingual,
title = "A Monolingual Approach to Contextualized Word Embeddings for Mid-Resour…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Is your feature request related to a problem? Please describe.
Currently in order to perform BM25 based text r…
-
Hi,
Pipe operator `mlr3pipelines::PipeOpTextVectorizer` is painfully slow in comparison with `quanteda::dfm()`:
````
library(mlr3)
library(mlr3learners)
library(mlr3pipelines)
library(quante…
-
It's not clear to me that the corpus data in `tests-tatcorpus/` should stay there. Things I worry about:
* **Licensing of the data**: where's it from / is the license compatible with the rest of the…