-
Preserving literary heritage is important for gaining valuable insights into human history
and for learning about the different aspects of our ancestors' lives. Documents,
whether written on le…
-
I don't know if this is already implemented or if there's a workaround for this. Sometimes, due to the large amount of data, what we have are training sentences that are already uniquely condensed (a…
-
Raised by @tuzhucheng as part of #311: How should we handle paragraph indexing in a more generic way? We **shouldn't** have separate Wikipedia and WikipediaParagraph collections. There should be a mor…
-
Display statistical information for virtual corpora (number of documents, texts, tokens, sentences)
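A minimal sketch of the kind of summary this feature request asks for, assuming a simple in-memory corpus layout (documents containing texts, texts containing sentence strings); all names here are illustrative, not taken from the project's API:

```python
# Hypothetical sketch: compute the statistics a virtual-corpus view might
# display. The corpus layout (dicts with a "texts" key) is an assumption.

def corpus_stats(documents):
    """documents: list of dicts, each with a 'texts' list;
    each text is a list of sentence strings."""
    n_docs = len(documents)
    n_texts = sum(len(d["texts"]) for d in documents)
    sentences = [s for d in documents for t in d["texts"] for s in t]
    n_sentences = len(sentences)
    # Whitespace tokenization as a stand-in for a real tokenizer.
    n_tokens = sum(len(s.split()) for s in sentences)
    return {"documents": n_docs, "texts": n_texts,
            "sentences": n_sentences, "tokens": n_tokens}

corpus = [{"texts": [["A short sentence.", "Another one."]]},
          {"texts": [["One more sentence here."]]}]
print(corpus_stats(corpus))
# → {'documents': 2, 'texts': 2, 'sentences': 3, 'tokens': 9}
```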
-
Hello,
Thank you for the wonderful repository.
I read that you're currently training on the LJSpeech dataset for English TTS.
Do you have any updates on audio samples?
Also would you be able t…
-
It fails to recognize the following files as 6502 code:
-
- `osi_bas.bin` from
-
As can be seen in the code sample below, we get different results if
* we pre-process a text with TextPreProcessor _**text_processor**_ and then create an Example with a torchtext.data.Field() wi…
-
Resuming training is unreliable right now. Adam optimizer statistics are not being saved, so they cannot be restored when training resumes. They should be saved in a special *.npz file next to the model…
-
I ingested a corpus of about 2100 short documents (UTF-8, no XML markup) and the progress bar showed successful completion of all the processing steps. (I used a stop word list of my own; I chose 100 …
-
`link-parser` returns different parses when parsing the same corpus file multiple times. I carried out two tests with 29 and 30 runs at different points in time. The first test gave me two different v…