corpus-data Search Results

1000+ results for corpus-data


langgenius/dify #8099

Request the author's support to view the contents of the rec…

### Self Checks - [X] I have searched for existing issues [search for existing issues](https://github.com/langgenius/dify/issues), including closed ones. - [X] I confirm that I am using English to su…

liulian3564 updated 4 days ago
4
linguishi/chinese_sentiment #30

AttributeError: 'str' object has no attribute 'decode'

Traceback (most recent call last): File "···/chinese_sentiment-master/data/hotel_comment/raw_data/fix_coupus.py", line 30, in fix_corpus(POS, FIX_POS) File "···/chinese_sentiment-master/da…

KyleLeith-007 updated 1 month ago
4
mreiferson/go-snappystream #17

Potential benchmark data source: Silesia corpus

I was reading about [lz4](https://code.google.com/p/lz4/) and noticed they use a [specific dataset](http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia) for the published benchmarks. It looks like …

bmatsuo updated 10 years ago
2
castorini/ura-projects #48

TREC 2024 - Workflow Record

This issue isn't a problem to be fixed; rather, it is a record to keep track of undergraduate students' work on TREC 2024. In a series of comments, contributors can write about their work (at a high le…

Yuv-sue1005 updated 2 weeks ago
3
salgo60/Wikidata_riksdagen-corpus #62

SPARQL endpoint for the Riksdag's open data

A SPARQL endpoint for the Riksdag's data is desirable. SPARQL integrates easily with Wikipedia, in that lists etc. can be created and kept updated; see the older [test with Nobelprize.org](https://www.wikidata.org/wiki/…

salgo60 updated 3 days ago
3
biolab/orange3-text #1078

Preprocess text, Corpus or new separate widget: provide tool…

**Is your feature request related to a problem? Please describe.** When I'm working with a corpus that is a mixture of documents in American English and British English spelling, the two versions of …

wvdvegte updated 3 weeks ago
1
beir-cellar/beir #135

Allow option to disable progress bar when using `GenericData…

Right now, when using the data loader: ```python corpus, queries, qrels = GenericDataLoader(data_dir).load(split=split) ``` Tqdm will always show up. There should be a way to disable it, e.g.: …

xhluca updated 3 months ago
2
taers232c/GAMADV-XTD3 #433

Vault count errors not reported correctly

When getting vault counts for a large number of users with large mailboxes using gam 6.80.11, the errors come back as separate JSON rows, and the rows with the email show a count of 0 (which is not…

archierosenblum updated 2 weeks ago
7
nltk/nltk #3317

KneserNeyInterpolated taking an unreasonable amount of time …

`KneserNeyInterpolated.generate()` takes too long to run. Consider the following example: ```python from nltk.corpus import brown from nltk.lm.preprocessing import padded_everygram_pipeline f…

owo updated 1 week ago
3
lmullen/legal-modernism #94

Create data for missing textbooks

lmullen updated 1 week ago
3
