corpus-analysis Search Results

1000+ results for corpus-analysis

CIRCSE/LEMLAT3 #6

Common words and forms missing from LEMLAT

We have tested LEMLAT on a corpus of classical Latin texts from a university reading list. The corpus contains some 23,700 words and 8,538 different word forms: Terence's Adelphoe, Horace's Odes Bk. 1…

nevenjovanovic updated 4 years ago
2
google/bloaty #378

bloaty shows 'region out-of-bounds' error on ELF files with …

`bloaty` errors processing an ELF file where a 0-sized segment is not backed by the ELF file's contents: ``` $ bloaty bloaty: region out-of-bounds ``` This happens when `bloaty` is iterati…

kjteske updated 2 days ago
9
UChicago-Computational-Content-Analysis/Readings-Responses-2024-Winter #42

3. Clustering & Topic Modeling to Discover Higher-Order Patt…

Post your response to our challenge questions. First, write down three intuitions you have about broad content patterns you will discover in your data. Place an asterisk next to the one you expect m…

lkcao updated 6 months ago
30
UChicago-Computational-Content-Analysis/Readings-Responses-2023 #41

3. Discovering Higher-Level Patterns - fundamental

Post questions here for this week's fundamental readings: Grimmer, Justin, Molly Roberts, Brandon Stewart. 2022. Text as Data. Princeton University Press: Chapters 10, 12, 6, 13 —“Principles of Discov…

JunsolKim updated 2 years ago
23
UChicago-Thinking-Deep-Learning-Course/Readings-Responses #15

Week 9 - Possibility Readings

Post a reading of your own that uses deep learning for social science analysis and understanding, with a focus on Solving Problems & Creating Digital Doubles - in this case, we want you to look for ex…

bhargavvader updated 3 years ago
10
singnet/language-learning #183

Add costs to learned LG rules (for ILE)

We need to add statistics-based costs to rules so that parsing can take these costs into account. First, this has to be done for the GL ILE algorithm; if it helps, it may then be advanced to other …

akolonin updated 4 years ago
7
esmero/archipelago-deployment #62

Add needed infrastructure for NLP/ENTITY extraction in our d…

# What flavor of ice cream is AI? For Natural Language Processing and AI analysis of an extracted corpus of text from files, metadata description fields, or similar textual bodies, I started building a …

DiegoPino updated 4 years ago
1
facebookresearch/flores #26

FLORES-101 benchmark and Alternative Spelling rules in some …

Thanks for open-sourcing the FLORES-101 data set. While working with it, I noticed a certain feature that I wanted to share here. Some languages have alternative spelling rules, so some words …

Fikavec updated 3 years ago
2
OpenNMT/Tokenizer #176

space/none mode potential issue with case_markup

When using `case_markup` in `space`/`none` mode, unexpected behavior happens: ```python >>> pyonmttok.Tokenizer("none", case_markup=True).tokenize("你好世界，这是一个Test。") ... (['｟mrk_case_modifier_C｠', …

Zenglinxiao updated 3 years ago
42
wkgcass/public-chat #10

Will LLM do word segmentation for Chinese?

/chat: Will LLM do word segmentation for Chinese? Or do they simply read each Chinese character and run the process?

wkgcass updated 1 year ago
11
