-
Numbers and foreign-language tokens are recognized character by character.
```python
from pecab import PeCab
pecab = PeCab()
pecab.pos('2023년, 드디어 python으로 한글을 분석합니다.')
```
```
[('2', 'SN'),
('0', 'SN'),
('2', 'SN'),
 ('3', 'S…
```
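Since each digit comes back as its own `SN` token, runs of consecutive `SN` tags can be merged back into one number after tagging. A minimal sketch, assuming the `(surface, tag)` tuple format shown above; the function name and approach are illustrative, not part of PeCab:

```python
def merge_number_tokens(pairs):
    """Merge runs of consecutive 'SN' (number) tokens into one token.

    `pairs` is a list of (surface, tag) tuples like pecab.pos() returns.
    """
    merged = []
    for surface, tag in pairs:
        if tag == "SN" and merged and merged[-1][1] == "SN":
            # Extend the previous number token instead of starting a new one.
            merged[-1] = (merged[-1][0] + surface, "SN")
        else:
            merged.append((surface, tag))
    return merged

result = merge_number_tokens(
    [("2", "SN"), ("0", "SN"), ("2", "SN"), ("3", "SN"), ("년", "NNB")]
)
print(result)  # → [('2023', 'SN'), ('년', 'NNB')]
```

The same post-processing idea works for any tag whose pieces should be re-joined, e.g. runs of `SL` for foreign words.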
-
# Tokenizing Natural Language into Semantic Units in iOS • Andy Ibanez
[https://www.andyibanez.com/posts/tokenizing-nltokenizer/](https://www.andyibanez.com/posts/tokenizing-nltokenizer/)
-
Data is tokenized twice:
1. With Stanford CoreNLP : https://github.com/nlpyang/PreSumm/blob/ba17e95de8cde9d5ddaeeba01df7cace584511b2/src/prepro/data_builder.py#L110
2. With HuggingFace's Bert…
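The two-pass pipeline can be sketched without either dependency: a first pass splits the text into words (standing in for Stanford CoreNLP), and a second pass splits each word into subwords in the WordPiece style that BERT tokenizers use. The vocabulary below is a toy example, not BERT's:

```python
import re

def word_tokenize(text):
    # Stage 1: crude whitespace/punctuation split
    # (standing in for Stanford CoreNLP's tokenizer).
    return re.findall(r"\w+|[^\w\s]", text)

def wordpiece(word, vocab):
    # Stage 2: greedy longest-match subword split, WordPiece-style
    # (standing in for the HuggingFace BERT tokenizer).
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        while end > start:
            piece = word[start:end] if start == 0 else "##" + word[start:end]
            if piece in vocab:
                pieces.append(piece)
                break
            end -= 1
        else:
            return ["[UNK]"]  # no subword matched this position
        start = end
    return pieces

vocab = {"token", "##ized", "data", "is", "twice"}
tokens = [p for w in word_tokenize("Data is tokenized twice")
          for p in wordpiece(w.lower(), vocab)]
print(tokens)  # → ['data', 'is', 'token', '##ized', 'twice']
```

Running both stages in sequence, as PreSumm does, means the subword tokenizer only ever sees single words, which keeps the two vocabularies decoupled.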
-
Right now the tokenizer loads the whole corpus into memory, which becomes an issue for large files.
Is it possible to read the corpus file line by line, or split it in some other way (while still training on the whole corpus)?
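In Python, line-by-line streaming falls out of the file object itself, since files are lazy line iterators. A minimal sketch of the requested behaviour, not the tokenizer's actual API:

```python
import os, tempfile

def iter_corpus(path, encoding="utf-8"):
    """Yield the corpus one non-empty line at a time instead of loading
    the whole file; peak memory stays at roughly one line regardless of
    file size."""
    with open(path, encoding=encoding) as f:
        for line in f:
            line = line.rstrip("\n")
            if line:
                yield line

# Toy corpus standing in for a large file.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write("first line\n\nsecond line\n")
lines = list(iter_corpus(path))
os.remove(path)
print(lines)  # → ['first line', 'second line']
```

A trainer that accepts any iterable of lines (rather than a materialized list) would then work unchanged on both small in-memory corpora and streamed files.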
quetz updated 3 years ago
-
Possible easy solution for #2935 and #2945
The reason we forked `html5lib` to make `html5lib-modern` was that there is no new replacement for `html5lib` that provides the same XML-based HTML-tok…
-
Add further text layout options:
- Centre- and right-align text
- Improve wrapping and truncation
- Add word splitting or tokenizing to improve wrapping
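The options above can be sketched with the standard library: `textwrap` handles greedy wrapping with long-word splitting, and `str.center`/`str.rjust` handle alignment. The function name and `align` values are illustrative, not the project's API:

```python
import textwrap

def layout(text, width, align="left"):
    """Wrap `text` to `width`, splitting words longer than a line, then
    centre- or right-align each resulting line."""
    lines = textwrap.wrap(text, width=width, break_long_words=True)
    if align == "centre":
        return [line.center(width) for line in lines]
    if align == "right":
        return [line.rjust(width) for line in lines]
    return lines

print(layout("tokenizing improves wrapping", 12))
# → ['tokenizing', 'improves', 'wrapping']
```

Tokenizing first (as the last bullet suggests) lets the wrapper break at word or subword boundaries instead of mid-character-run, which is what `break_long_words` approximates here.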
-
I think it would be extremely helpful if we had a way to do blacklining on case classes or collections, to see which of the parameters differ. For example:
`case class Hello(arg: String)`
…
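The idea translates directly to a field-by-field comparison. A Python sketch of the requested blacklining using `dataclasses` (the function name and return shape are illustrative, not the library's API):

```python
from dataclasses import dataclass, fields

def blackline(a, b):
    """Compare two instances of the same dataclass field by field and
    return {field: (left, right)} for only the fields that differ."""
    return {
        f.name: (getattr(a, f.name), getattr(b, f.name))
        for f in fields(a)
        if getattr(a, f.name) != getattr(b, f.name)
    }

@dataclass
class Hello:
    arg: str
    count: int = 0

print(blackline(Hello("a", 1), Hello("b", 1)))  # → {'arg': ('a', 'b')}
```

Reporting only the differing fields is what makes the output readable for wide case classes, where a plain equality failure dumps both values in full.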
-
I love TextBlob, thank you so much for making this awesome Python tool :+1:
I am wondering if there is a solution to a tokenization issue I'm seeing. Here's some example code with an excerpt from G…
-
Identify relevant sources for the dataset (e.g., open-source bioinformatics projects, research papers).
Preprocess the data by tokenizing, removing unnecessary characters, and formatting for LLM inp…
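A toy version of that preprocessing step: strip characters outside a small whitelist, collapse whitespace, lowercase, and whitespace-tokenize. The exact cleanup rules here are illustrative assumptions, not the project's specification:

```python
import re

def preprocess(text):
    """Clean raw text and split it into tokens for LLM input.

    Keeps letters, digits, and basic punctuation; everything else is
    replaced by a space before whitespace tokenization.
    """
    text = re.sub(r"[^A-Za-z0-9\s.,;:()\-]", " ", text)  # drop unwanted chars
    text = re.sub(r"\s+", " ", text).strip().lower()      # normalize spacing
    return text.split(" ")

print(preprocess("BLAST+ hits @ 95%!"))  # → ['blast', 'hits', '95']
```

For real bioinformatics sources the whitelist would need tuning (e.g. keeping `+`, `%`, or sequence characters), so the rules should be driven by what the downstream tokenizer expects.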
-
## Describe the feature
A new control for adding tokens / tags (see screenshot). It could be helpful in several scenarios, mainly tagging.
The items should be bound to a collection.
@punker76 w…