lemmatization Search Results

1000+ results for lemmatization


MIND-Lab/OCTIS #27

Improve Preprocessing Speed

Preprocessing currently takes a long time for large datasets. One way to improve the speed is to use [spaCy pipes](https://spacy.io/usage/processing-pipelines), particularly for lemmatization. Preproc…

aneesha updated 1 year ago, 4 comments

JasonNunez/Jaiac #1

Data Processing

Task: Perform initial data cleaning, including handling missing values (if any), normalizing text (lowercasing, removing punctuation, etc.), and preliminary data exploration (e.g., distribution of cla…

JasonNunez updated 7 months ago, 1 comment

sebastian-nehrdich/byt5-sanskrit-analyzers #1

Max length for segmentation-lemma-tagging?

I've tried `segmentation-lemma-tagging/run_inf.py` with various modes on the following sentence: > āsīdaśeṣanarapatiśiraḥsamarcitaśāsanaḥ pākaśāsana ivāparacaturudadhimālāmekhalāyā bhuvo bhartā pra…

aso2101 updated 3 weeks ago, 3 comments

digirati-co-uk/dlcs-search-service #13

Non functional requirement: support via unicode for non-Roma…

The service must support non-Western languages and scripts, including, but not limited to:
* [ ] Arabic
* [ ] Chinese
* [ ] Cyrillic

mattmcgrattan updated 5 years ago, 1 comment

mattfredericksen/CSCE-4205-ML-Project #4

Create a completely preprocessed document file

Now that our preprocessing (lemmatization, punctuation removal, etc.) is complete, we need to preprocess all of our input data. The code for this is simply `data['reviewText'] = data['reviewText']…

mattfredericksen updated 3 years ago, 1 comment

allenai/taggers #26

LemmatizedKeywordTagger does not work for all words

When the sentence tokens are lemmatized, they are lemmatized with the postag and the token. However, when the list of keywords is lemmatized, a postag is not available. Sometimes the two do not matc…

schmmd updated 10 years ago, 1 comment

mcthulhu/jorkens #3

Few questions about usage

Hi, nice project. I had started on something similar when I came across your project, and now I am unsure whether to continue on my own or join forces with you. Can you please explain how I am supposed to tr…

anatoly314 updated 4 years ago, 3 comments

LuteOrg/lute-v3 #433

Add ASBplayer support

I've posted this feature request to ASBplayer's GitHub as well. [Link to open issue](https://github.com/killergerbah/asbplayer/issues/428) **Is your feature request related to a problem? Please des…

Mycheze updated 5 months ago, 1 comment

adbar/simplemma #3

Additional inflection data for RU & UK

Hi, I'm the author of [SSM](https://github.com/FreeLanguageTools/ssmtool) which is a language learning utility for quickly making vocabulary flashcards. Thanks for this project! Without this it would…

1over137 updated 1 year ago, 25 comments

LangStream/langstream #514

support selection of similarity metrics for diversity and re…

**Background**
- The literature seems unclear on which similarity metrics perform best for diversity and relevancy. (If anyone has found a good analysis of this, it would be great to see.)
- bm25 wor…

acantarero updated 1 year ago, 3 comments
