-
Just putting this here so it doesn't get forgotten. We should look into 'stemming', which removes prefixes and suffixes from words, e.g. 'ing', 'ed', etc.
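As a starting point, here is a minimal sketch of suffix stripping in plain Python. The `SUFFIXES` list and the `naive_stem` helper are illustrative assumptions, not taken from any library, and the sketch handles suffixes only. An established stemmer such as the Porter algorithm applies many more rules and conditions.

```python
# Hypothetical suffix list for illustration; real stemmers use far richer rules.
SUFFIXES = ("ing", "ed", "es", "s")


def naive_stem(word: str, min_stem_len: int = 3) -> str:
    """Strip the first matching suffix, keeping at least `min_stem_len` characters."""
    lowered = word.lower()
    for suffix in SUFFIXES:
        if lowered.endswith(suffix) and len(lowered) - len(suffix) >= min_stem_len:
            return lowered[: -len(suffix)]
    # No suffix matched (or the remaining stem would be too short): return unchanged.
    return lowered


print([naive_stem(w) for w in ["walking", "walked", "walks", "sing"]])
```

The `min_stem_len` guard is there so that short words like "sing" are not reduced to a single letter; whether that threshold is appropriate would need checking against real data.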
-
I'm trying to use GreTa following the commands here https://huggingface.co/bowphs/GreTa , but it does not work. `AutoModelForConditionalGeneration` seems to have been substituted with `T5ForConditiona…
-
### context
I'm looking to get the original token positions of keyterms when performing keyterm extraction with e.g. TextRank, but this can apply to the other extractors. Example:
```python
>>> d…
-
Create a filter/function to group identical analyses into a single entry. For example, analyses `18` and `19` of `forma` (Du Cange) are identical:
```
============================ANALYSIS 18======…
-
I have been working with natural language processing and often needed to know which words were used in certain corpora. Many dictionaries are comprised of word stems, requiring the extraction of stems…
-
Hmm, I don't want to write; I write enough at work. So I'm gonna pull together some unified search table and attach a labeling system. Labels will be able to filter search results--this is a data-agnosti…
-
Currently, TC's text capabilities are limited to using logistic regression on top of BOW-encoded text.
While this is suitable for some cases, many use cases require more sophisticated/modern NLP me…
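For reference, the bag-of-words (BOW) encoding mentioned above can be sketched in plain Python as follows. The helper names `build_vocabulary` and `bow_encode` are hypothetical, and this is not TC's actual implementation; it only illustrates that BOW discards word order.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index (sorted for determinism)."""
    tokens = sorted({tok for doc in documents for tok in doc.lower().split()})
    return {tok: idx for idx, tok in enumerate(tokens)}


def bow_encode(document, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(document.lower().split())
    # Dict iteration follows insertion order, so columns match the vocabulary indices.
    return [counts.get(tok, 0) for tok in vocabulary]


docs = ["the dog bit the man", "the man bit the dog"]
vocab = build_vocabulary(docs)
vectors = [bow_encode(d, vocab) for d in docs]
# The two sentences mean different things but receive the same vector.
print(vectors[0] == vectors[1])
```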
-
1. Lemmatize the ukwac+wacky corpus using the Jobimify tool:
```
frink:/home/panchenko/jobimify
```
- use the concatenation of these corpora http://cental.fltr.ucl.ac.be/team/~panchenko/d…
-
I would like to make a service for integrating Apertium morphological analyzers, which provide tokenization, lemmatization, XPOS, UPOS, and FEATS in a single operation. While I certainly can split thi…
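To make the "single operation" concrete, here is a rough sketch of parsing Apertium's stream format, in which each lexical unit looks like `^surface/lemma<tag1><tag2>$`. The `parse_stream` helper and the `Analysis` type are hypothetical names, escaped characters are not handled, and mapping Apertium tags to UPOS and FEATS is not attempted here.

```python
import re
from typing import List, NamedTuple, Tuple


class Analysis(NamedTuple):
    lemma: str
    tags: Tuple[str, ...]


# One lexical unit: ^surface/analysis1/analysis2$ (escapes are ignored in this sketch).
UNIT_RE = re.compile(r"\^([^/$]*)((?:/[^/$]*)*)\$")
TAG_RE = re.compile(r"<([^>]+)>")


def parse_stream(text: str) -> List[Tuple[str, List[Analysis]]]:
    """Return (surface form, analyses) pairs for each lexical unit in `text`."""
    units = []
    for match in UNIT_RE.finditer(text):
        surface, raw = match.group(1), match.group(2)
        analyses = []
        for reading in raw.split("/")[1:]:  # raw starts with "/", so skip the empty head
            lemma = reading.split("<", 1)[0]
            analyses.append(Analysis(lemma, tuple(TAG_RE.findall(reading))))
        units.append((surface, analyses))
    return units


print(parse_stream("^dogs/dog<n><pl>$ ^bark/bark<n><sg>/bark<vblex><pres>$"))
```

Each surface token keeps all of its candidate analyses, so a downstream step would still need to pick one reading per token.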
-
@kriteshsharma
Do some research and post the report here.