-
NLTK provides several pre-trained models, but it is unclear:
- what the models are trained on
- how the models are trained
These pre-trained models include:
- `sent_tokeniz…
-
Excuse me, how does the entity labeling work?
-
Currently, there is no way in the UD English treebanks to differentiate between adjectives that refer to common nouns and those that refer to proper nouns -- both are annotated as `ADJ+JJ`.
This ma…
-
Post questions here for this week's fundamental readings:
Grimmer, Justin, Molly Roberts, Brandon Stewart. 2022. Text as Data. Princeton University Press: Chapters 5,7,9,11,16 —“Bag of Words”, “Th…
-
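Environment
Elastic version: `7.9.3`
Kibana version: `7.9.3`
When the error occurs
When using the full data-for-1.7.5 data package provided by hanlp, an exception is thrown when the tokenizer is called to create an index. With this plugin's default data package, no error occurs.
Exception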
```php
Uncaught Elasticsearch\Common\Exceptions\ServerErrorResponseEx…
-
Needs indexing; consider:
http://java.dzone.com/news/lucenes-fuzzyquery-100-times
-
@aaroncp1an0 I've started some code for XML parsing, check the xml_parser directory. I'll aim to finish this up later today, and we should have a TSV file for sequence feature attributes soon.
-
Words in which "ё" is written as "е" are not corrected by the spellchecker (the USSR typographical simplification allows this spelling); however, such words are then not recognized by part-of-speech analysis.
ежик - -
ёжик …
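
One possible workaround, shown only as a minimal sketch: before part-of-speech lookup, try the spellings obtained by restoring "ё" in place of "е" and use the first one the dictionary knows. The names `yo_variants`, `lookup_with_yo_fallback`, and the toy `lexicon` are hypothetical and not part of any existing analyzer; a real analyzer would substitute its own dictionary check.

```python
from itertools import product


def yo_variants(word):
    """Yield the word itself, then every spelling with some 'е' replaced by 'ё'.

    Only lowercase Cyrillic 'е' is handled; the number of variants grows
    exponentially with the count of 'е', so this suits short words only.
    """
    positions = [i for i, ch in enumerate(word) if ch == "е"]
    for mask in product((False, True), repeat=len(positions)):
        chars = list(word)
        for pos, use_yo in zip(positions, mask):
            if use_yo:
                chars[pos] = "ё"
        yield "".join(chars)


def lookup_with_yo_fallback(word, lexicon):
    """Return the first variant present in the lexicon, or None if none match."""
    for candidate in yo_variants(word):
        if candidate in lexicon:
            return candidate
    return None


# Hypothetical toy lexicon standing in for the analyzer's dictionary.
lexicon = {"ёжик": "NOUN"}
print(lookup_with_yo_fallback("ежик", lexicon))
```

Because the unmodified word is tried first, correctly spelled input is unaffected; the fallback only matters when the original spelling is unknown to the dictionary.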
-
#### (1)
@phoenix-mossimo and I noticed that, as it is now, all ANNIS links look for "lemma" (based on the pure form/string) in ANNIS:
TLA "form" => ANNIS "lemma",
Wouldn't it make more sense to **l…
-
The definition of [AUX](https://universaldependencies.org/u/pos/AUX_.html) in UD is "An auxiliary is a function **word** that accompanies the lexical verb of a verb phrase and expresses grammatical di…