-
…by Stanford, https://nlp.stanford.edu/projects/histwords/
*We released pre-trained historical word embeddings (spanning all decades from 1800 to 2000) for multiple languages (English, French, Germ…
-
Chinese text requires special tokenization: it cannot simply be split on whitespace or into individual characters. It would be nice to add a separate module for segmenting Chinese text.
Option 1: …
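As a point of reference for what such a module would do, here is a minimal sketch of forward maximum matching (FMM), a classic dictionary-based baseline for Chinese word segmentation. The toy dictionary below is purely illustrative; a real module would use a full lexicon or a statistical segmenter.

```python
# Forward maximum matching (FMM): greedily take the longest
# dictionary word starting at the current position.
# The dictionary entries here are assumptions for illustration.
DICT = {"自然", "语言", "自然语言", "处理", "自然语言处理", "很", "有趣"}
MAX_LEN = max(len(w) for w in DICT)

def fmm_segment(text: str) -> list[str]:
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest dictionary match starting at position i.
        for j in range(min(len(text), i + MAX_LEN), i, -1):
            if text[i:j] in DICT:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Fall back to a single character if nothing matches.
            tokens.append(text[i])
            i += 1
    return tokens

print(fmm_segment("自然语言处理很有趣"))
# → ['自然语言处理', '很', '有趣']
```

In practice an off-the-shelf segmenter (e.g. jieba) would replace this toy dictionary lookup, but the greedy longest-match idea is the same.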
-
Error message about missing files:
Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory ./bert-base-chinese.
Do I need to download these files as well?
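This error means none of the weight files that `transformers` looks for exist in the local directory; you need at least one of them (for PyTorch, `pytorch_model.bin`). A small hypothetical helper to check which weight files are present before calling `from_pretrained` (the function name `find_weights` is my own, not part of any library):

```python
import os

# The weight filenames named in the error message from transformers.
WEIGHT_FILES = (
    "pytorch_model.bin",      # PyTorch weights
    "tf_model.h5",            # TensorFlow weights
    "model.ckpt.index",       # TF checkpoint index
    "flax_model.msgpack",     # Flax weights
)

def find_weights(model_dir: str) -> list[str]:
    """Return the weight files actually present in model_dir."""
    return [f for f in WEIGHT_FILES
            if os.path.isfile(os.path.join(model_dir, f))]

print(find_weights("./bert-base-chinese"))
```

If the list is empty, download the weight file for your framework (e.g. `pytorch_model.bin`) from the model's "Files and versions" page on the Hugging Face Hub into that directory, alongside the config and vocab files.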
-
Hi, my colleagues and I have released [UD-Kanbun](https://github.com/KoichiYasuoka/UD-Kanbun), a Python-based tokenizer, POS-tagger, and dependency parser for classical Chinese texts. And now we are in…
-
Where can I obtain the weight files?
-
Hi,
I uploaded this file and successfully generated the TF-IDF file, and now I want to use formatA with my QA data format:
python generate.py /path/to/dataset/dir dataset /path/to/ou…
-
```python
import json
import argparse
from fengshen import UbertPipelines
import torch
total_parser = argparse.ArgumentParser("TASK NAME")
total_parser = UbertPipelines.pipelines_args(total_…
-
I am experimenting with this code on East Asian languages such as Chinese and Japanese, which do not have explicit word boundaries like the whitespace in Latin-script languages. Looking into the code I found that b…
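For context on the boundary issue above: BERT's `BasicTokenizer` handles CJK text by inserting spaces around every CJK character before whitespace splitting. A small self-contained sketch that mimics that behaviour with a regex (the `tokenize` helper is my own illustration, not the library's code):

```python
import re

# CJK Unified Ideographs block (a subset of the ranges BERT checks).
CJK = r"\u4e00-\u9fff"

def tokenize(text: str) -> list[str]:
    # Match either a single CJK character, or a run of other
    # non-space characters -- so CJK is split per character while
    # space-delimited words stay whole.
    return re.findall(rf"[{CJK}]|[^\s{CJK}]+", text)

print("hello world".split())      # ['hello', 'world']
print("你好世界".split())           # ['你好世界']  -- one blob, no boundaries
print(tokenize("你好世界 hello"))   # ['你', '好', '世', '界', 'hello']
```

This is why plain whitespace splitting silently degrades on Chinese/Japanese input: the whole sentence arrives as a single token unless characters are split out first.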
-
Hi Wenqiang,
Your project is perfect for me. You provide a GIF demo of how to analyze comments on a Weibo post about Eddie Van Halen's death, which makes the process much clearer to me. Second, in yo…
-
I've almost finished building the [UD_Classical_Chinese-Kyoto](https://github.com/UniversalDependencies/UD_Classical_Chinese-Kyoto/tree/dev) treebank, and now I'm trying to make a Classical Chinese mod…