Open coco-lab-2022 opened 2 years ago
Segment, Mask, and Predict: Augmenting Chinese Word Segmentation with Self-Supervision. EMNLP(2021)
method: Propose a self-supervised method for CWS
datasets: MSRA, PKU, AS, CITYU, CTB, SXU, CNC, UDC, ZX
code: https://github.com/miradel51/Self_Supervised_CWS
paper: https://aclanthology.org/2021.emnlp-main.158.pdf
A Graph-based Model for Joint Chinese Word Segmentation and Dependency Parsing. TACL(2020)
method: Propose a graph-based model for joint Chinese word segmentation and dependency parsing
datasets: CTB5, CTB7, CTB9
code: https://github.com/fastnlp/JointCwsParser
paper: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00301/43541/A-Graph-based-Model-for-Joint-Chinese-Word
Joint Chinese Word Segmentation and Part-of-speech Tagging via Two-way Attentions of Auto-analyzed Knowledge. ACL(2020)
method: Propose a neural approach with a two-way attention mechanism that incorporates auto-analyzed knowledge for joint CWS and POS tagging, following a character-based sequence labeling paradigm.
datasets: CTB5, CTB6, CTB7, CTB9, UD
code: https://github.com/SVAIGBA/TwASP
paper: https://aclanthology.org/2020.acl-main.735.pdf
Improving Chinese Word Segmentation with Wordhood Memory Networks. ACL(2020)
method: Propose WMSEG, a neural framework for CWS using wordhood memory networks.
datasets: MSR, PKU, AS, CITYU, CTB6
code: https://github.com/SVAIGBA/WMSeg
paper: 2020.acl-main.734v2.pdf (aclanthology.org)
BERT+LTL
A Joint Multiple Criteria Model in Transfer Learning for Cross-domain Chinese Word Segmentation. EMNLP(2020) Kaiyu Huang, Degen Huang, Zhuang Liu, and Fengran Mo
datasets: MSRA, PKU, CTB, SXU, CNC, UDC, ZX
code: https://github.com/koukaiu/dlut-nihao
paper: https://aclanthology.org/2020.emnlp-main.318.pdf
datasets: PKU, AS, CITYU, MSR
code: https://github.com/akibcmi/SAMS
paper: https://arxiv.org/pdf/1910.14537.pdf
code: https://github.com/google-research/bert
paper: https://arxiv.org/pdf/1810.04805.pdf
datasets: PKU, AS, CITYU, MSR, CTB6, CTB7, UD
code: NONE
paper: https://arxiv.org/pdf/1808.06511.pdf
datasets: PKU, AS, CITYU, MSR
code: https://github.com/jcyk/greedyCWS
paper: https://arxiv.org/pdf/1704.07047.pdf
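The TwASP entry above describes CWS as character-based sequence labeling. As a rough illustration of that paradigm, here is a minimal sketch that converts a segmented sentence to per-character tags and back. It assumes the common BMES tag scheme (Begin, Middle, End, Single); the list does not say which tag set each paper uses, and the function names `words_to_bmes` and `bmes_to_words` are hypothetical, not taken from any of the linked repositories.

```python
def words_to_bmes(words):
    """Map a list of segmented words to one BMES tag per character."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            # first char is B, last is E, anything in between is M
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def bmes_to_words(chars, tags):
    """Rebuild words from characters and their BMES tags."""
    words, buffer = [], ""
    for ch, tag in zip(chars, tags):
        buffer += ch
        if tag in ("E", "S"):  # a word ends at E or S
            words.append(buffer)
            buffer = ""
    if buffer:  # tolerate a tag sequence that ends mid-word
        words.append(buffer)
    return words


if __name__ == "__main__":
    segmented = ["我", "喜欢", "自然语言"]
    tags = words_to_bmes(segmented)
    print(tags)
    print(bmes_to_words("".join(segmented), tags))
```

Under this framing a model only has to predict one tag per character, and word-level scores can then be computed on the words recovered by `bmes_to_words`.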
For each task, can you also upload a table comparing the results?
By the way, can you mention whether each of these methods has code available? @gezi-creator
Please create a table covering the recent algorithms and their performance on the major datasets.