liyongsea/parallel_corpus_mnbvc
parallel corpus dataset from the mnbvc project
Apache License 2.0
Rule-based detector and script for gpt segmentation
#24
Closed
voidf closed 1 year ago
voidf commented 1 year ago
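Added a more fault-tolerant compare_breaks_v2 to make manual annotation easier (and to allow quickly producing hand-labeled data by directly editing the GPT annotations).
Added a rule-based segmenter, placed in a separate file because its implementation is too long.
Added a batch of code for GPT requests and the related post-processing.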