liyongsea / parallel_corpus_mnbvc

parallel corpus dataset from the mnbvc project
Apache License 2.0
7 stars, 5 forks

[UN corpus] Download United Nations documents (doc) #48

Closed Wzixiao closed 7 months ago

Wzixiao commented 12 months ago

https://documents.un.org/

liyongsea commented 11 months ago

Finished crawling the document URLs: https://huggingface.co/datasets/dabaisuv/UN_Documents_2000_2023

liyongsea commented 11 months ago

@Wzixiao please take a look at the code.

liyongsea commented 10 months ago

https://gist.github.com/Wzixiao/53f6b3948dba4fcf657008058041cf97 -> to be tested by 青禾 and 夜夜

liyongsea commented 10 months ago

I have uploaded the compressed archive of the files to Baidu Cloud.

Link: https://pan.baidu.com/s/1fgQ05rP7Bn_la6Fvz1ipEg?pwd=al2r Extraction code: al2r