fastnlp / fastNLP

fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.
https://gitee.com/fastnlp/fastNLP
Apache License 2.0
3.07k stars 448 forks

from_dataset ignore null instance #454

Open MorningForest opened 1 year ago

MorningForest commented 1 year ago

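Description: `Vocabulary.from_dataset` now supports datasets in which an instance contains an empty string. Such instances are skipped and a warning is printed. Corresponding test cases are added.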
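The skip-and-warn behavior described above can be sketched roughly as follows. This is a standalone illustration, not the code in this PR: the function name `count_words_skipping_empty`, the dict-based instances, and the warning text are all assumptions made for the example, and it does not call fastNLP.

```python
import warnings
from collections import Counter


def count_words_skipping_empty(instances, field_name):
    """Hypothetical helper: count words in `field_name` across instances,
    skipping (and warning about) any instance whose field is empty."""
    counter = Counter()
    skipped = []  # indices of instances that were skipped
    for idx, instance in enumerate(instances):
        field = instance.get(field_name)
        # Treat None, "" and [] as an empty field.
        if field is None or len(field) == 0:
            warnings.warn(
                f"Instance {idx} has an empty '{field_name}' field; skipping it."
            )
            skipped.append(idx)
            continue
        if isinstance(field, str):
            # A plain string field counts as a single word.
            counter[field] += 1
        else:
            # A sequence field contributes one count per token.
            counter.update(field)
    return counter, skipped


# Usage: the second instance is empty, so it is skipped with a warning.
instances = [{"words": ["a", "b"]}, {"words": ""}, {"words": ["a"]}]
counter, skipped = count_words_skipping_empty(instances, "words")
```

Returning the skipped indices alongside the counts is a choice made here so that a test can assert on which instances were ignored; the actual change may only emit the warning.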

Main reason: (the reason for making this change)

Checklist: check that each of the items below has been completed

Please feel free to remove inapplicable items for your PR.

Changes: (describe each change item by item)

Mention: ask someone to review your PR

@people who have modified this file @core developers