gotutiyan / gector

Unofficial PyTorch implementation of "GECToR -- Grammatical Error Correction: Tag, Not Rewrite"
MIT License
13 stars 0 forks

Memory issue #1

Closed ZTurboX closed 9 months ago

ZTurboX commented 11 months ago

When I train with the cofe-ai/fast-gector project, CPU memory keeps growing until it runs out. Have you run into this on your side?

gotutiyan commented 11 months ago

In my experience with our implementation, I have not seen this problem.

ZTurboX commented 11 months ago

Does it support Chinese?

gotutiyan commented 11 months ago

No. With an appropriate BERT model, the model definition and training scripts can be used for any language.

However, since different languages use different tags, new pre-processing and post-processing scripts are required. Specifically, for pre-processing, a Chinese version of grammarly/gector/utils/preprocess_data.py is needed. For post-processing, please rewrite the edit_src_by_tags() function in gotutiyan/gector/predict.py appropriately.
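To give a rough idea of what the post-processing step does, here is a minimal sketch of applying one edit tag per source token. It is not the actual edit_src_by_tags() from predict.py: the function name `apply_tags` is hypothetical, it handles only the basic `$KEEP`, `$DELETE`, `$APPEND_x` and `$REPLACE_x` tags described in the GECToR paper, and it assumes a leading `$START` token and character-level tokens for Chinese.

```python
from typing import List


def apply_tags(tokens: List[str], tags: List[str]) -> List[str]:
    """Apply one GECToR-style edit tag to each source token (hypothetical helper)."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    out: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "$KEEP":
            out.append(token)
        elif tag == "$DELETE":
            continue  # drop the token
        elif tag.startswith("$APPEND_"):
            # keep the token, then insert the new token right after it
            out.append(token)
            out.append(tag[len("$APPEND_"):])
        elif tag.startswith("$REPLACE_"):
            # substitute the token with the one carried by the tag
            out.append(tag[len("$REPLACE_"):])
        else:
            # unknown or language-specific tag: leave the token unchanged
            out.append(token)
    return out


# Assumption: the source is split into characters and prefixed with $START,
# so that an $APPEND_ tag on $START can insert at the sentence beginning.
src = ["$START", "我", "喜", "欢", "吃", "苹", "果", "果"]
tags = ["$KEEP"] * 7 + ["$DELETE"]
corrected = "".join(apply_tags(src, tags)[1:])  # strip $START
print(corrected)
```

A Chinese version would mainly differ in the tag vocabulary and in how tokens are joined back into a sentence (no spaces between characters), so those are the parts to adapt.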

I don't know the details of tagging-based GEC for Chinese, but this paper could be useful.