soyoung97 / fairseq-gec-korean

error while running #1

Open Zari222 opened 11 months ago

Zari222 commented 11 months ago

Hello, I found your work interesting but couldn't run the code. Can you please help me? I have attached the error I got. Also, I don't know what device_id and experiment_name are. I would be very grateful for your advice. [image]

soyoung97 commented 8 months ago

Hello! Really sorry for the late reply :( I missed this issue. Unfortunately, this repository is no longer maintained. Could you consider trying the newer version of my work? https://github.com/soyoung97/Standard_Korean_GEC

Or, if you specifically want to solve the problem with this code, please consider referring to the original code repository: https://github.com/kanyun-inc/fairseq-gec. This code was mostly built from that repository (it has the same train.sh format). You may find some informative comments in the issues there.

Best regards, Soyoung Yoon

Zari222 commented 8 months ago

Hey, no problem. Actually, I looked at the original code but still couldn't solve it. I will check your other repository and reach out if I have problems, thank you. One question, though: can your work only be used for Korean, or can I adapt it to another language with only thousands of training examples?

soyoung97 commented 8 months ago

For your question: this repository is no longer maintained, so I'm replying based on my work on https://github.com/soyoung97/Standard_Korean_GEC:

If you want to train your model on GEC with a different dataset, you can just follow the fine-tuning starter code, something like this. If you want to train on a different language, consider using MT5.
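As a rough illustration of what fine-tuning MT5 on source/target GEC pairs could look like, here is a minimal sketch. It is not the starter code referred to above. It assumes the Hugging Face `transformers` library (with `torch` and `sentencepiece` installed) and the `google/mt5-small` checkpoint, none of which are named in this thread. The toy sentence pairs, the `gec: ` prefix, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: (ungrammatical source, corrected target) pairs.
PAIRS: List[Tuple[str, str]] = [
    ("She go to school yesterday .", "She went to school yesterday ."),
    ("I has two cat .", "I have two cats ."),
]

PREFIX = "gec: "  # hypothetical task prefix; any consistent string would do


def build_examples(pairs, prefix: str = PREFIX) -> Dict[str, List[str]]:
    """Split pairs into model inputs and labels, skipping rows with an empty side."""
    sources, targets = [], []
    for src, tgt in pairs:
        src, tgt = src.strip(), tgt.strip()
        if src and tgt:
            sources.append(prefix + src)
            targets.append(tgt)
    return {"source": sources, "target": targets}


def fine_tune(pairs, model_name: str = "google/mt5-small", epochs: int = 1, lr: float = 1e-4):
    """Tiny full-batch training loop; downloads the checkpoint on first use."""
    # Heavy dependencies are imported lazily so build_examples works without them.
    import torch
    from transformers import AutoTokenizer, MT5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = MT5ForConditionalGeneration.from_pretrained(model_name)
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)

    data = build_examples(pairs)
    batch = tokenizer(
        data["source"],
        text_target=data["target"],  # tokenizes the corrections as labels
        padding=True,
        truncation=True,
        max_length=128,
        return_tensors="pt",
    )
    # Label positions that are padding should not contribute to the loss.
    batch["labels"][batch["labels"] == tokenizer.pad_token_id] = -100

    model.train()
    for _ in range(epochs):
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
    return model, tokenizer
```

Calling `fine_tune(PAIRS)` would run one update step on the toy data. A real run would need mini-batches, a held-out set, and far more data than two sentences.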

If you want to conduct automatic error type analysis given source and target data (grammatically incorrect / correct text), then yes, my work is specifically designed for Korean error type analysis. Please consider using ERRANT and modifying the code to fit your language. You may find this issue useful.
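To make the idea of source/target analysis concrete, the sketch below extracts token-level edits from a sentence pair. It is not ERRANT and does no error type classification: it substitutes Python's standard `difflib` for the alignment step, purely to show the kind of edit spans that a language-specific classifier would then label. The function name and example sentences are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# (operation, source start, source end, source tokens, target tokens)
Edit = Tuple[str, int, int, str, str]


def extract_edits(source: str, target: str) -> List[Edit]:
    """Return the non-matching spans between whitespace-tokenized sentences."""
    src, tgt = source.split(), target.split()
    matcher = SequenceMatcher(a=src, b=tgt, autojunk=False)
    edits = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # unchanged tokens are not edits
        edits.append((tag, i1, i2, " ".join(src[i1:i2]), " ".join(tgt[j1:j2])))
    return edits


for edit in extract_edits("I has two cat .", "I have two cats ."):
    print(edit)
```

Whitespace tokenization is a simplification. A morphologically rich language would likely need a proper tokenizer or morphological analyzer before alignment.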

Hope this helps! Soyoung

Zari222 commented 8 months ago

So what I understood is that your work is for error type analysis only and is based on ERRANT, and if I want to use it for another language I should adapt it to that language and use the issue you mentioned. About fine-tuning and using MT5, can you please describe more? I want to develop and train a model but don't know how. Do you mean I can fine-tune MT5 for the GEC task? Sorry if this issue is no longer relevant; I can ask in the other repository or email you if there is a problem.

soyoung97 commented 8 months ago

Please refer to my work on standardizing Korean GEC. My work consists of (1) making a Korean GEC dataset, (2) automatic error type analysis, and (3) training a model with it (fine-tuning on top of KoBART). If you want to train your GEC model on a new language, I think that would be a whole other topic. MT5 stands for multilingual T5. I'm sorry, but I can't help you with code-level instructions for fine-tuning MT5. I'm sure there are a lot of resources for that if you google it!

Best regards, Soyoung

Zari222 commented 8 months ago

Thank you