LIN-SHANG / InstructERC

The official implementation of InstructERC

121 stars 7 forks

support Chinese? #3

Closed zhijiezhong closed 11 months ago

zhijiezhong commented 1 year ago

Hello, this is great work. Does it support Chinese dialogue emotion recognition?

LIN-SHANG commented 1 year ago

Thank you for your interest! Yes, it can support Chinese dialogue emotion recognition. As far as we know, both ChatGLM and LLaMA have Chinese vocabularies, but the proportion of Chinese in their training corpora differs greatly, so if you use my full pipeline, I expect ChatGLM to perform significantly better than LLaMA-base. You can also use the Chinese-alpaca-base model for migration. As I wrote in the paper, InstructERC is a brand-new plug-and-play framework, so it is very easy to migrate. I hope your reproduction on the Chinese ERC dataset goes smoothly!

zhijiezhong commented 1 year ago

I am very glad to receive your response. I will try ChatGLM on a Chinese dataset. InstructERC is a nice plug-and-play framework, and I hope to get good results on the Chinese dataset. Thanks for the advice!

zhijiezhong commented 1 year ago

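Hello! I ran into an error while running the code. I ran it directly without modifying any paths (changing some of the paths gives the same error), and got the error shown in the screenshot below: the data_dir argument is required. My environment is Ubuntu, without Docker. Could you explain in detail what I need to do? For example, which of data_process.py, main_new.py, and data_utils.py need to be modified? [Screenshot: QQ图片20231117162319]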

LIN-SHANG commented 1 year ago

Thanks for your interest! Please read my readme.md file carefully. The different experiments are tightly integrated into a single pipeline. To start the training process, you should run the train_and_inference_Uni.sh file. Alternatively, you can refer to my bash files and write your own bash file to customize the training process as needed. main_new.py is just one part of the entire workflow.

zhijiezhong commented 1 year ago

Yes, I ran the train_and_inference_Uni.sh file and got the error above. I would like to run it successfully on the public dataset first before trying the Chinese one.

LIN-SHANG commented 1 year ago

It looks like the data directory is not being set in train_and_inference_Uni.sh. I suggest debugging the data processing segment first, before the script enters the training process in main_new.py. There is a step where the data is processed and should be written to a specified folder. The code itself should be technically sound, since another researcher successfully replicated it a week ago. Please go ahead with debugging, and don't stress too much.
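
For anyone hitting the same error, here is a minimal Python sketch of a pre-flight check on the data directory. It is not taken from the repository: it assumes the training entry point declares `--data_dir` as a required argparse argument, and `build_parser` and `check_data_dir` are hypothetical helper names used only for illustration. The idea is to confirm that the data processing step actually wrote files to the folder before training starts.

```python
import argparse
import os
import tempfile
from typing import List


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: the real main_new.py may define its arguments differently.
    parser = argparse.ArgumentParser(description="toy stand-in for a training entry point")
    parser.add_argument("--data_dir", required=True, help="folder holding the processed data")
    return parser


def check_data_dir(data_dir: str) -> List[str]:
    """Return the sorted file names in data_dir, or raise if it is missing or empty."""
    if not os.path.isdir(data_dir):
        raise FileNotFoundError(f"data_dir does not exist: {data_dir}")
    files = sorted(os.listdir(data_dir))
    if not files:
        raise FileNotFoundError(
            f"data_dir is empty, run the data processing step first: {data_dir}"
        )
    return files


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Simulate the output of a data processing step.
        with open(os.path.join(tmp, "train.json"), "w") as f:
            f.write("[]")
        args = build_parser().parse_args(["--data_dir", tmp])
        print(check_data_dir(args.data_dir))
```

With a parser like this, calling it without `--data_dir` makes argparse exit with a "the following arguments are required" message, which is the kind of error reported above. If the shell script passes the argument from a variable that is empty, the same symptom can appear, so printing the variable inside the bash file before the Python call is a cheap first check.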

zhijiezhong commented 1 year ago

OK! Thanks!