ImKeTT / ZeroGen

[NLPCC'23] ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles PyTorch Implementation
https://arxiv.org/abs/2306.16649

About hyper-parameters #3

Open fukuzawa-e opened 3 months ago

fukuzawa-e commented 3 months ago

```bash
python run_zerogen.py --alpha ${ALPHA} --beta ${BETA} --eta ${ETA} --k ${K} --condition_method add \
    --task ${TASK} --decoding_len ${LENGTH} --alpha_scale --alpha_activasize ${ALPHA_HAT} \
    --beta_scale --beta_activesize 0.2 --beta_upper ${BETA_HAT} --n_obj ${N} --kw_mode max --k2t
```

With the command above, I want to generate longer outputs, for example more than 100 words. I set --decoding_len to 100, but the length of the results barely changed; they are still short. I then tried setting only --decoding_len to 100 (without the other options), which does produce longer outputs, but the quality of the generated text is not good. I would like to understand the effect of each parameter, but the paper gives no details. Is there any documentation explaining what each parameter means?

ImKeTT commented 3 months ago

Thank you for your interest in our work. To generate high-quality long controllable texts, you need to fine-tune the base LM on a textual corpus with longer sequences, rather than simply tweaking the decoding_len parameter.
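For reference, a minimal sketch of such a fine-tuning run, assuming a Hugging Face GPT-2 base LM and a hypothetical plain-text file `long_corpus.txt` with one document per line (this is not the exact script used for the released checkpoints, just an illustration of the idea):

```python
# Sketch: fine-tune GPT-2 on longer sequences with Hugging Face Transformers.
# "long_corpus.txt" is a hypothetical corpus file, one document per line.
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

device = "cuda" if torch.cuda.is_available() else "cpu"
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
model = GPT2LMHeadModel.from_pretrained("gpt2").to(device)

with open("long_corpus.txt") as f:
    texts = [line.strip() for line in f if line.strip()]

# Use a longer context (e.g. 256 tokens) than short caption-style data.
enc = tokenizer(texts, truncation=True, max_length=256,
                padding="max_length", return_tensors="pt")

loader = DataLoader(TensorDataset(enc["input_ids"], enc["attention_mask"]),
                    batch_size=8, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

model.train()
for epoch in range(3):
    for input_ids, attention_mask in loader:
        input_ids = input_ids.to(device)
        attention_mask = attention_mask.to(device)
        # Standard causal LM objective; ignore padded positions in the loss.
        labels = input_ids.clone()
        labels[attention_mask == 0] = -100
        out = model(input_ids=input_ids, attention_mask=attention_mask, labels=labels)
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()

model.save_pretrained("gpt2-long-finetuned")
tokenizer.save_pretrained("gpt2-long-finetuned")
```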

fukuzawa-e commented 3 months ago

Thank you so much for your reply. Regarding fine-tuning the base LM, did you use SimCTG to fine-tune the new LM?

ImKeTT commented 3 months ago

Yes, I employed this codebase for fine-tuning the base LM.
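In case it is useful, here is a rough sketch of the token-level contrastive objective described in the SimCTG paper, which that codebase combines with the usual MLE loss during fine-tuning; the function below is an illustrative reimplementation, not code taken from either repository, and assumes last-layer hidden states of shape (batch, seq_len, dim):

```python
import torch
import torch.nn.functional as F

def simctg_contrastive_loss(hidden_states, margin=0.5):
    """Token-level contrastive loss in the style of SimCTG (Su et al., 2022).

    hidden_states: (batch, seq_len, dim) last-layer LM representations.
    Encourages pairwise cosine similarities between different tokens in the
    same sequence to stay below the margin.
    """
    norm = F.normalize(hidden_states, dim=-1)                 # unit-length vectors
    sim = torch.matmul(norm, norm.transpose(1, 2))            # (batch, seq, seq) cosine sims
    seq_len = sim.size(1)
    # s(h_i, h_i) = 1, so each i != j pair contributes max(0, margin - 1 + s(h_i, h_j)).
    pair_loss = torch.clamp(margin - 1.0 + sim, min=0.0)
    off_diag = 1.0 - torch.eye(seq_len, device=sim.device)    # drop i == j terms
    loss = (pair_loss * off_diag).sum(dim=(1, 2)) / (seq_len * (seq_len - 1))
    return loss.mean()

# During fine-tuning this is added to the standard LM loss, e.g.
# (with output_hidden_states=True on the forward pass):
# total_loss = lm_output.loss + simctg_contrastive_loss(lm_output.hidden_states[-1])
```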