Sachin19 / mucoco

Official Code for the papers: "Controlled Text Generation as Continuous Optimization with Multiple Constraints" and "Gradient-based Constrained Sampling from LMs"
MIT License

Need more instructions to run the code #2

Open dhx20150812 opened 2 years ago

dhx20150812 commented 2 years ago

Hi @Sachin19, thanks for sharing your interesting work.

I think the idea of continuous optimization for controllable text generation is great, but when I tried to run your code, I found that no dataset or model checkpoint is available.

I noticed that there are some code comments in the file decode_example.sh, #L29-21, but I still have no idea how to download the model checkpoint and dataset.

I would be grateful if you could add more instructions on how to download your training data and the primary model (or classification model).

MHDBST commented 2 years ago

I have the same issue. Is there any workaround?

Sachin19 commented 2 years ago

Hi, thanks for your interest in this code! Apologies for not updating the checkpoints yet; I am currently reorganizing the code with some additional experiments. This will take a few weeks, after which I plan to include all the trained models and more instructions in the repository.

All the experiments we ran are built on top of Hugging Face models or sentence-transformers models. In the meantime, to train your own constrained models:

(1) For Hugging Face-based models (classifiers), you can follow the instructions here (with the dataset and hyperparameter details from the paper).

(2) For sentence-transformers models (similarity models), you can follow the instructions here (by fine-tuning a GPT2-based model).

For both cases, I ran the exact code from these links with the hyperparameter details provided in the Appendix of the paper.

Hope that helps.
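For readers following route (1), here is a minimal sketch of how a classifier directory could be set up and saved with Hugging Face Transformers. It is not the training script used for the paper: the `gpt2` base model, the label names, and the helper names `label_maps` and `init_classifier` are all assumptions made for illustration, and the fine-tuning step itself is left out.

```python
from pathlib import Path


def label_maps(labels):
    """Build the id2label / label2id dicts that get stored in the model config."""
    id2label = dict(enumerate(labels))
    label2id = {name: idx for idx, name in id2label.items()}
    return id2label, label2id


def init_classifier(out_dir, labels, base_model="gpt2"):
    """Create a not-yet-fine-tuned sequence classifier and save it to out_dir.

    Assumes `transformers` and `torch` are installed and the base model
    can be downloaded. `base_model="gpt2"` is an illustrative choice.
    """
    # Imported lazily so label_maps() works without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    id2label, label2id = label_maps(labels)

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    # GPT-2 ships without a padding token; reuse EOS so batches can be padded.
    tokenizer.pad_token = tokenizer.eos_token

    model = AutoModelForSequenceClassification.from_pretrained(
        base_model,
        num_labels=len(labels),
        id2label=id2label,
        label2id=label2id,
    )
    model.config.pad_token_id = tokenizer.pad_token_id

    # Fine-tuning on the task data (hyperparameters from the paper's
    # Appendix) would happen here, before saving.

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    model.save_pretrained(out)
    tokenizer.save_pretrained(out)
    return out
```

Calling `init_classifier("formality-classifier", ["informal", "formal"])` would write a directory containing the config, weights, and tokenizer files, which is the kind of local path a script could then load with `from_pretrained`.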

MHDBST commented 2 years ago

Thanks for the clarification. My main question is about the pretrained models you used for each task. Could you please let me know where I can download the following models that you used for your training?

PRIMARYMODEL=path/to/primary/model 
STSMODEL=path/to/usim/model
CLASSIFIERMODEL=path/to/formality/classifier
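Until checkpoints are published, anyone substituting locally trained models for these three placeholders may want to sanity-check the paths before launching a decoding run. The sketch below assumes each variable points to a local directory saved with `save_pretrained`, which writes a `config.json`. The helper name `check_model_dirs` and the `config.json` marker are assumptions, not part of the repository.

```python
import os
from pathlib import Path

# Variable names taken from the snippet above.
MODEL_VARS = ("PRIMARYMODEL", "STSMODEL", "CLASSIFIERMODEL")


def check_model_dirs(env=os.environ, marker="config.json"):
    """Return {variable: problem} for every variable that does not look usable.

    A variable passes if it is set, points to an existing directory, and
    that directory contains `marker` (an assumed sign of a saved model).
    """
    problems = {}
    for name in MODEL_VARS:
        value = env.get(name, "")
        if not value:
            problems[name] = "not set"
        elif not Path(value).is_dir():
            problems[name] = "not a directory"
        elif not (Path(value) / marker).is_file():
            problems[name] = "missing " + marker
    return problems


if __name__ == "__main__":
    for var, problem in check_model_dirs().items():
        print(f"{var}: {problem}")
```

An empty result means all three paths passed the check. A Hugging Face Hub model id used in place of a local path would be reported as "not a directory", so this check only fits the local-directory case.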