Unable to reproduce the paper results

I ran the code provided in this GitHub repo on the Rotten Tomatoes dataset, using the same parameter settings as given in the paper. However, I am not able to reproduce the paper's results.

For example, Table 1 reports a test-set attack success rate of 85.5% with a temperature of 0.85 and beam search decoding, but I only got 17.27% with the same settings.

Could you please provide the code needed to reproduce the paper's results?

puzzler10 / constraint_enforcing_reward
