Noisy data generated by back translation

Very interesting work, and thanks for sharing the code!

I am very interested in translation-based augmentation. I generated some examples by running run.sh, but some of the outputs are noisy. A few examples:

(1) In forward generation, the input "could i get the address , phone number , and postcode of yu garden ?" gives the output "The hotel is small location, the location is ideal and the food is fantastic."

(2) In forward generation, the input "hi , i 'm looking for a nice german restaurant ." gives the output "I was at listening to my room and we were even coming in the main area from 9 weeks. I also liked this hotel, this is a great boutique hotel."

(3) In forward generation, the input "i do n't care ." gives the output "Sinon pour la plupart, je ne pense pas qu'il y ait un tel problème qui se pose à vous. Je n'ai pas l'intention de le faire."

Do you have any suggestions to avoid these errors?
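
As a stopgap, a simple post-hoc filter might catch cases like the ones above. The sketch below is only an assumption about what such a filter could look like and is not part of the UDA code: `looks_noisy`, `min_overlap`, and `max_len_ratio` are hypothetical names and thresholds. It flags an output when it shares too few word tokens with the input or when its length differs too much from the input.

```python
import re


def tokens(text):
    """Lowercase a string and return its set of ASCII word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def looks_noisy(source, paraphrase, min_overlap=0.3, max_len_ratio=3.0):
    """Heuristic check (hypothetical, thresholds are guesses).

    Returns True when the paraphrase covers less than `min_overlap` of the
    source tokens, or when its whitespace-token count differs from the
    source's by more than a factor of `max_len_ratio`.
    """
    src, par = tokens(source), tokens(paraphrase)
    if not src or not par:
        return True
    # Fraction of source tokens that also appear in the paraphrase.
    overlap = len(src & par) / len(src)
    # Length of the paraphrase relative to the source, in whitespace tokens.
    len_ratio = len(paraphrase.split()) / max(len(source.split()), 1)
    too_long_or_short = len_ratio > max_len_ratio or len_ratio < 1.0 / max_len_ratio
    return overlap < min_overlap or too_long_or_short


pairs = [
    ("could i get the address , phone number , and postcode of yu garden ?",
     "The hotel is small location, the location is ideal and the food is fantastic."),
    ("i do n't care .", "I don't care."),
]
# Keep only the pairs that the heuristic does not flag.
kept = [(s, p) for s, p in pairs if not looks_noisy(s, p)]
```

A token-overlap check like this would also reject valid paraphrases that reword most of the sentence, so the thresholds would need tuning on real data.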

Thanks!

google-research / uda
