In the README for the seq2seq_attention_copy method, I could not understand the difference between the data in the folders data/datasets/data and data/datasets/data_radn_split.
The README says that the original data has to be placed in these folders.
It seems to me that the folders data and data_radn_split must contain different data; otherwise the experiments in attn_copying_tune_data_radn_split.yaml and attn_copying_tune_data.yaml would be equivalent. But how do they differ? Is the original data from the Spider dataset split randomly between these two folders? If so, in what ratio should it be split: 50:50 or some other ratio?
As I understand from here, the folders data and data_radn_split should each have their own train, dev and test JSON files. Is that correct? And what is the reason for having these two folders, that is, two different kinds of data?