Closed: lowblung closed this issue 5 years ago
I don't see _NC_TRAIN_DATASETS defined anywhere. Also, the problem seems to be in the way you are building the data, i.e. in this line:
target_datasets = [[item[0], [item[1][1]]] for item in train_dataset]
Either item[0] is not being found (unlikely), or item[1] (also unlikely), or item[1][1] (most likely). Can you run that line by itself in the Python interpreter and try again?
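To narrow down which index fails, a small check along these lines can be run in the interpreter. This is only a sketch: `check_dataset_entries` is a hypothetical helper (not part of tensor2tensor), and it assumes each entry has the shape `[url, [source_file, target_file]]`, as in the dataset lists used in this thread. The `bad` entry is made up for illustration.

```python
def check_dataset_entries(datasets):
    """Return (index, message) pairs for entries not shaped like [url, [src, tgt]].

    Hypothetical helper for debugging; not a tensor2tensor API.
    """
    problems = []
    for i, item in enumerate(datasets):
        # item[0] and item[1] must both exist.
        if not isinstance(item, (list, tuple)) or len(item) < 2:
            problems.append((i, "entry is not [url, [files...]]"))
            continue
        files = item[1]
        # item[1][1] only exists if item[1] is a list with at least two names.
        if not isinstance(files, (list, tuple)) or len(files) < 2:
            problems.append((i, "item[1] needs two file names, so item[1][1] fails"))
    return problems


good = [[
    "https://s3-us-west-2.amazonaws.com/twairball.wmt17.zh-en/cwmt.tgz",
    ["dev.final.en", "dev.final.zh"],
]]
# Illustrative broken entry: only one file name, so item[1][1] raises IndexError.
bad = [["some_archive.tgz", ["only_one_file.txt"]]]

print(check_dataset_entries(good))
print(check_dataset_entries(bad))

# Once the check passes, the comprehension from the question runs cleanly.
target_datasets = [[item[0], [item[1][1]]] for item in good]
print(target_datasets)
```

If the check reports a problem for your own list, fix the shape of that entry before calling the data generator.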
In the worst case, you can use the Text2TextTmpDir problem: prepare line-by-line data files and use them for training.
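For the line-by-line route, the data can be prepared with plain Python, roughly as below. This is a sketch under assumptions: `write_parallel_files` is a hypothetical helper, and the default file names (`inputs.train.txt`, `targets.train.txt`) are my assumption about what the tmp-dir problem expects, so please check them against the problem class in your installed tensor2tensor version. The key point is that line N of the inputs file must pair with line N of the targets file.

```python
import os
import tempfile


def write_parallel_files(pairs, out_dir,
                         inputs_name="inputs.train.txt",
                         targets_name="targets.train.txt"):
    """Write (source, target) pairs as two line-aligned text files.

    Hypothetical helper; the default file names are assumptions.
    """
    os.makedirs(out_dir, exist_ok=True)
    inputs_path = os.path.join(out_dir, inputs_name)
    targets_path = os.path.join(out_dir, targets_name)
    with open(inputs_path, "w", encoding="utf-8") as f_in, \
            open(targets_path, "w", encoding="utf-8") as f_tgt:
        for src, tgt in pairs:
            # One example per line: collapse embedded newlines to keep alignment.
            f_in.write(src.strip().replace("\n", " ") + "\n")
            f_tgt.write(tgt.strip().replace("\n", " ") + "\n")
    return inputs_path, targets_path


# Made-up sentence pairs, for illustration only.
pairs = [("hello world", "你好 世界"), ("good morning", "早上 好")]
tmp_dir = tempfile.mkdtemp()
inputs_path, targets_path = write_parallel_files(pairs, tmp_dir)
print(inputs_path, targets_path)
```

The output directory would then be the one passed to t2t-datagen as the tmp dir.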
I have added the link as item[0]. With that change, tensor2tensor kind of "works" when I run it.
_IPO_TEST_DATASETS = [[ "https://s3-us-west-2.amazonaws.com/twairball.wmt17.zh-en/cwmt.tgz", ["dev.final.en","dev.final.zh"] ]]
If I understand correctly, it is working now? Let me know if it isn't. Closing this for now; feel free to re-open if there are issues.
Description
I want to use my own dataset, but it is not working. Could anyone help with this issue?
Environment information
Steps to reproduce:
USR_DIR=$HOME/t2t_usr PROBLEM=translate_ron8k DATA_DIR=$HOME/t2t_data
TMP_DIR=$HOME/tmp/t2t_datagen
mkdir -p $DATA_DIR $TMP_DIR $USR_DIR
t2t-datagen \