Closed vmurahari3 closed 5 years ago
What would be a more convenient data format? We could add support for something more generic, like text files with one JSON object per line:
{"context/1": "Hello, how are you?", "context/0": "I am fine. And you?", "context": "Great. What do you think of the weather?", "response": "It doesn't feel like February."}
{"context/0": "I am Matt", "context": "Nice to meet you", "response": "Nice to meet you too."}
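For illustration, here is a minimal sketch of how a file in this layout could be read in plain Python. It assumes every line is a valid JSON object (double-quoted keys and strings). The helper names `read_examples` and `ordered_context` are hypothetical and not part of the repository. The ordering logic also assumes that a higher `context/N` index means an older turn and that `context` is the most recent turn, as the first example suggests.

```python
import json
from typing import Dict, Iterator, List


def read_examples(path: str) -> Iterator[Dict[str, str]]:
    """Yield one dict per non-empty line of a JSON-lines file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


def ordered_context(example: Dict[str, str]) -> List[str]:
    """Return the context turns from oldest to most recent.

    Assumption: "context/N" with larger N is further back in the
    conversation, and "context" is the turn right before the response.
    """
    indices = sorted(
        (int(key.split("/", 1)[1]) for key in example if key.startswith("context/")),
        reverse=True,
    )
    return [example[f"context/{i}"] for i in indices] + [example["context"]]
```

Calling `ordered_context` on each dict from `read_examples("some_file.json")` (the path is a placeholder) would give the conversation history as a list, with `example["response"]` as the target.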
That would be wonderful. JSON files would be amazing :)
I introduced JSON format in this PR:
https://github.com/PolyAI-LDN/conversational-datasets/pull/49
Just add --dataset_format JSON when calling create_data.py.
I was wondering if you have any ideas on parsing these TFRecord files in PyTorch?
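One possible approach, as a rough sketch only: the record framing of an uncompressed TFRecord file can be read with the Python standard library, without TensorFlow at read time. The function name `read_tfrecord_payloads` is hypothetical. The sketch skips CRC verification and assumes the file is not compressed. Each payload is assumed to be a serialized `tf.train.Example`, so decoding it still needs a protobuf parser (for instance `tf.train.Example.FromString(payload)` if TensorFlow is installed). Regenerating the data in the JSON format is likely the simpler route.

```python
import struct
from typing import Iterator


def read_tfrecord_payloads(path: str) -> Iterator[bytes]:
    """Yield the raw payload of each record in an uncompressed TFRecord file.

    Record layout (little-endian): uint64 length, uint32 CRC of the length,
    `length` payload bytes, uint32 CRC of the payload.
    The CRCs are skipped here, not verified.
    """
    with open(path, "rb") as f:
        while True:
            header = f.read(12)  # 8-byte length + 4-byte length CRC
            if not header:
                return  # clean end of file
            if len(header) < 12:
                raise ValueError("truncated record header")
            (length,) = struct.unpack("<Q", header[:8])
            payload = f.read(length)
            footer = f.read(4)  # 4-byte payload CRC, ignored in this sketch
            if len(payload) < length or len(footer) < 4:
                raise ValueError("truncated record body")
            yield payload
```

The payloads could then be decoded once and wrapped in a map-style PyTorch dataset (a class with `__len__` and `__getitem__`), which is the part that depends on how the features were written.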