Closed workingloong closed 5 years ago
Columns in an ODPS table can have different dtypes, such as int32, float, boolean, and string, so records_output_types in ODPSDataReader cannot be fixed to tf.float32. https://github.com/sql-machine-learning/elasticdl/blob/aef9d66e5d99ed2144ed4ba36933aaf85738a327/elasticdl/python/data/data_reader.py#L143-L144
I suggest fixing records_output_types in ODPSDataReader to tf.string and converting the data returned by odps_io.ODPSReader.read_batch to strings. https://github.com/sql-machine-learning/elasticdl/blob/aef9d66e5d99ed2144ed4ba36933aaf85738a327/elasticdl/python/data/odps_io.py#L223-L225
batch_record.append([str(record[column]) for column in columns])
Users can then cast the strings to whatever data types they need in their own dataset_fn.
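To make the round trip concrete, here is a minimal sketch in plain Python, not ElasticDL's actual code. The column names, the COLUMN_CASTS mapping, and the helpers stringify_record and cast_record are hypothetical and only illustrate the idea: the reader emits strings for every column, and the user restores the types afterwards. Inside a real dataset_fn the cast would presumably be done with TensorFlow string ops such as tf.strings.to_number instead.

```python
# Hypothetical column schema; the names and types are illustrative only.
COLUMN_CASTS = {
    "age": int,
    "score": float,
    # bool("False") is True in Python, so compare against the literal instead.
    "is_member": lambda s: s == "True",
    "city": str,
}


def stringify_record(record, columns):
    """Mirror the proposed reader change: emit every column value as a string."""
    return [str(record[column]) for column in columns]


def cast_record(string_values, columns, casts=COLUMN_CASTS):
    """User-side step: cast each string back to the type the model expects."""
    return {
        column: casts[column](value)
        for column, value in zip(columns, string_values)
    }


columns = ["age", "score", "is_member", "city"]
record = {"age": 31, "score": 0.5, "is_member": False, "city": "Hangzhou"}

strings = stringify_record(record, columns)
restored = cast_record(strings, columns)
```

One thing the sketch highlights is that boolean columns need an explicit parser, since a naive bool() cast on any non-empty string returns True.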
@workingloong Yes, float32 was only the temporary plan. We should switch to something more robust.