tensorflow / ecosystem

Integration of TensorFlow with other open-source frameworks

Possible reason the 'Propagate' - 'write data locally' test in LocalWriteSuite may fail #164

Open simonzhaoms opened 4 years ago

simonzhaoms commented 4 years ago

https://github.com/tensorflow/ecosystem/blob/791a42f427139f2cb7bbae8a93a9e666a38c2bcc/spark/spark-tensorflow-connector/src/main/scala/org/tensorflow/spark/datasources/tfrecords/DefaultSource.scala#L173-L215

The `if (dir.exists())` check at line 179 above may cause subsequent partition writes to fail when the test below has more than 2 partitions. Because the partitions are written inside a map at line 211 above, every partition runs the same check against the same directory, so once an earlier partition write has presumably created that directory, later partition writes fail at `if (dir.exists())`.

https://github.com/tensorflow/ecosystem/blob/791a42f427139f2cb7bbae8a93a9e666a38c2bcc/spark/spark-tensorflow-connector/src/test/scala/org/tensorflow/spark/datasources/tfrecords/LocalWriteSuite.scala#L42-L70

The exception thrown should be similar to the one reported in https://github.com/tensorflow/ecosystem/pull/141#issuecomment-542353106
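To illustrate the suspected failure mode without Spark, here is a minimal sketch in plain Scala. It is not the connector's code: `LocalWriteSketch`, `writePartition`, and `run` are hypothetical names, the error message is made up, and the sketch assumes the per-partition function both checks for the directory and creates it, as described above.

```scala
import java.io.File
import java.nio.file.Files
import scala.util.Try

object LocalWriteSketch {
  // Hypothetical stand-in for the per-partition local write: it refuses to
  // write when the target directory already exists, otherwise creates it.
  def writePartition(index: Int, dir: File): Int = {
    if (dir.exists()) {
      throw new IllegalStateException(s"LocalPath ${dir.getPath} already exists.")
    }
    dir.mkdirs()
    new File(dir, s"part-$index").createNewFile()
    index
  }

  // Applies the write once per partition index against a fresh target path.
  // Right(index) marks a successful write, Left(message) a failed one.
  def run(numPartitions: Int): Seq[Either[String, Int]] = {
    val parent = Files.createTempDirectory("local-write-sketch").toFile
    val target = new File(parent, "out") // does not exist before the first write
    (0 until numPartitions).map { i =>
      Try(writePartition(i, target)).toEither.left.map(_.getMessage)
    }
  }
}
```

Calling `LocalWriteSketch.run(3)` in this sketch lets only the first partition succeed; the remaining two hit the existence check because the first write already created the directory. If the real code behaves the same way, that would match the failure described in this issue.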