ghcollin / tftables

HDF5 interface for Tensorflow.
MIT License

Performance gain of tftables #7

Open guolu-home opened 6 years ago

guolu-home commented 6 years ago

Hi, I have implemented an HDF5 stream following your documentation. What performance gain should I expect from tftables in general? I do not see much of a gain in my case: my example runs at the same speed as before.

Thanks

ghcollin commented 6 years ago

Sorry for the late response. tftables is backed by multitables, which has some benchmark comparisons for a few different storage configurations. The best performance increase is seen when reading from a RAID0 SSD configuration.

With that said, the actual performance of your model will depend on many factors. If the model is very complex, then it is unlikely to be bottlenecked by IO. Tensorflow also introduces its own overhead for uploading the data to the GPU, which can reduce performance when the input data is very large. Ultimately, I wrote tftables mainly as an easy-to-use interface for using HDF5 files with Tensorflow, with the secondary benefit that concurrent reads might relieve any performance penalty introduced by HDF5.
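
One rough way to check whether IO is the bottleneck in your case is to time data loading and the training step separately. The following is only a minimal sketch, not part of tftables or multitables: `load_batch` and `train_step` are hypothetical callables that you would replace with your own HDF5 read and model update, and the `fake_*` functions are stand-ins that simulate them with sleeps.

```python
import time


def time_iterations(step, n_iters=20):
    """Return the mean wall-clock seconds per call of `step`."""
    start = time.perf_counter()
    for _ in range(n_iters):
        step()
    return (time.perf_counter() - start) / n_iters


def io_fraction(load_batch, train_step, n_iters=20):
    """Estimate the share of per-iteration time spent loading data.

    load_batch: hypothetical zero-argument callable returning one batch.
    train_step: hypothetical callable that consumes one batch.
    """
    batch = load_batch()  # one batch, reused to time compute in isolation
    t_load = time_iterations(load_batch, n_iters)
    t_compute = time_iterations(lambda: train_step(batch), n_iters)
    return t_load / (t_load + t_compute)


if __name__ == "__main__":
    # Stand-ins (assumptions) for a real HDF5 read and a real model update.
    def fake_load():
        time.sleep(0.002)
        return [0.0] * 1024

    def fake_train(batch):
        time.sleep(0.01)

    share = io_fraction(fake_load, fake_train)
    print(f"IO share of iteration time: {share:.0%}")
```

If the reported share is small, the model itself dominates each iteration and faster or concurrent reads are unlikely to change the overall speed much, which would match what you are seeing.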