nmweizi closed this 3 years ago
Please read the docstring of export_hdf5. Here is the key part:
:param str group: Write the data into a custom group in the hdf5 file.
:param str mode: If set to "w" (write), an existing file will be overwritten. If set to "a", one can append additional data to the hdf5 file, but it needs to be in a different group.
So an example would be:
import vaex

for i in range(3):
    df = vaex.example()
    # mode='a' appends to the file; each export must go into its own group
    df.export_hdf5('./tmp.hdf5', mode='a', group=str(i))
But then when opening the file you must specify the group:
# for example
vaex.open('./tmp.hdf5', group='1')
If your goal is to convert a database into hdf5 so that you can work with it in vaex, it is easier (and recommended) to export each chunk to its own file on disk, then open and concatenate those dataframes and export the result to a single file. That last step is optional, but a single file gives slightly better performance. This process is described in more detail elsewhere on this issue board.
If you want to continue with a single hdf5 file as in my example (I assume that was your original idea), you will need some more custom code before you can use all the data, since each group has to be opened separately. That is perfectly fine if you want to go that route.
I hope this helps!
Thank you very much, @JovanVeljanoski.
Software information
Vaex version (import vaex; vaex.__version__): {'vaex': '4.5.0', 'vaex-core': '4.5.1', 'vaex-viz': '0.5.0', 'vaex-hdf5': '0.10.0', 'vaex-server': '0.6.1', 'vaex-astro': '0.9.0', 'vaex-jupyter': '0.6.0', 'vaex-ml': '0.14.0'}