sangoma / switchy

async FreeSWITCH cluster control
https://switchy.readthedocs.io/en/latest/
Mozilla Public License 2.0

Move away from hdf5 #23

Closed goodboy closed 8 years ago

goodboy commented 8 years ago

Consider dropping hdf5 or at least providing alternative storage backends. This is mostly inspired by http://cyrille.rossant.net/moving-away-hdf5/.
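One direction for an alternative back-end is something boring and concurrency-friendly from the stdlib. As a hedged sketch (the class and method names here are hypothetical, not switchy's actual API), rows could go into a plain SQLite table instead of an HDF5 file:

```python
import sqlite3

class SQLiteStore:
    """Hypothetical alternative storage back-end: append measurement
    rows into a SQLite table instead of an HDF5 file."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS rows (call_id TEXT, duration REAL)"
        )

    def append_row(self, call_id, duration):
        # one implicit transaction per write keeps readers consistent
        with self.conn:
            self.conn.execute(
                "INSERT INTO rows VALUES (?, ?)", (call_id, duration)
            )

    def read_rows(self):
        # no reopen dance needed: SQLite handles its own locking
        return self.conn.execute(
            "SELECT call_id, duration FROM rows"
        ).fetchall()
```

SQLite does its own file locking and supports concurrent readers out of the box, which sidesteps the whole class of problems described below.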

goodboy commented 8 years ago

I've recently run into concurrency problems with the latest libhdf5/PyTables as well.

The measurement storage subsystem currently uses a separate process to consume rows pushed by apps (such as the cdr app) onto a queue; the child process then reads rows off the queue and writes them to the chosen storage back-end. PyTables already doesn't handle concurrency well: I've found that with the current DataStorer I have to reopen the actual hdf file for every read, due to what appears to be a file lock (verified in switchy's unit tests: omitting the self._store.open('r') on each access results in not being able to read the underlying hdf table from the parent process).
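The producer/consumer shape described above can be sketched roughly like this (a minimal illustration, not switchy's actual DataStorer; the names and the in-memory "back-end" are assumptions for the example):

```python
import multiprocessing as mp

SENTINEL = None  # pushed onto the queue to tell the consumer to stop

def storer(queue, results):
    """Child process: drain rows pushed by apps and hand them to the
    storage back-end (here just an in-memory list for illustration)."""
    rows = []
    while True:
        row = queue.get()
        if row is SENTINEL:
            break
        rows.append(row)
    results.put(rows)

def run_demo():
    queue, results = mp.Queue(), mp.Queue()
    proc = mp.Process(target=storer, args=(queue, results))
    proc.start()
    # producer side: apps (e.g. a cdr-like app) push rows onto the queue
    for i in range(3):
        queue.put({"call_id": i, "duration": i * 10})
    queue.put(SENTINEL)
    stored = results.get()
    proc.join()
    return stored
```

The concurrency trouble starts when the "back-end" in the child is an HDF5 file that the parent also wants to read from while the child holds it open for writing.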

Additionally, with the latest PyTables 3.2.2 (which relies on hdf5-1.10.0-1 on Arch Linux) it seems that libhdf5 won't even unlock the file, resulting in PyTables throwing an error such as:

IOError: HDF5 error back trace
  File "H5F.c", line 579, in H5Fopen
    unable to open file
  File "H5Fint.c", line 1168, in H5F_open
    unable to lock the file or initialize file structure
  File "H5FD.c", line 1821, in H5FD_lock
    driver lock request failed
  File "H5FDsec2.c", line 939, in H5FD_sec2_lock
    unable to flock file, errno = 11, error message = 'Resource temporarily unavailable'
End of HDF5 error back trace
Unable to open/create file '/tmp/tmpax4mwe_switchy_data.h5'

If I stick to PyTables 3.2.1.1 (which on Arch relies on hdf5-1.8.14-1) then I can at least get it all working with the current implementation (i.e. reopening on each access).
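For reference, the reopen-on-access workaround looks roughly like this (a generic sketch using a plain JSON-lines file so it stands alone; switchy's real store wraps PyTables and the names here are hypothetical):

```python
import json
import os

class ReopenOnReadStore:
    """Mimics the workaround: never hold a read handle open; instead
    reopen the file on every read so the parent process always sees
    rows the writer has flushed to disk."""

    def __init__(self, path):
        self.path = path

    def append(self, row):
        # writer side: append one JSON row and let the handle close,
        # flushing the data to disk
        with open(self.path, "a") as f:
            f.write(json.dumps(row) + "\n")

    def read_all(self):
        # reader side: reopen on each access instead of caching a handle
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            return [json.loads(line) for line in f]
```

With hdf5-1.10.x the equivalent reopen against the HDF5 file fails outright with the flock error above, which is what makes the version pin necessary.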

So, for now, it looks like we're stuck on tables==3.2.1.1 if using hdf5.