Closed danielhrisca closed 6 years ago
Hi Daniel, thanks for pointing this out. It seems blosc compression is behaving oddly; the data are not the same after decompression, giving a max of 1E13 for the time channel, so arange ends up allocating far too much memory. Bcolz has poor performance (or even worse with vectors), and now blosc is actually dodgy. Like you, I am tempted to give up on compression... Maybe plain zlib would be more reliable.
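To illustrate why a corrupted time channel is so damaging: if the max timestamp becomes 1E13, the arange call used for resampling would need an astronomically large array. A minimal sketch (the values below are placeholders, not taken from the actual file) that computes the would-be allocation size without attempting it:

```python
import numpy as np

# Hypothetical illustration: a corrupted max time of 1e13 combined with a
# 0.01 s resampling step makes np.arange(t_min, t_max, step) request an
# enormous buffer. We only compute the size here instead of allocating it.
t_min, t_max, step = 0.0, 1e13, 0.01   # assumed values for illustration
n_samples = int((t_max - t_min) / step)
bytes_needed = n_samples * np.dtype(np.float64).itemsize
print(n_samples)                # on the order of 1e15 samples
print(bytes_needed / 2**40)     # thousands of TiB, hence the blow-up
```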
Blosc is advertised as lossless, so maybe there is some other reason for the odd value.
Yes, maybe, but it seems to be either a numpy or a blosc bug. When I use self.data = compress(a.tobytes()) for compression and fromstring(decompress(self.data), dtype=self.dtype) for decompression, I get the correct data back. The pointer-based compression is advertised for speed, but it seems a bit risky: either numpy does not expose the correct pointer through __array_interface__, or there is a bug in blosc. Anyway, I made a quick patch in the last commit that seems to be working.
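For reference, the byte-level round trip described above can be sketched as follows. This uses zlib from the standard library as a stand-in for blosc (same pattern: serialize with tobytes, compress the raw bytes, rebuild from the decompressed buffer), and frombuffer, the non-deprecated equivalent of fromstring:

```python
import zlib
import numpy as np

# Byte-level round trip: compress the serialized array, decompress, and
# rebuild. zlib is used here as a stand-in for blosc's compress/decompress;
# the pattern is the same one described in the comment above.
a = np.arange(1000, dtype=np.float64)
compressed = zlib.compress(a.tobytes())
restored = np.frombuffer(zlib.decompress(compressed), dtype=a.dtype)
assert np.array_equal(a, restored)  # data survives the round trip intact
```

Going through an explicit bytes copy avoids handing blosc a raw pointer, which is exactly what sidesteps the suspected __array_interface__ issue, at the cost of one extra memory copy.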
Trying to merge the two test files on 32 bit python raises a MemoryError
Hi Daniel, can you detail the script a bit more? I tried the following but could not reproduce the MemoryError:
x1=mdf('test.mf4', compression='blosc')
x1.resample(0.01)
x2=mdf('tests.mdf',compression='blosc')
x2.resample(0.01)
x1.mergeMDF(x2)
Maybe you are simply asking too much of your machine (I have 16 GB, 64-bit Python 3.5.3).
I mentioned that the error occurs on 32-bit Python, not 64-bit Python.
I could read it, but I do not have a 32-bit OS to investigate easily; I would have to set up a virtual machine and so on... I will try to check how much memory it is allocating. Googling a bit shows that numpy on 32-bit can be limited to 3.2 GB, or worse, 2 GB, depending on how it was compiled.
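One way to estimate whether a merge will fit under a 32-bit limit without building a VM is to sum the byte sizes of the arrays involved. A rough sketch (the sample and channel counts are made-up placeholders, not the real test files):

```python
import numpy as np

# Estimate the RAM a merge would need before attempting it: samples times
# channels times element size. On 32-bit Python the practical per-process
# ceiling is around 2 GB, so totals near that will raise MemoryError.
n_samples, n_channels = 5_000_000, 60   # hypothetical sizes for illustration
estimated = n_samples * n_channels * np.dtype(np.float64).itemsize
print(estimated / 2**30, "GiB")  # already past a 2 GB 32-bit limit
```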
On Windows you could simply use a 32-bit WinPython distribution.
It seems the optimizations done since this issue was raised have lowered the RAM usage enough to avoid a memory error with the test files.