Closed peter-madigan closed 3 years ago
To get a sense of the speedup:
writing 100000 single-packet messages (eq. to writing 200000 packets)
In [1]: from larpix.format import rawhdf5format; msgs = [b'header\x00\x00'+b'bodydatadatadata']*100000; rawhdf5format.to_rawfile('test.h5',msgs=msgs)
In [2]: %timeit rawhdf5format.to_rawfile('test.h5',msgs=msgs)
746 ms ± 81.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
so ~268000 packets / second
writing 10 messages of 10000 packets each (eq. to writing 100001 packets)
In [3]: msgs = [b'header\x00\x00'+b'bodydatadatadata'*10000]*10
In [4]: %timeit rawhdf5format.to_rawfile('test.h5',msgs=msgs)
57.2 ms ± 4.66 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
so ~1754000 packets / second
compared to before, writing 100000 packets
In [2]: import larpix; from larpix.format import hdf5format; packets = [larpix.Packet_v2()]*100000; hdf5format.to_file('test1.h5',packets)
In [3]: %timeit hdf5format.to_file('test1.h5',packets)
5.43 s ± 178 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
so ~18000 packets / second
that's a speedup of ~15x for single-packet messages and close to 100x for large messages
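For anyone who wants to re-derive the rates, here is a quick sketch of the arithmetic. It only uses the packet counts and mean loop times quoted in the `%timeit` output above; the helper name `packets_per_second` is made up for this example, and the results differ slightly from the rounded figures quoted above.

```python
def packets_per_second(n_packets, seconds):
    """Throughput from a packet count and a mean loop time in seconds."""
    return n_packets / seconds


# (packet count, mean loop time) taken from the timing runs quoted above
raw_single = packets_per_second(200_000, 0.746)    # rawhdf5format, single-packet messages
raw_batched = packets_per_second(100_001, 0.0572)  # rawhdf5format, 10 large messages
old_hdf5 = packets_per_second(100_000, 5.43)       # hdf5format with Packet_v2 objects

for label, rate in [
    ("rawhdf5format (single-packet messages)", raw_single),
    ("rawhdf5format (large messages)", raw_batched),
    ("hdf5format (before)", old_hdf5),
]:
    # Print the rate and the speedup relative to the old hdf5format path
    print(f"{label}: {rate:,.0f} packets/s ({rate / old_hdf5:.1f}x)")
```

With these numbers, the large-message case is the one that approaches a factor of 100 over the old path; the single-packet case comes out closer to a factor of 15.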
Ok so this is a big PR, but here's the summary:
- Drops python2 support to use a more recent h5py version
- Extends `hdf5format` to use `multiprocessing` for the slow conversion of `Packet` objects to numpy arrays
- Adds a cached `int` representation for `Packet_v2` objects to speed up attribute access
- Introduces a new `rawhdf5format` for writing raw bytes to disk
- Implements a means of bypassing the packet interpretation and writing directly to the `rawhdf5format` from the `PACMAN_IO` class
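To illustrate the idea behind the cached `int` representation, here is a minimal sketch of a packet whose fields live in a single integer and are read with shifts and masks. This is not the `Packet_v2` implementation: the class name `CachedPacket`, the field names, and the bit layout are all hypothetical and chosen only for the example.

```python
class CachedPacket:
    """Hypothetical packet whose fields are packed into one cached integer."""

    # name -> (bit offset, bit width); this layout is invented for the example
    _FIELDS = {
        "packet_type": (0, 2),
        "chip_id": (2, 8),
        "channel_id": (10, 6),
        "dataword": (16, 8),
    }

    def __init__(self, value=0):
        self._int = value  # integer holding every field at once

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails, i.e. for field names
        if name in CachedPacket._FIELDS:
            offset, width = CachedPacket._FIELDS[name]
            return (self._int >> offset) & ((1 << width) - 1)
        raise AttributeError(name)

    def __setattr__(self, name, value):
        if name in self._FIELDS:
            offset, width = self._FIELDS[name]
            mask = ((1 << width) - 1) << offset
            # Clear the field's bits, then write the new value (truncated to width)
            new_int = (self._int & ~mask) | ((value << offset) & mask)
            object.__setattr__(self, "_int", new_int)
        else:
            object.__setattr__(self, name, value)

    def to_bytes(self):
        """Serialize the cached integer as 8 little-endian bytes."""
        return self._int.to_bytes(8, "little")


packet = CachedPacket()
packet.chip_id = 0xAB
packet.channel_id = 5
print(packet.chip_id, packet.channel_id, packet.to_bytes().hex())
```

Reading a field is then a shift and a mask on one Python `int`, and serialization is a single `int.to_bytes` call, which is the kind of shortcut a cached integer makes possible.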