This PR adds readDedispersedBlock, a function that produces the same result as a chained FilReader.readBlock().dedisperse()[:, :valid_samples] call. It saves memory whenever the DM is non-zero, and saves I/O in some cases.
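To illustrate where the memory saving comes from, here is a minimal NumPy sketch, not the code in this PR. It assumes time-major data of shape (nsamps, nchans), integer per-channel delays in samples, and a channel-first output; the function names dedisperse_chained and dedisperse_direct are hypothetical.

```python
import numpy as np

def dedisperse_chained(data, delays, start, nsamps):
    """Read nsamps + max(delays) samples, shift each channel, then trim."""
    max_delay = int(delays.max())
    # The intermediate block holds max_delay extra samples for every channel.
    block = data[start:start + nsamps + max_delay].T
    shifted = np.stack([np.roll(block[c], -int(d)) for c, d in enumerate(delays)])
    return shifted[:, :nsamps]

def dedisperse_direct(data, delays, start, nsamps):
    """Copy each channel straight from its delayed offset into the output."""
    nchans = data.shape[1]
    out = np.empty((nchans, nsamps), dtype=data.dtype)
    for c, d in enumerate(delays):
        first = start + int(d)
        out[c] = data[first:first + nsamps, c]
    return out

# Small synthetic example: 10 time samples, 3 channels.
data = np.arange(30, dtype=np.float32).reshape(10, 3)
delays = np.array([0, 1, 3])
same = np.array_equal(dedisperse_chained(data, delays, 2, 4),
                      dedisperse_direct(data, delays, 2, 4))
```

The direct version only ever allocates the (nchans, nsamps) output, whereas the chained version also holds the padded intermediate block, which grows with the maximum delay and hence with DM.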
It offers two reading modes, controlled by the smallReads kwarg. When it is True (and the dtype is longer than 8 bits), disk I/O is performed only for the specific frequency channels needed from each time sample. When it is False, the entire time sample is read and the frequency channels of interest are extracted from it.
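The difference between the two modes can be sketched as follows. This is an illustration only, assuming a headerless, time-major file with a fixed-width dtype; read_channels_small and read_channels_full are hypothetical helpers, not functions from sigpyproc.

```python
import io
import numpy as np

def read_channels_small(f, sample_idx, chan_lo, chan_hi, nchans, dtype):
    """Seek to and read only channels [chan_lo, chan_hi) of one time sample."""
    itemsize = np.dtype(dtype).itemsize
    f.seek((sample_idx * nchans + chan_lo) * itemsize)
    raw = f.read((chan_hi - chan_lo) * itemsize)
    return np.frombuffer(raw, dtype=dtype)

def read_channels_full(f, sample_idx, chan_lo, chan_hi, nchans, dtype):
    """Read the whole time sample, then slice out the channels of interest."""
    itemsize = np.dtype(dtype).itemsize
    f.seek(sample_idx * nchans * itemsize)
    sample = np.frombuffer(f.read(nchans * itemsize), dtype=dtype)
    return sample[chan_lo:chan_hi]

# In-memory stand-in for a file: 3 time samples, 4 channels.
data = np.arange(12, dtype=np.float32).reshape(3, 4)
buf = io.BytesIO(data.tobytes())
small = read_channels_small(buf, 1, 1, 3, 4, np.float32)
full = read_channels_full(buf, 1, 1, 3, 4, np.float32)
```

Both calls return the same values; the small-read variant transfers fewer bytes per sample at the cost of a more finely targeted seek.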
Possible caveat: I suspect that this will not work in the nbits < 8 case if the product of nbits and nchans is not a whole number of bytes, i.e. not divisible by 8.
Testing this code on the I-LOFAR REALTA nodes (HDD-based), I see a roughly 60% reduction in execution time between readDedispersedBlock and the previous chained method for long reads with a relatively high DM:
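A guard for this caveat could look something like the sketch below. The helper name sample_is_byte_aligned is hypothetical, and the check only encodes the suspicion stated above: a time sample must occupy a whole number of bytes for per-sample byte offsets to be valid.

```python
def sample_is_byte_aligned(nbits, nchans):
    """Return True if one time sample occupies a whole number of bytes."""
    return (nbits * nchans) % 8 == 0

# 4-bit data with 3 channels is 12 bits per sample, which is not byte aligned.
aligned_odd = sample_is_byte_aligned(4, 3)
# 2-bit data with 4 channels is exactly one byte per sample.
aligned_even = sample_is_byte_aligned(2, 4)
```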
In [22]: %timeit data1 = reader.readDedispersedBlock(195349812, abs(195350012 - 195353724) + 200, 56.751)
23.1 s +- 706 ms per loop (mean +- std. dev. of 7 runs, 1 loop each)
In [23]: %timeit data2 = reader.readDedispersedBlock(195349812, abs(195350012 - 195353724) + 200, 56.751, smallReads = False)
23.5 s +- 101 ms per loop (mean +- std. dev. of 7 runs, 1 loop each)
In [24]: %timeit rawdata1 = reader.readBlock(195349812, delays[-1] + abs(195350012 - 195353724) + 200, 56.751); data3 = rawdata1.dedisperse
...: (56.751)[:, :abs(195350012 - 195353724) + 200]
1min 6s +- 14.5 s per loop (mean +- std. dev. of 7 runs, 1 loop each)
The data produced by all three approaches are identical:
In [30]: np.sum(np.logical_not(np.logical_and(data3 == data2, data3 == data1)))
Out[30]: FilterbankBlock(0)
I'll note that this is actually slower than another approach I'm preparing for a separate PR at the moment (readBlock + dedisperse -> only_valid_samples), though this function could still be useful for those working in memory-constrained environments.
By the way, cheers for the work in porting sigpyproc to Python 3.