Closed xxgt closed 4 months ago
It is unclear if this is a bug or a convention. The implementation in the original sigpyproc was a direct port of the implementation in sigproc (see https://github.com/SixByNine/sigproc/blob/master/src/readchunk.c).
There the 1-bit unpacker explicitly unpacks in little-endian order.
/* unpack the sample(s) if necessary */
switch (nbits) {
case 1:
fread(&c,1,1,input);
for (i=0;i<8;i++) {
f[i]=c&1;
c>>=1;
}
ns=8;
break;
The sigproc documentation states:
Multi-byte precision data are written in different orders depending on your machine’s operating system. The original WAPP data, for example, was written on a PC (little endian format). The filterbank program knows about this and automatically does any byte swapping required while reading. When it comes to writing the data out, however, the program will always write data in the native order of the processing machine. To swap the bytes around before writing for use on other machines, use the-swapout option.
The endianness of the data depends on the recording instrument and it should probably be more explicitly set when reading the files.
This is a convention so that we have the same bitorder for low-bit data. Looks like this need not be true.
So, the sigproc
codebase follow different bitorder convention for 1-bit data? There is no header keyword defined to infer the bitorder from the filterbank file. The -swapout
option only controls the writing convention.
I think using endianness as a parameter flag (bitorder
) for these read/write functions and having both kind of unpacking implementation can be a good strategy. The flag would then be irrelevant if nbits > 8
.
FilReader.read_block(), FilReader.read_plan(), Header.prep_outfile()
Ideally, it should have been a header flag.
Need to reverse/make changes in https://github.com/telegraphic/numbits/issues/6 and https://github.com/telegraphic/numbits/pull/7
bitorder
is more correct than endianness
here so it makes sense to use that terminology.
As a note, the previous discussion about unpack1_8
vs np.unpackbits
is no longer valid after PR #34. The previous performance discrepancy was due to the intermediate buffer allocation.
In [72]: %timeit kernels.unpack1_8(in_ar, out_ar)
563 µs ± 2.26 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
In [73]: %timeit np.unpackbits(in_ar)
646 µs ± 9.54 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Hi, I found that this repository unpacks 1-bit data in big endian order, however ewanbarr/sigpyproc reacts differently in this case: https://github.com/ewanbarr/sigpyproc/blob/54a804200723d30601026be5bfa37ec90c8266c1/c_src/libSigPyProc.c#L23-L27 I wonder if this is meant to be?