RfidResearchGroup / proxmark3

Compress raw LF data? #852

Open doegox opened 5 years ago

iceman1001 commented 5 years ago

The 40 kB BigBuf of LF signals (usually from lf search, lf read, data sample) would benefit from compression using the zlib we already have access to in armsrc. LF signals are highly repetitive :)

doegox commented 5 years ago

Quick test on saved samples converted to bin then compressed with zlib:

40000 -> 10036 em40xx without saturation
40000 ->  6270 em40xx with saturation
40000 ->  4524 no tag
40000 ->  8847 indala
40000 -> 10644 presco without saturation
40000 ->  3435 presco with saturation

Yep there is some potential :)
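
For reference, a minimal host-side sketch of this kind of one-shot test, assuming zlib's compressBound()/compress2() API and a raw 8-bit sample file; the 40000-byte buffer matches the figures above, and the file handling is illustrative only:

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <zlib.h>

int main(int argc, char *argv[]) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s <raw_samples.bin>\n", argv[0]);
        return 1;
    }

    /* read up to 40000 raw 8-bit samples */
    FILE *f = fopen(argv[1], "rb");
    if (f == NULL) return 1;
    uint8_t src[40000];
    size_t srcLen = fread(src, 1, sizeof(src), f);
    fclose(f);

    /* one-shot deflate at maximum compression */
    uLongf dstLen = compressBound(srcLen);
    uint8_t *dst = malloc(dstLen);
    if (compress2(dst, &dstLen, src, srcLen, Z_BEST_COMPRESSION) != Z_OK) {
        free(dst);
        return 1;
    }

    printf("%zu -> %lu\n", srcLen, (unsigned long)dstLen);
    free(dst);
    return 0;
}
```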

iceman1001 commented 5 years ago

The worst-case scenario is a noisy signal with high entropy, but luckily LF signals aren't that random. The cleaner the signal, the better the compression :)

iceman1001 commented 5 years ago

Even flash memory downloads will benefit.

There are some limits. We need to use a lossless compression mode, and it would be nice to have streaming compression so we can send partially compressed data, since we don't have much memory left if we are going to compress a 40 kB BigBuf before sending it. Unless we go via flashmem, but that takes longer, I suppose.
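
For illustration only, a rough sketch of what a streaming "compress as we go" loop could look like with zlib's z_stream API; send_chunk() is a hypothetical transport call, a small fixed output buffer stands in for the limited RAM, and zlib still needs its own working memory for the deflate state (the allocation concern raised below):

```c
#include <stdint.h>
#include <string.h>
#include <zlib.h>

#define CHUNK 512

extern void send_chunk(const uint8_t *data, uint32_t len);  /* hypothetical transport */

int stream_compress_and_send(const uint8_t *bigbuf, uint32_t len) {
    uint8_t out[CHUNK];
    z_stream zs;
    memset(&zs, 0, sizeof(zs));

    if (deflateInit(&zs, Z_BEST_SPEED) != Z_OK)
        return -1;

    zs.next_in = (Bytef *)bigbuf;
    zs.avail_in = len;

    /* keep deflating until all input is consumed and flushed */
    int ret;
    do {
        zs.next_out = out;
        zs.avail_out = CHUNK;
        ret = deflate(&zs, Z_FINISH);      /* lossless, nothing is dropped */
        uint32_t produced = CHUNK - zs.avail_out;
        if (produced)
            send_chunk(out, produced);     /* send partial compressed data */
    } while (ret == Z_OK);

    deflateEnd(&zs);
    return (ret == Z_STREAM_END) ? 0 : -1;
}
```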

iceman1001 commented 5 years ago

I added my attempt to this branch.

https://github.com/RfidResearchGroup/proxmark3/tree/bt_transfers

I hooked it up to

pm3-->analyse a

You will need to flash it with OpenOCD in order to program bank1 correctly.

  1. Flasher bug: can't write to bank1 correctly.
  2. Memory use on the device side: we shouldn't use BigBuf for the zlib allocation.
  3. Transfer as we go: compress, send, compress, send.

On the client side I think it's all OK: it collects all downloaded bytes and attempts to inflate them.
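
A minimal sketch of that client-side step, assuming zlib's one-shot uncompress() and the 40 kB maximum sample count; this is not the actual client code:

```c
#include <stdint.h>
#include <zlib.h>

#define MAX_SAMPLES 40000

/* inflate the collected bytes back into the raw sample buffer */
int inflate_samples(const uint8_t *received, uint32_t received_len,
                    uint8_t *samples, uint32_t *samples_len) {
    uLongf dstLen = MAX_SAMPLES;
    if (uncompress(samples, &dstLen, received, received_len) != Z_OK)
        return -1;
    *samples_len = (uint32_t)dstLen;
    return 0;
}
```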

doegox commented 5 years ago

Thanks, I'll be away for the next two weeks. Not sure if I'll take the risk of bricking it :D but I'll try to make progress on the other issues.

doegox commented 5 years ago

Sorry man, but bt_transfers doesn't contain your branch commits :( You can run gitk --all to have a graphical view of the situation and try to fix it.

iceman1001 commented 5 years ago

I will have a look at it tonight when I am home again

iceman1001 commented 5 years ago

The idea we discussed in the car:

Mixed mode: take a 512-byte buffer, finish the compression, and maybe fill it up with 512-n bytes of uncompressed data to get some advantage.

Full compress mode: every 512-byte chunk is compressed and sent, maybe not filling the full 512 bytes, and unpacked as we go on the client side each time. (A rough sketch of the latter follows below.)
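
Roughly, the "full compress mode" could look like the following sketch, where each 512-byte chunk is compressed independently and framed with a flag byte; send_frame() and the frame layout are hypothetical, and the mixed-mode padding variant is not shown:

```c
#include <stdint.h>
#include <zlib.h>

#define CHUNK 512

/* hypothetical transport call: flags = 1 means compressed payload */
extern void send_frame(uint8_t flags, const uint8_t *payload, uint16_t len);

void send_buffer_chunked(const uint8_t *buf, uint32_t len) {
    uint8_t out[CHUNK];

    for (uint32_t off = 0; off < len; off += CHUNK) {
        uint32_t n = (len - off < CHUNK) ? (len - off) : CHUNK;

        /* compress this chunk on its own */
        uLongf outLen = sizeof(out);
        int rc = compress2(out, &outLen, buf + off, n, Z_BEST_SPEED);

        if (rc == Z_OK && outLen < n)
            send_frame(1, out, (uint16_t)outLen);   /* compressed chunk */
        else
            send_frame(0, buf + off, (uint16_t)n);  /* raw fallback, chunk did not shrink */
    }
}
```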

iceman1001 commented 5 years ago

@slurdge ....

slurdge commented 5 years ago

I'm alive! So, regarding LF data, we can compress it at 2(3) locations:

In my test, I did the transparent compression at the comm level, but I didn't realize that was the old packet format. It would be better to move to the new packet format, which would either include another magic preamble or an enum flag for data filters (RLE, LZ4, etc.).

However:

Right now, LZ4 compression for the FPGA is working well. How would you like to implement comm compression? If we compress at the capture level, the latency is the same but the effective bandwidth is higher.

iceman1001 commented 5 years ago

Yay! He is alive!

zlib: I added it on the device side in the branch "bt_transfers": https://github.com/RfidResearchGroup/proxmark3/commit/ffff5574e31c004074e299827cddfa333110db41

Are we going for LZ4 or zlib? That is the question. It was faster with LZ4 on the device side, if I remember your posts correctly. I doubt we can compress at sampling time, since we need the correct data samples... hence it's easier to compress after sampling is done, or when offloading from the device.
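
For comparison, a rough sketch of what the LZ4 option could look like for that same post-sampling step, using the standard lz4.h one-shot API; the function and buffer names are placeholders, not actual armsrc code:

```c
#include <stdint.h>
#include <lz4.h>

/* returns the compressed size in bytes, or 0 on failure */
int compress_samples_lz4(const uint8_t *samples, int sample_len,
                         uint8_t *out, int out_capacity) {
    /* LZ4_compressBound() gives the worst-case output size for sample_len */
    if (out_capacity < LZ4_compressBound(sample_len))
        return 0;
    return LZ4_compress_default((const char *)samples, (char *)out,
                                sample_len, out_capacity);
}
```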

iceman1001 commented 4 years ago

Maybe we don't need LZ4 or zlib for host-device comms. A simple byte compression will do.
1-byte token + 1-byte length = 2 bytes. For example, 0x11, 0x20 = 32 bytes of 0x11, so that run becomes two bytes (max 256 bytes per run). Or 1-byte token + 2-byte length = 3 bytes, which would allow us to compress a whole 512-xxx byte run of 0xFF down to three bytes: 0xFF, 0xFF, 0x1.

We don't need a fancy compression lib doing bit-level tokens with Huffman codes.
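
A minimal sketch of the byte-level RLE described above: every run of identical bytes becomes a (value, length) pair, so 32 bytes of 0x11 encode as 0x11 0x20. Runs are capped at 255 here; whether a length of 0 should mean 256, or a 2-byte length should be used instead, is left open in the proposal:

```c
#include <stdint.h>

/* returns the number of bytes written to dst (dst must hold up to 2*len bytes) */
uint32_t rle_encode(const uint8_t *src, uint32_t len, uint8_t *dst) {
    uint32_t w = 0;
    for (uint32_t i = 0; i < len; ) {
        uint8_t value = src[i];
        uint32_t run = 1;
        while (i + run < len && src[i + run] == value && run < 255)
            run++;
        dst[w++] = value;          /* 1-byte token */
        dst[w++] = (uint8_t)run;   /* 1-byte length */
        i += run;
    }
    return w;
}
```

As the next reply points out, input with no runs expands to twice its size under this scheme, so it only pays off for strongly clipped signals.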

slurdge commented 4 years ago

You mean just RLE? It will not compress much... We can compress at sampling time and correct it afterwards (on the host), or just after correction. If we compress it inline, we will get the maximum buffer size for sending over BT or USB.

doegox commented 4 years ago

Yeah, only very saturated signals will benefit from such a basic strategy.

iceman1001 commented 4 years ago

Usually we use the nomenclature strong/clipped signals. This type of signal will benefit from compression, especially ASK/NRZ types; dunno about PSK/FSK. Noisy signals will not benefit from simple algorithms, maybe from a compression lib.

Well, once we have compression we can start seeing its effect on different types of signals.

Uploading a 1K/4K dump to the device should also benefit.

iceman1001 commented 4 years ago

I just don't want this to be overcomplicated. This function is important since the LF user experience over Blueshark is awful.