animetosho / node-yencode

SIMD accelerated yEnc encoder/decoder and CRC32 calculator for node.js

yEnc SSE decode ideas #4

Closed Safihre closed 7 years ago

Safihre commented 7 years ago

I was amazed to find this repo. I have been thinking of some way to do yEnc decoding (as a Python C extension) using SSE instructions, but my knowledge of C is just too rudimentary for now.

Do you think SSE can help compared to regular char-by-char decoding of the yEnc body? How would you go about the decoding-escaping problem? I can imagine finding the escape chars, but how do you remove them later on when building the output string? I tried to grasp your encoding code, but I think I am probably missing the main idea because of the included edge cases and optimizations.
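For reference, the char-by-char baseline mentioned above can be sketched roughly as follows. This is a minimal illustration, not code from node-yencode: the function name `yenc_decode_scalar` is hypothetical, and it only handles the body (no `=ybegin`/`=yend` lines, no NNTP dot unstuffing). It relies on the yEnc convention that each byte is offset by 42 and that an escaped byte follows `=` with an extra offset of 64.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of node-yencode: decodes a yEnc body
 * byte by byte. Returns the number of bytes written to dst, which
 * must have room for at least len bytes. */
size_t yenc_decode_scalar(const uint8_t *src, size_t len, uint8_t *dst)
{
    size_t out = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t c = src[i];
        if (c == '\r' || c == '\n')
            continue;                    /* line breaks carry no data */
        if (c == '=') {
            if (++i >= len)
                break;                   /* dangling escape at end of input */
            c = (uint8_t)(src[i] - 64);  /* undo the escape offset */
        }
        dst[out++] = (uint8_t)(c - 42);  /* undo the yEnc offset */
    }
    return out;
}
```

The data-dependent branch on `=` and the variable output position are what make this loop awkward to vectorize directly.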

Thanks!

EDIT: I think I am understanding more and more of the code and how you handle the encoding-escaping here: https://github.com/animetosho/node-yencode/blob/master/yencode.cc#L718-L752 I don't completely understand the shuffle operations just yet, or how they handle the extra chars. What are shufMixLUT and shufLUT?
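To make the question concrete, here is one way a mask-indexed shuffle table could work for the decode direction (squeezing escape chars out of a block). This is an assumption for illustration only, not a description of what shufLUT or shufMixLUT actually contain: the names `compact_lut`, `keep_count`, `init_compact_lut`, `shuffle8` and `drop_escapes8` are all hypothetical, and the shuffle is emulated in portable C rather than with a real PSHUFB. The byte following each `=` would still need its extra offset of 64 removed, which is left out here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table, not taken from node-yencode: for each 8-bit mask
 * (bit i set = drop input byte i), compact_lut[mask] lists the indices
 * of the bytes to keep, packed to the front. Unused slots hold 0x80,
 * which PSHUFB would turn into a zero byte. */
static uint8_t compact_lut[256][8];
static uint8_t keep_count[256];

void init_compact_lut(void)
{
    for (int mask = 0; mask < 256; mask++) {
        int n = 0;
        for (int i = 0; i < 8; i++)
            if (!(mask & (1 << i)))
                compact_lut[mask][n++] = (uint8_t)i;
        keep_count[mask] = (uint8_t)n;
        while (n < 8)
            compact_lut[mask][n++] = 0x80;
    }
}

/* Scalar stand-in for one PSHUFB on an 8-byte block. */
static void shuffle8(const uint8_t *in, const uint8_t *idx, uint8_t *out)
{
    for (int i = 0; i < 8; i++)
        out[i] = (idx[i] & 0x80) ? 0 : in[idx[i] & 7];
}

/* Removes every '=' from an 8-byte block and returns how many bytes
 * were kept. out must have room for 8 bytes. */
size_t drop_escapes8(const uint8_t *in, uint8_t *out)
{
    unsigned mask = 0;
    for (int i = 0; i < 8; i++)   /* a compare plus movemask in SSE */
        if (in[i] == '=')
            mask |= 1u << i;
    shuffle8(in, compact_lut[mask], out);
    return keep_count[mask];
}
```

The idea is that the comparison result becomes a small integer, and that integer selects a precomputed byte order, so no per-byte branching is needed when building the output. An encoder would presumably need the opposite kind of table, spreading bytes apart to make room for the inserted escape chars.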

hugbug commented 7 years ago

Thanks so much! I'll integrate the new version and report back.


In the meantime I've done more tests, in particular on a 2015 Dell notebook running Linux. The numbers are crazy high (MB/s):

| Improvement | MacBook, macOS, i5-520 | Dell, Windows, i7‑5600U | Dell, Linux, i7‑5600U | PVR, Linux, ARMv7 | NEO2, Linux, ARMv8 |
|---|---|---|---|---|---|
| improved decoder, scalar crc | 305 | 389 | 480 | 89 | 102 |
| raw decoder, scalar crc | 369 | 414 | 636 | 93 | 107 |
| simd decoder, scalar crc | 467 | 493 | 836 | 99 | 121 |
| simd decoder, simd crc | 520 | 541 | 1011 | n/a | 136 |

For a description of the devices, test conditions and more results (not related to SIMD), please see the original post.

hugbug commented 7 years ago

Results for one-pass simd decoder with end-of-stream detection (simd-end):

| Improvement | MacBook, macOS, i5-520 | Dell, Windows, i7‑5600U | Dell, Linux, i7‑5600U | PVR, Linux, ARMv7 | NEO2, Linux, ARMv8 |
|---|---|---|---|---|---|
| simd decoder | 520 | 541 | 1011 | 99 | 136 |
| simd-end decoder | 520 | 570 | 1140 | 106 | 157 |

Safihre commented 7 years ago

Cool to see that, within 1 month of this issue being created, there is now a working implementation in NZBGet. So I would say this issue has served its purpose, and in case I have specific implementation questions for SABnzbd I will open another topic! Thanks all!

animetosho commented 7 years ago

It has been interesting - thanks for creating the topic!

Are you planning to migrate to Python 3 before using this decoder? I imagine that SABYenc could be changed to use it as is, but Python 3's API would probably be different, if that's the goal.