Open animetosho opened 6 years ago
Thank you for showing interest in this project. I didn't expect anyone else to have any interest in such an obscure thing as a yEnc implementation in pure assembly.
I have tried using PSHUFB, but the complexity was too much for me. I'm just a newbie hobbyist and this project is already far above my skill level. I'm afraid you can't expect much from me. While this project isn't dead, progress is and will continue to be very slow.
Thanks for the response. Yeah, yEnc is a niche encoding, and the fact that it's been around for so long and no-one else seems to have attempted it suggests not much interest. The naive C implementation isn't too bad either. Interestingly enough, though, there has been a little more interest in it recently.
NZBGet currently uses the SIMD decoder I wrote up, and Sabnzbd's developers seem to be interested. This seems to have come from a push to optimize the downloaders for those on faster connections using weaker CPUs, and it seems to have made an impact. So I think it does have its uses; perhaps, rather than there being a lack of interest, it was simply never really considered before?
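For anyone following along, here is a minimal sketch of what a naive C encoder looks like, as a baseline for the SIMD discussion. It assumes only the basic yEnc rules (add 42 to each byte; escape NUL, LF, CR and `=` by emitting `=` followed by the byte plus 64) and deliberately leaves out line wrapping and the extra line-start/line-end escapes some encoders apply. The function names are made up for illustration and are not taken from either project.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns nonzero if an already-offset byte must be escaped.
 * Assumption: only the four basic critical characters are handled. */
static int yenc_is_critical(uint8_t c) {
    return c == 0x00 || c == '\n' || c == '\r' || c == '=';
}

/* Encodes n bytes from src into dst and returns the number of bytes written.
 * dst must have room for 2 * n bytes (worst case: every byte is escaped).
 * No line wrapping is performed in this sketch. */
static size_t yenc_encode_naive(const uint8_t *src, size_t n, uint8_t *dst) {
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        uint8_t c = (uint8_t)(src[i] + 42);   /* offset, wraps mod 256 */
        if (yenc_is_critical(c)) {
            dst[out++] = '=';                 /* escape marker */
            c = (uint8_t)(c + 64);            /* shift the escaped byte */
        }
        dst[out++] = c;
    }
    return out;
}
```

The data-dependent branch and the variable output position per byte are what make this awkward to vectorise.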
Implementing yEnc in SIMD is nonetheless an interesting challenge (at least I found it so), so even if it doesn't prove that useful, hopefully you've gained something yourself.
Although you seem to be quite capable yourself, feel free to take any ideas from my implementation if you think it'd help. The code isn't really great, so I'm happy to explain anything if you need it.
I think anyone who attempts a SIMD yEnc en/decoder definitely at least knows a fair bit - it's not a trivial undertaking after all. As such, I wouldn't be surprised if you come up with interesting ideas; for one, I had never considered using PEXT to compress bytes.
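To illustrate the general idea of compressing bytes with PEXT, here is a rough, portable sketch. It is not the code from this project: `pext64_model` is a scalar stand-in for the BMI2 instruction (exposed as `_pext_u64` in compilers that support it), and `drop_escape_bytes8` is a hypothetical helper that removes every `=` from an 8-byte group. A real decoder would also have to undo the +64 and +42 offsets and handle an escape that straddles two groups.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of PEXT: gathers the bits of src selected by mask
 * and packs them contiguously into the low bits of the result. */
static uint64_t pext64_model(uint64_t src, uint64_t mask) {
    uint64_t res = 0;
    int k = 0;                                 /* next free bit in res */
    for (int i = 0; i < 64; i++) {
        if ((mask >> i) & 1) {
            res |= ((src >> i) & 1) << k;
            k++;
        }
    }
    return res;
}

/* Removes '=' bytes from an 8-byte group, packing the remaining bytes
 * to the front of dst (unused tail bytes are zero). Returns the number
 * of bytes kept. Assumption: every '=' is an escape marker. */
static size_t drop_escape_bytes8(const uint8_t src[8], uint8_t dst[8]) {
    uint64_t word = 0, mask = 0;
    size_t kept = 0;
    for (int i = 0; i < 8; i++) {
        word |= (uint64_t)src[i] << (8 * i);   /* little-endian load */
        if (src[i] != '=') {
            mask |= (uint64_t)0xFF << (8 * i); /* keep all 8 bits of this byte */
            kept++;
        }
    }
    uint64_t packed = pext64_model(word, mask);
    for (int i = 0; i < 8; i++)
        dst[i] = (uint8_t)(packed >> (8 * i)); /* little-endian store */
    return kept;
}
```

Because the mask is built from whole `0xFF` bytes, a bit-level extract ends up behaving as a byte-level compaction.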
I happened to stumble across this project, and found it particularly interesting. This is because I've implemented my own SIMD yEnc en/decoder, thinking that no-one had tried doing so before, hence it's interesting to see someone else's attempt at the problem.
I'm not sure if you're still interested in improving this, but if so, I'm interested to see where you end up with this. My own implementation is written in C/C++ using intrinsics, and only operates using 128-bit SIMD, so I'm interested to see if there are potential speedups that can be obtained via a hand-crafted assembly implementation, possibly using wider vectors.
If it's of any help, my implementation can be found here, or more specifically, encoder and decoder.
I'm not sure if you've considered it, but I've found the PSHUFB instruction useful for getting the escaped bytes into the correct position. Here's a rough diagram of how my yEnc encoder uses it to shift bytes appropriately. Pointing this out in case you haven't considered it, as it may be worthwhile exploring.
Thanks for the project!
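For readers without the diagram, the PSHUFB idea can be sketched roughly as follows. This is a scalar model written for illustration, not the actual encoder: `pshufb_model` mimics what PSHUFB (`_mm_shuffle_epi8`) does to one 16-byte register, and `yenc_encode8_shuffle` is a hypothetical helper that expands 8 input bytes into up to 16 output bytes. The control vector is built with a loop here; a real SIMD encoder would presumably fetch it from a table indexed by the 8-bit escape mask.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of PSHUFB on one 16-byte register: each output byte is
 * in[ctl & 15], or 0 when the control byte has its top bit set. */
static void pshufb_model(const uint8_t in[16], const uint8_t ctl[16],
                         uint8_t out[16]) {
    for (int i = 0; i < 16; i++)
        out[i] = (ctl[i] & 0x80) ? 0 : in[ctl[i] & 0x0F];
}

/* Encodes 8 raw bytes into dst (up to 16 bytes) and returns the length.
 * Assumption: only NUL, LF, CR and '=' are escaped; no line wrapping. */
static size_t yenc_encode8_shuffle(const uint8_t src[8], uint8_t dst[16]) {
    uint8_t reg[16] = {0};
    uint8_t ctl[16], add[16];
    size_t len = 0;

    /* Step 1: offset every byte by 42 (a single byte-wise add in SIMD). */
    for (int i = 0; i < 8; i++)
        reg[i] = (uint8_t)(src[i] + 42);

    /* Step 2: build the shuffle control and the fix-up vector. */
    for (int i = 0; i < 16; i++) { ctl[i] = 0x80; add[i] = 0; }
    for (int i = 0; i < 8; i++) {
        uint8_t c = reg[i];
        if (c == 0x00 || c == '\n' || c == '\r' || c == '=') {
            add[len++] = '=';      /* slot zeroed by the shuffle, becomes '=' */
            add[len] = 64;         /* the escaped byte itself gets +64 */
        }
        ctl[len++] = (uint8_t)i;   /* source index for this output slot */
    }

    /* Step 3: one shuffle moves every byte into place, then one
     * byte-wise add inserts the '=' markers and applies the +64. */
    pshufb_model(reg, ctl, dst);
    for (int i = 0; i < 16; i++)
        dst[i] = (uint8_t)(dst[i] + add[i]);
    return len;
}
```

The point of the design is that the variable-length expansion is reduced to a table lookup, a shuffle and an add, with no per-byte branching in the hot loop.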