Open dgladkov opened 8 years ago
Great comparison, thanks!
I have yet another humble suggestion - is it possible to submit something like bitwise_xor_cyclic
to numpy, and then add here? That would be probably make it work even faster?
There's also this large battle for XOR at SO. I can't estimate how hard is it to include & maintain such code in our library and/or if it's worth that.
A lot of answers suggested in that SO post are impractical (writing manual SSE2 instructions, depending on lots of third party libraries), but the general idea is that the only way to gain meaningful performance improvement is to write native code. The best solutions for Python are cffi and Cython. cffi is a better solution for small functions, while Cython is better when you need complex C<->Python interop, so I picked cffi for my test. My current C implementation looks like this (I know only basic C so this might be suboptimal):
void xor_one(const char *data, const unsigned char key, char *result)
{
unsigned long data_len = strlen(data);
unsigned long idx;
for (idx = 0; idx < data_len; idx++)
{
result[idx] = data[idx] ^ key;
}
}
void xor_many(const char *data, const char *key, char *result)
{
unsigned long data_len = strlen(data);
unsigned long key_len = strlen(key);
unsigned long idx;
for (idx = 0; idx < data_len; idx++)
{
result[idx] = data[idx] ^ key[idx % key_len];
}
}
Benchmark results:
Python 2
== run_benchmark_process_xor.py
max = 2144703287
26.3223340511
0.44394993782
Python 3
== run_benchmark_process_xor.py
max = 2144703287
23.45923235600094
0.4029908089996752
Pros:
Cons:
Submitting bitwise_xor_cyclic
to numpy doesn't look realistic, their bitwise_xor is is just one of their supported array operations, so it requires arrays of the same size in any case.
In my opinion performance improvements of cffi code are not worth the build complexity it adds.
Adding Cython as either dependency or runtime is being debated in https://github.com/kaitai-io/kaitai_struct/issues/311 . If it gets added, reimplementing xor will be just a few-liner.
EDIT: Cython runtime was withdrawn.
Followup of discussion in 04289cc13d6501153cab61e875d9c4e2ccdaa5d8
Python 2 bytearray mutations
Python 3 bytearray mutations
Python 3 iterating bytes
Python 2 numpy bitwise_xor
Python 3 numpy bitwise_xor
Conclusions:
As numpy contains C extensions and may require compilation, I suggest adding numpy as an optional dependency and falling back to native Python bytearray mutations + xor if numpy is not installed.