Open kurtbrose opened 6 years ago
Hey @kurtbrose, thanks for all the suggestions! This was my first and only time using Cython or writing a Python extension, so I am absolutely sure there is a lot to improve, and I would love to hear any advice.
Let's go through each point:
Regarding the PR: yes, that would be awesome! @mahmoud suggested changes very similar to yours, so we could see if we can somehow combine your knowledge/ideas :D. Maybe the best way is for you to make a PR and then we also include him.
@kurtbrose how is this going? Don't give up :D! It would be nice to have the Python library improved, so let me know if you need any help or get stuck on anything.
There seems to be some low-hanging fruit in enhancing the Cython/Python API.
1- calling encode() to convert unicode to bytes; encode('utf-8') is safer
2- accept bytes and pass them through
3- switch from malloc to PyMem_Malloc, so that the Python VM can do a better job of utilizing allocation pools (http://cython.readthedocs.io/en/latest/src/tutorial/memory_allocation.html -- Cython docs showing how and explaining this is preferred) -- maybe there is some reason this can't be done because the C++ code is using raw free on allocated memory?
4- Regarding this:
I think it can be simplified to this:
Cython can see the type of the field is char and will automatically convert bytes. You can do <char> as well to force the conversion. Again maybe you tried that and ran into some issue?
5- release the GIL
Since this can be quite slow (10ms+), it is worth releasing the GIL so that this code can be multi-threaded from Python.
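To make points 1 and 2 concrete, here is a rough pure-Python sketch of the input handling I have in mind. The helper name to_bytes is hypothetical and not something the library currently has; it only illustrates encoding text explicitly as UTF-8 while passing bytes through untouched, before anything is handed to the C++ side.

```python
def to_bytes(value, encoding="utf-8"):
    """Return `value` as bytes (hypothetical helper, not part of the library).

    bytes are passed through unchanged; str is encoded with an explicit
    encoding instead of relying on the interpreter default.
    """
    if isinstance(value, bytes):
        return value  # point 2: accept bytes and pass them through
    if isinstance(value, str):
        return value.encode(encoding)  # point 1: explicit encoding
    raise TypeError(
        "expected str or bytes, got {}".format(type(value).__name__)
    )
```

On Python 3, str.encode() already defaults to UTF-8, so the explicit argument mostly matters on Python 2 (where the default is ASCII) and for readability.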