AlfredoSequeida / fvid

fvid is a project that aims to encode any file as a video using 1-bit color images to survive compression algorithms for data retrieval.
MIT License
354 stars 43 forks

Proposal: Cython Support #16

Closed Theelx closed 4 years ago

Theelx commented 4 years ago

I believe the remaining optimizations for the lengthy decode process (short of a refactor that processes multiple pixels at a time) are severely limited now, and decoding is only about twice as fast as it originally was. So I am proposing that I make an optional Cython extension. Cython is a superset of Python that compiles to C, which in turn compiles to machine code. If you want to see what it looks like and how fast it is, check out the pybase122.pyx file in my pybase122 decoder in my starred repos. I included benchmarks for the best Python version I could write and the best Cython version I could write (with some help). You'll notice that the Cython version I wrote ended up being at least 10x faster than the Python version, and the version that I got help with is over 20x faster. This is a huge speedup. fvid won't see as large a speedup because it uses fewer math/bit operations, but I bet I can get at least a 4x speedup for the section inside the two for loops in the get_file_from_image (I think that's the name) function.
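For readers unfamiliar with this kind of hot loop, here is a minimal sketch of a per-pixel decode that uses two nested loops. It is not fvid's actual code: the function names `bits_from_pixels` and `bytes_from_bits`, the 128 threshold, and the MSB-first bit order are all assumptions made for illustration. Pure-Python loops like this pay interpreter overhead on every pixel, which is the kind of work Cython tends to speed up.

```python
def bits_from_pixels(pixels, threshold=128):
    """Collect one bit per pixel using two nested loops (rows, then columns).

    `pixels` is a list of rows of grayscale values (0-255). The threshold
    and the function name are hypothetical, not taken from fvid.
    """
    bits = []
    for row in pixels:
        for value in row:
            bits.append(1 if value >= threshold else 0)
    return bits


def bytes_from_bits(bits):
    """Pack bits into bytes, most significant bit first.

    A trailing group of fewer than 8 bits is dropped (an assumption).
    """
    out = bytearray()
    usable = len(bits) - len(bits) % 8
    for i in range(0, usable, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)
```

For example, a 2x4 image whose bright pixels spell out the bit pattern 01000001 would decode to the single byte 0x41.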

The main obstacle to including Cython is that the extension needs to be recompiled after every change, and you need to pip install Cython to compile it. Luckily, it is possible to fall back to a pure-Python version of the code if a user doesn't want to install Cython or compile the extension. This is why I'm proposing that I only Cythonize the slow part inside the loop and add a check for whether the compiled Cython code is available on the user's system before trying to run it.
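The fallback described above could look roughly like the sketch below. The module name `fvid_cython_decode` and the function name `decode_bits` are hypothetical placeholders, not names from the fvid repository. The idea is only that a failed import of the compiled extension selects the pure-Python implementation instead.

```python
import importlib


def decode_bits_python(values, threshold=128):
    """Pure-Python fallback: turn grayscale values into bits (hypothetical)."""
    return [1 if v >= threshold else 0 for v in values]


def load_decoder(module_name="fvid_cython_decode"):
    """Return (decode_function, backend_name).

    Tries to import a compiled extension module first. The module name and
    its `decode_bits` attribute are assumptions for illustration only.
    """
    try:
        module = importlib.import_module(module_name)
        return module.decode_bits, "cython"
    except (ImportError, AttributeError):
        # Extension not built or not installed: use the Python version.
        return decode_bits_python, "python"


decode_bits, backend = load_decoder()
```

With this pattern, users who never install Cython still get a working (slower) decoder, and the rest of the code calls `decode_bits` without caring which backend was chosen.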

For an example of a Cython library, google "pomegranate Cython". Pomegranate is a machine learning library that is blazingly fast because it uses Cython to compile its algorithms. It's also open-source, on GitHub, so you can check the internals.

I'm marking this as a proposal because I don't currently want to do the work if you're not comfortable with it.

Theelx commented 4 years ago

Update: I did the work because I have extra time, and I made a Cython version that decodes about 5-6x as fast as the current Python. I'll make a PR if you say you'll consider it.

AlfredoSequeida commented 4 years ago

@Theelgirl Absolutely! Make the PR and I will take a look at it!

Theelx commented 4 years ago

Ok, making it now. Edit: Made the PR; it's ready for compatibility testing. I haven't tested whether the pip install compiles properly, only with a custom compiler, so let's hope it does.

Theelx commented 4 years ago

It benchmarks at about 4x faster than Python on the Lenna image. However, fvid fails to decode the resulting mp4 with either version, producing an unusable file.bin. Is this a known bug? I did a dumb, the bug is the only other open issue lmao

AlfredoSequeida commented 4 years ago

@Theelgirl So to clarify, did decoding work?

Theelx commented 4 years ago

@AlfredoSequeida Yes, using dobro's version fixed it. The problem wasn't with Cython, since it also happened on the normal version; it was with something else.

Theelx commented 4 years ago

Closing because the PR has been ready for a while.