ali1234 / vhs-teletext

Software to recover teletext data from VHS recordings.
GNU General Public License v3.0

fix, Profile, Vectorise, tune #79

Closed penguin42 closed 1 year ago

penguin42 commented 1 year ago

This is built on top of the other pull requests. There are two main performance improvements here, which get me from about 200 lps to about 1000 lps on streams with a lot of 8-bit data.

First, I change the split between the two min stages from 256 to 1024 for the big pattern cases; that gets a factor of 2 on the big cases. I only do it in the big cases because it makes the 8k case work. This is a heuristic and may need tuning for other hardware, based on OpenCL config data.
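
To illustrate what the split controls, here is a minimal sketch of a two-stage min in plain Python. It is not the kernel code from this PR: the function name `two_stage_min`, the `scores` data, and the reading of "split" as the number of candidates each first-stage group reduces are all assumptions made for the example.

```python
def two_stage_min(scores, split):
    """Return (index, value) of the smallest score via a two-stage reduction.

    `scores` must be non-empty. `split` is the (assumed) number of
    candidates handled by each first-stage group.
    """
    # Stage 1: every chunk of `split` scores produces one local winner.
    partial = []
    for start in range(0, len(scores), split):
        chunk = scores[start:start + split]
        local = min(range(len(chunk)), key=chunk.__getitem__)
        partial.append((chunk[local], start + local))
    # Stage 2: reduce the local winners to the global one.
    value, index = min(partial)
    return index, value


# Hypothetical data: 8192 match scores with a single best match at 5000.
scores = [(i * 7919) % 10007 + 1 for i in range(8192)]
scores[5000] = 0

for split in (256, 1024):
    index, value = two_stage_min(scores, split)
    stage2_inputs = -(-len(scores) // split)  # ceiling division
    print(split, stage2_inputs, index, value)
```

The answer is the same for either split; what changes is how the work is divided. With 8192 candidates, a split of 256 leaves 32 values for the second stage, while 1024 leaves 8. Which division is faster on a real device depends on the hardware, which is why the value is a tuning knob.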

Secondly, I vectorise all the kernels (more of them than in my first attempt); this gets the promised 4x improvement that vectorising should give, judging by the time taken to run a 65k line.
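
As a rough picture of what vectorising means here, the sketch below emulates it in plain Python: one loop handles a single element per iteration, the other handles a group of four, the way an OpenCL kernel using a 4-wide vector type would. The squared-difference score and both function names are hypothetical and are not taken from the PR's kernels; Python itself gains no speed from this, so the example only shows the change in work layout.

```python
def scalar_sq_diff(line, pattern):
    # One "work item" per element.
    total = 0
    for a, b in zip(line, pattern):
        d = a - b
        total += d * d
    return total


def vec4_sq_diff(line, pattern):
    # One "work item" per group of four elements (emulating a 4-wide vector).
    if len(line) != len(pattern) or len(line) % 4 != 0:
        raise ValueError("inputs must have equal length, a multiple of 4")
    total = 0
    for i in range(0, len(line), 4):
        d = [line[i + k] - pattern[i + k] for k in range(4)]
        total += d[0] * d[0] + d[1] * d[1] + d[2] * d[2] + d[3] * d[3]
    return total


# Hypothetical 8-element sample and pattern.
line = [10, 20, 30, 40, 50, 60, 70, 80]
pattern = [12, 18, 30, 44, 50, 59, 73, 80]

print(scalar_sq_diff(line, pattern), vec4_sq_diff(line, pattern))
```

Both functions return the same score; the 4-wide version simply needs a quarter of the iterations, which is where the ideal 4x comes from on hardware that executes the four lanes together.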

I also include the flags to turn on OpenCL profiling, which is what let me figure some of this out.