Closed: yumetodo closed this 7 years ago
Sorry, nope. The change is too big for such a marginal improvement. If there is a 10 ms difference on 720p, then I'll consider it.
Also you need to turn off the /Qpar option in the project file, otherwise the results cannot be compared.
show me the new benchmark plz
@MaverickTse please wait. I'm working on #10 to investigate this.
I'd advise against working on this any further. I have checked on my computer that /Qpar works correctly (using 12 out of my 13 cpu cores), and I now use a simple float[4] array on the stack, which poses no problem for row-wise parallelization. The use of std::thread really makes the code a lot more complex, and I have no confidence that I can maintain it.
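For readers following the trade-off discussed here, this is a minimal sketch of what manual row-wise parallelization with std::thread tends to look like. It is not the code from this PR: `scale_pixel` and `process_rows_parallel` are hypothetical names, and the per-pixel operation is a placeholder. Each worker keeps its own float[4] scratch buffer on the stack, so nothing is shared between threads except the image rows they exclusively own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical per-pixel operation standing in for the real filter.
inline float scale_pixel(float v) { return v * 0.5f; }

// Apply scale_pixel to a width*height image in place, splitting the rows
// into contiguous bands with one worker thread per band.
void process_rows_parallel(std::vector<float>& img, std::size_t width,
                           std::size_t height, unsigned n_threads) {
    assert(img.size() == width * height);
    n_threads = std::max(1u, n_threads);
    const std::size_t rows_per_thread = (height + n_threads - 1) / n_threads;

    std::vector<std::thread> workers;
    for (unsigned t = 0; t < n_threads; ++t) {
        const std::size_t y0 = std::min(height, t * rows_per_thread);
        const std::size_t y1 = std::min(height, y0 + rows_per_thread);
        if (y0 == y1) break;  // no rows left for this worker
        workers.emplace_back([&img, width, y0, y1] {
            for (std::size_t y = y0; y < y1; ++y) {
                float* row = img.data() + y * width;
                std::size_t x = 0;
                for (; x + 4 <= width; x += 4) {
                    float buf[4];  // per-thread scratch buffer on the stack
                    for (int i = 0; i < 4; ++i) buf[i] = scale_pixel(row[x + i]);
                    for (int i = 0; i < 4; ++i) row[x + i] = buf[i];
                }
                for (; x < width; ++x) row[x] = scale_pixel(row[x]);  // tail
            }
        });
    }
    for (auto& w : workers) w.join();
}
```

The band bookkeeping and explicit joins are the kind of extra code that an auto-parallelized loop under /Qpar leaves to the compiler.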
| | before[ns] | after[ns] |
|---|---|---|
| max | 58851592 | 122142375 |
| min | 15095044 | 17856938 |
| avg. | 40697668.4706525 | 39822508.8662606 |
| count | 3663 | 3776 |
| stdev | 5228744.19346652 | 8064316.19743504 |
| se | 86393.0757542369 | 131235.567943636 |
| 95%CL | 169327.316991945 | 257216.986660185 |
| C.L.max | 40866995.7876444 | 40079725.8529208 |
| C.L.min | 40528341.1536605 | 39565291.8796004 |
very small speed-up...
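The se and 95%CL rows in the table are consistent with se = stdev / sqrt(count) and a half-width of about 1.96 × se. That is an inference from the posted numbers, not something stated in the thread. A small sketch under that assumption, with hypothetical names `Interval`, `mean_ci95` and `overlaps`:

```cpp
#include <cassert>
#include <cmath>

// Closed interval [lo, hi] around a sample mean.
struct Interval {
    double lo;
    double hi;
};

// 95% confidence interval of the mean under a normal approximation:
// se = stdev / sqrt(count), half-width = z * se with z ~ 1.96.
Interval mean_ci95(double mean, double stdev, double count) {
    assert(count > 0);
    const double z = 1.959964;
    const double se = stdev / std::sqrt(count);
    return {mean - z * se, mean + z * se};
}

// True when the two intervals share at least one point.
bool overlaps(const Interval& a, const Interval& b) {
    return a.lo <= b.hi && b.lo <= a.hi;
}
```

Fed the avg., stdev and count rows, this should reproduce the C.L.max and C.L.min rows to within rounding. By that measure the two intervals do not overlap, although the means differ by less than 1 ms (roughly 2%).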
The averages differ by only 1 ms while the stdev is a LOT wider... so nope. I have also tried the Concurrency library available in VS2015 (ppl.h), also to no avail. /Qpar is still the most stable and the fastest.
If you want to keep playing with the multi-threading issue, try to parallelize the LUT lookup. So far I have failed to do this, and the processing time increases with the number of channels being processed.
Note that I'm not going to merge unless there is enough improvement in speed.
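One possible starting point for the LUT suggestion is sketched below. It assumes an 8-bit, 256-entry table and a flat pixel buffer purely for illustration; the plugin's real pixel format and table size are not shown in this thread, and `apply_lut_parallel` is a hypothetical name. The table is shared read-only and each worker rewrites a disjoint slice of the buffer, so no locking is needed.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Replace every pixel by lut[pixel], splitting the buffer into
// contiguous slices with one worker thread per slice.
void apply_lut_parallel(std::vector<std::uint8_t>& pixels,
                        const std::array<std::uint8_t, 256>& lut,
                        unsigned n_threads) {
    n_threads = std::max(1u, n_threads);
    const std::size_t n = pixels.size();
    const std::size_t chunk = (n + n_threads - 1) / n_threads;

    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < n; begin += chunk) {
        const std::size_t end = std::min(n, begin + chunk);
        // The LUT is only read; each slice is written by exactly one thread.
        workers.emplace_back([&pixels, &lut, begin, end] {
            for (std::size_t i = begin; i < end; ++i) pixels[i] = lut[pixels[i]];
        });
    }
    assert(workers.size() <= n_threads);  // ceil-division never over-spawns
    for (auto& w : workers) w.join();
}
```

Whether this beats /Qpar would still have to be measured. A table lookup is often limited by memory bandwidth rather than arithmetic, so extra threads may not help.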
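I finally managed to remove the dynamic memory allocation for the table. Using the test code, I confirmed that the results match the original calculation 100%.
With this I can finally move on to real-world testing in AviUtl.
Something is buggy with AVI footage encoded with Ut Video Codec Suite, but for now it has been sped up to about 1000 to 3000 microseconds, so the work paid off.
I will merge once this bug is fixed.
For now I made out-of-range input behave the same as the original. I will now start testing in AviUtl.
Addendum: I made a mistake in the commit message, so I force-pushed. That is why https://github.com/MaverickTse/SigContrastFastAviUtl/pull/9/commits/4b5b23c2d77e45640e2e91fe50586b6ad2853c88 has disappeared.
Looks like it's fixed...?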
use std::thread. benchmark