pion / mediadevices

Go implementation of the MediaDevices API.
https://pion.ly/
MIT License
535 stars 122 forks source link

Optimize ToI420 conversion by using sync.Pool #473

Closed neversi closed 1 year ago

neversi commented 1 year ago

Description

Added sync.Pool to the i420 conversion to minimize overhead of creating new byte slices

Reference issue

Workaround for https://github.com/pion/mediadevices/issues/424

TODO
edaniels commented 1 year ago

Hi @neversi, would you mind providing some benchmark results for this for previous/now? I'm also curious if there's room to expand this to other frame decoders?

neversi commented 1 year ago

Hi @edaniels, yes of course, I will provide as soon as possible!

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 95.34% and project coverage change: +0.37 :tada:

Comparison is base (d561715) 58.08% compared to head (51e1510) 58.45%.

:exclamation: Current head 51e1510 differs from pull request most recent head 07e2695. Consider uploading reports for the commit 07e2695 to get more accurate results

Additional details and impacted files ```diff @@ Coverage Diff @@ ## master #473 +/- ## ========================================== + Coverage 58.08% 58.45% +0.37% ========================================== Files 62 62 Lines 3691 3724 +33 ========================================== + Hits 2144 2177 +33 Misses 1420 1420 Partials 127 127 ``` | [Impacted Files](https://codecov.io/gh/pion/mediadevices/pull/473?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pion) | Coverage Δ | | |---|---|---| | [pkg/io/video/convert.go](https://codecov.io/gh/pion/mediadevices/pull/473?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pion#diff-cGtnL2lvL3ZpZGVvL2NvbnZlcnQuZ28=) | `72.80% <92.59%> (+6.13%)` | :arrow_up: | | [pkg/codec/openh264/openh264.go](https://codecov.io/gh/pion/mediadevices/pull/473?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pion#diff-cGtnL2NvZGVjL29wZW5oMjY0L29wZW5oMjY0Lmdv) | `81.69% <100.00%> (+0.53%)` | :arrow_up: | | [pkg/codec/vpx/vpx.go](https://codecov.io/gh/pion/mediadevices/pull/473?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pion#diff-cGtnL2NvZGVjL3ZweC92cHguZ28=) | `83.59% <100.00%> (+0.08%)` | :arrow_up: | | [pkg/codec/x264/x264.go](https://codecov.io/gh/pion/mediadevices/pull/473?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pion#diff-cGtnL2NvZGVjL3gyNjQveDI2NC5nbw==) | `65.51% <100.00%> (+0.40%)` | :arrow_up: | | [pkg/io/video/convert\_cgo.go](https://codecov.io/gh/pion/mediadevices/pull/473?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pion#diff-cGtnL2lvL3ZpZGVvL2NvbnZlcnRfY2dvLmdv) | `71.23% <100.00%> (+2.57%)` | :arrow_up: | Help us with your feedback. Take ten seconds to tell us [how you rate us](https://about.codecov.io/nps?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pion). Have a feature suggestion? [Share it here.](https://app.codecov.io/gh/feedback/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=pion)

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

neversi commented 1 year ago

@edaniels Here is the result of Benchmarks, I have used benchstat for representation From the comparison it could be seen great reduction in B/ op -> which leads to less pressure on GC

GOMAXPROCS=8
goos: darwin
goarch: arm64
pkg: github.com/pion/mediadevices/pkg/io/video
                    │   old.txt   │              new.txt               │
                    │   sec/op    │   sec/op     vs base               │
ToI420/480p/I444-8    101.2µ ± 2%   100.7µ ± 3%       ~ (p=0.280 n=10)
ToI420/480p/I422-8    54.22µ ± 2%   51.07µ ± 2%  -5.81% (p=0.000 n=10)
ToI420/480p/I420-8    11.15n ± 2%   11.86n ± 4%  +6.37% (p=0.000 n=10)
ToI420/480p/RGBA-8    267.1µ ± 2%   273.6µ ± 2%  +2.46% (p=0.002 n=10)
ToI420/1080p/I444-8   614.2µ ± 8%   584.6µ ± 1%  -4.82% (p=0.000 n=10)
ToI420/1080p/I422-8   310.3µ ± 1%   296.1µ ± 0%  -4.58% (p=0.000 n=10)
ToI420/1080p/I420-8   11.15n ± 1%   11.81n ± 4%  +5.87% (p=0.000 n=10)
ToI420/1080p/RGBA-8   1.634m ± 7%   1.638m ± 2%       ~ (p=0.631 n=10)
geomean               22.09µ        22.05µ       -0.19%

                    │      old.txt      │                 new.txt                  │
                    │       B/op        │      B/op       vs base                  │
ToI420/480p/I444-8     180224.00 ± 0%       63.50 ±  17%  -99.96% (p=0.000 n=10)
ToI420/480p/I422-8     180224.00 ± 0%       63.00 ±  11%  -99.97% (p=0.000 n=10)
ToI420/480p/I420-8         0.000 ± 0%       0.000 ±   0%        ~ (p=1.000 n=10) ¹
ToI420/480p/RGBA-8      180456.0 ± 0%       327.0 ±  11%  -99.82% (p=0.000 n=10)
ToI420/1080p/I444-8   1048577.00 ± 0%       56.00 ± 914%  -99.99% (p=0.000 n=10)
ToI420/1080p/I422-8    1048577.0 ± 0%       182.0 ±  70%  -99.98% (p=0.000 n=10)
ToI420/1080p/I420-8        0.000 ± 0%       0.000 ±   0%        ~ (p=1.000 n=10) ¹
ToI420/1080p/RGBA-8   1032.272Ki ± 0%     9.606Ki ±  12%  -99.07% (p=0.000 n=10)
geomean                               ²                   -99.66%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                    │   old.txt    │               new.txt               │
                    │  allocs/op   │ allocs/op   vs base                 │
ToI420/480p/I444-8    2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
ToI420/480p/I422-8    2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
ToI420/480p/I420-8    0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
ToI420/480p/RGBA-8    2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
ToI420/1080p/I444-8   2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
ToI420/1080p/I422-8   2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
ToI420/1080p/I420-8   0.000 ± 0%     0.000 ± 0%       ~ (p=1.000 n=10) ¹
ToI420/1080p/RGBA-8   2.000 ± 0%     2.000 ± 0%       ~ (p=1.000 n=10) ¹
geomean                          ²               +0.00%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean
neversi commented 1 year ago

What do you mean by "commit the benchmarks in"? In description of PR?

edaniels commented 1 year ago

Oh never mind hah. I thought you added new benchmarks. I see you resued.

neversi commented 1 year ago

@edaniels, regarding to expanding pool to other frame decoders, I think it is possible, I will research it on weekends.

edaniels commented 1 year ago

Sounds great! Thank you for the contribution