IABTechLab / iabgpp-java

Apache License 2.0

High CPU Consumption #25

Open ym-corey opened 1 year ago

ym-corey commented 1 year ago

Hi @chuff (maybe)?

We've started using the GPP library in our live environment to get a better idea of how it performs and to confirm that it works. While we're not receiving many GPP strings, we are receiving TcfEuV2/TCF2/GDPR strings in large volume. On our servers, processing a TcfEuV2 (GDPR IAB TCF V2) string through the TcfEuV2 model consumes around 8% CPU, while processing the same strings through the older IAB TCF TCString library consumes only around 0.8% CPU.

Looking at the TcfEuV2 model, I see that the entire string is decoded upon creation rather than lazily decoding only the segments that are relevant/used. Would it be feasible to update the decoding/parsing/processing methodology to be lazy?
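To illustrate what lazy decoding could look like, here is a minimal sketch. It is not the iabgpp-java or TCString implementation: the class `LazyTcfSketch` and its methods are hypothetical, and it handles only a single base64url segment (no `.` separators). The only format fact it relies on is that the first 6 bits of a TCF v2 core string hold the version.

```java
import java.util.Base64;

/** Hypothetical illustration of lazy decoding; not a real library class. */
final class LazyTcfSketch {
    private final String encoded; // kept as received
    private byte[] bits;          // null until a field is first read
    private int decodeCalls;      // demo counter, not needed in real code

    LazyTcfSketch(String encoded) {
        this.encoded = encoded;   // the constructor does no decoding work
    }

    /** Base64url-decodes the segment once, on first use. */
    private byte[] bits() {
        if (bits == null) {
            decodeCalls++;
            bits = Base64.getUrlDecoder().decode(encoded);
        }
        return bits;
    }

    /** Reads `length` bits starting at bit `offset` as an unsigned int. */
    private int readInt(int offset, int length) {
        byte[] b = bits();
        int value = 0;
        for (int i = offset; i < offset + length; i++) {
            int bit = (b[i / 8] >> (7 - (i % 8))) & 1;
            value = (value << 1) | bit;
        }
        return value;
    }

    /** The version occupies the first 6 bits of a TCF v2 core string. */
    int getVersion() {
        return readInt(0, 6);
    }

    int decodeCalls() {
        return decodeCalls;
    }
}
```

With this shape, a caller that constructs the object but never reads a field pays nothing for decoding, and a caller that reads one field pays only for the base64 step plus that field's bits.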

For performance stats, I used this tool

travisbeale commented 1 year ago

We are having the same problem. This library creates an additional 15-20% CPU usage in our ad server.

patmmccann commented 5 months ago

Was this solved by #34?

ym-corey commented 5 months ago

Hi @patmmccann,

I ran some more tests today. Since we aren't receiving GPP in high volume, I replaced the TCString library with TcfEuV2 for GDPR strings to simulate what the CPU consumption/distribution within our Java process would be if we instead needed to process GPP at the same volume.

Our control traffic shows around 4% CPU (TCString.decode(consentString)), while the test shows around 18% (new TcfEuV2(consentString)). This is on a service where around 80% of inbound traffic requires processing of a consent string.

My understanding is that the TCString library lazily decodes each section of a GDPR string, whereas the GPP library lazily decodes each section of a GPP string. The key difference is that the GDPR string itself is not lazily decoded in the GPP library.
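The distinction can be sketched roughly as follows. This is a hypothetical model of section-level laziness, not the library's actual code: `SectionLazyGppSketch` and `decodeEveryField` are made-up names, and the sketch only assumes that GPP sections are separated by `~`, with index 0 being the header.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of section-level (not field-level) laziness. */
final class SectionLazyGppSketch {
    private final String[] encodedSections;                 // raw text between '~'
    private final Map<Integer, String> decoded = new HashMap<>();
    private int fullDecodes;                                // demo counter

    SectionLazyGppSketch(String gpp) {
        this.encodedSections = gpp.split("~");              // cheap split, no decoding
    }

    /** Decodes a section on first access, but decodes all of it at once. */
    String section(int index) {
        return decoded.computeIfAbsent(index, i -> {
            fullDecodes++;
            return decodeEveryField(encodedSections[i]);
        });
    }

    // Stand-in for eagerly decoding every field of one section.
    private static String decodeEveryField(String encoded) {
        return "decoded(" + encoded + ")";
    }

    int fullDecodes() {
        return fullDecodes;
    }
}
```

Under this model, skipping unused sections is cheap, but touching a TCF section still triggers a full decode of that section, which matches the behavior described above.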

patmmccann commented 5 months ago

@ym-corey I recommend you reach out to @net-burst, who made substantial contributions to make prebid-server-java performant at decoding GPP strings. I am sure he is much more knowledgeable than I on this matter.

ym-corey commented 5 months ago

I've reviewed the prebid-server-java code, and while there is some code in place to prevent re-encoding of strings (specifically the TcfEuV2 and UsPrivacyV1 sections) for outbound requests, it otherwise doesn't resolve or work around the lazy decoding issue present in the GppModel. I suspect prebid-server-java will also see a large CPU spike when CMPs switch from providing GDPR strings to GPP strings.
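As a rough illustration of the "avoid re-encoding" idea (a generic sketch, not prebid-server-java's actual code; `PassthroughSection` and `expensiveEncode` are hypothetical names), a section can keep the inbound string and return it unchanged unless something was modified:

```java
/** Hypothetical sketch; not prebid-server-java or iabgpp-java code. */
final class PassthroughSection {
    private final String originalEncoded; // string exactly as received
    private String modifiedValue;         // null while the section is untouched
    private int encodeCalls;              // demo counter

    PassthroughSection(String originalEncoded) {
        this.originalEncoded = originalEncoded;
    }

    void setValue(String newValue) {
        this.modifiedValue = newValue;
    }

    String encode() {
        if (modifiedValue == null) {
            return originalEncoded;       // untouched: forward the inbound string as-is
        }
        encodeCalls++;
        return expensiveEncode(modifiedValue);
    }

    // Stand-in for a real bit-packing + base64url encoder.
    private static String expensiveEncode(String value) {
        return new StringBuilder(value).reverse().toString();
    }

    int encodeCalls() {
        return encodeCalls;
    }
}
```

This only saves work on the outbound path. It does nothing for the inbound decode cost, which is the issue discussed in this thread.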

Net-burst commented 5 months ago

Hey, @ym-corey. Our team made a decision to wait for an IAB fix/workaround and work from there instead of jumping the gun. We will re-evaluate the bottlenecks after we update the library and will open-source our work by making a PR here. And yeah, we are already seeing an increase in CPU usage, so this will become a priority...

yuzawa-san commented 2 months ago

Please review https://github.com/IABTechLab/iabgpp-java/pull/56. I was able to cut down on CPU and memory usage significantly.