private-attribution / ipa

A raw implementation of Interoperable Private Attribution
MIT License
40 stars 23 forks source link

Hand off Tokio bookkeeping tasks before starting a new IPA query #1123

Closed akoshelev closed 2 months ago

akoshelev commented 2 months ago

This is an attempt to mitigate #1120 by hinting Tokio executor that query task maybe blocking the worker thread for longer than expected and it should hand off everything from that worker thread.

Here is how this issue is reproducible in our environment

Tokio executor is a work-stealing scheduler that optimizes for low latency and resource utilization. It optimizes for low scheduling overhead and uses the same worker threads for polling tasks and to service IO and timer operations, if they are enabled. This leads to very low overhead, but also to an unfortunate possibility of tasks that are long-running to block the entire runtime, even if it is multithreaded. Here is an example in IPA code that led to this discovery:

akoshelev commented 2 months ago

Draft run for 50M records (was timing out before): https://draft-mpc.vercel.app/query/view/paved-stork-5287

codecov[bot] commented 2 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 91.50%. Comparing base (195ba00) to head (4215af5). Report is 14 commits behind head on main.

:exclamation: Current head 4215af5 differs from pull request most recent head 76a2b27

Please upload reports for the commit 76a2b27 to get more accurate results.

Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #1123 +/- ## ========================================== + Coverage 91.38% 91.50% +0.11% ========================================== Files 188 189 +1 Lines 27066 27335 +269 ========================================== + Hits 24735 25012 +277 + Misses 2331 2323 -8 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

akoshelev commented 2 months ago

it does not hang anymore, fails because of #1100

akoshelev commented 2 months ago

I decided not to create an executor module for now - block_in_place is a specialized API and maybe we use some other runtime sooner than we run into another issue. If block_in_place needs to be used somewhere else, then a new module will be justified