statisticsnorway / dapla-toolbelt-pseudo

Pseudonymization extensions for Dapla Toolbelt
MIT License

Use generators to yield match results while traversing #392

Closed bjornandre closed 4 months ago

bjornandre commented 4 months ago

A further improvement would be to use async generators, so that pseudonymization can start as soon as a FieldMatch is emitted.
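To make the idea concrete, here is a rough sketch of what an async-generator traversal could look like. Everything in it is an assumption for illustration: the `FieldMatch` dataclass is a hypothetical stand-in (the real class in the library may have different fields), and `traverse`, `pseudonymize`, `run` and `pseudonymize_all` are made-up names, not existing functions in dapla-toolbelt-pseudo.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, AsyncIterator, Dict


@dataclass
class FieldMatch:
    # Hypothetical stand-in; the real FieldMatch may look different.
    path: str
    value: str


async def traverse(data: Any, target: str, path: str = "") -> AsyncIterator[FieldMatch]:
    """Walk nested dicts/lists and yield a FieldMatch for each string field named `target`."""
    if isinstance(data, dict):
        for key, value in data.items():
            child = f"{path}/{key}"
            if key == target and isinstance(value, str):
                yield FieldMatch(child, value)
            else:
                async for match in traverse(value, target, child):
                    yield match
    elif isinstance(data, list):
        for index, item in enumerate(data):
            async for match in traverse(item, target, f"{path}[{index}]"):
                yield match


async def pseudonymize(match: FieldMatch) -> str:
    # Placeholder for the real (presumably I/O-bound) pseudonymization call.
    await asyncio.sleep(0)
    return match.value[::-1]


async def run(data: Any, target: str) -> Dict[str, str]:
    tasks = {}
    async for match in traverse(data, target):
        # Schedule the work as soon as a match is emitted; traversal continues meanwhile.
        tasks[match.path] = asyncio.create_task(pseudonymize(match))
    return {path: await task for path, task in tasks.items()}


def pseudonymize_all(data: Any, target: str) -> Dict[str, str]:
    # Synchronous entry point for callers that are not already in an event loop.
    return asyncio.run(run(data, target))
```

Whether this pays off depends on the pseudonymization step actually awaiting I/O; with a purely CPU-bound step the tasks would still run one after another on the event loop.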

bjornandre commented 4 months ago

ThreadPoolExecutor, despite what its name might suggest, doesn't give us parallel execution of CPU-bound Python code because of the GIL; it is mostly useful for I/O-bound tasks. Perhaps we should use concurrent.futures.ProcessPoolExecutor instead? As far as I remember, it uses the same interface.

Approving this in any case; a performance comparison would be nice 👍
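A performance comparison along these lines could be sketched as follows. This is only an illustration under assumptions: `cpu_bound` is a made-up pure-Python workload standing in for the traversal work, and the helper names are hypothetical. Because `ThreadPoolExecutor` and `ProcessPoolExecutor` both implement `concurrent.futures.Executor`, the same `run_with` helper accepts either.

```python
import time
from concurrent.futures import Executor, ThreadPoolExecutor


def cpu_bound(n: int) -> int:
    # Pure-Python loop; under the GIL only one thread runs this bytecode at a time.
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_sequential(inputs):
    return [cpu_bound(n) for n in inputs]


def run_with(executor: Executor, inputs):
    # Executor.map is part of the shared interface, so any Executor subclass fits here.
    return list(executor.map(cpu_bound, inputs))


def timed(fn, *args):
    # Returns the function result together with the elapsed wall-clock time in seconds.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    inputs = [200_000] * 8
    seq, t_seq = timed(run_sequential, inputs)
    with ThreadPoolExecutor(max_workers=4) as pool:
        thr, t_thr = timed(run_with, pool, inputs)
    print(f"sequential: {t_seq:.3f}s, threads: {t_thr:.3f}s, same result: {seq == thr}")
```

On a standard (GIL-enabled) CPython build, the two timings would be expected to be roughly equal for a workload like this. Passing a `ProcessPoolExecutor` instead requires the worker function and its arguments to be picklable.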

@mallport you are right. Tests show that ThreadPoolExecutor doesn't improve performance at all, so we might as well remove it. ProcessPoolExecutor wouldn't work either: the worker processes cannot operate on a shared in-memory object, so the data would have to be serialized and deserialized a lot.
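The serialization point can be illustrated without spawning any processes. `ProcessPoolExecutor` pickles arguments and return values to move them between processes, so the sketch below simulates that boundary with `pickle` directly. The record layout and the function names are hypothetical, chosen only for the example.

```python
import pickle


def simulate_worker(payload: bytes) -> bytes:
    # Stand-in for a child process: unpickle the argument, change the local copy,
    # and pickle the result to send it back.
    record = pickle.loads(payload)
    record["fnr"] = "pseudonymized"
    return pickle.dumps(record)


def round_trip(record: dict) -> dict:
    payload = pickle.dumps(record)     # serialization on the way in
    result = simulate_worker(payload)
    return pickle.loads(result)        # deserialization on the way out


original = {"fnr": "123", "nested": {"values": list(range(5))}}
returned = round_trip(original)

# The worker only ever saw a copy, so the caller's object is unchanged.
print(original["fnr"], returned["fnr"])
```

Every match handled this way would pay for two pickling steps and produce a fresh copy, so in-place updates of the traversed structure would not carry over from the worker.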

sonarcloud[bot] commented 4 months ago

Quality Gate passed

Issues
1 New issue
0 Accepted issues

Measures
0 Security Hotspots
58.5% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud