Closed dbrito closed 5 months ago
Hi @dbrito
Just to clarify: are you reusing your reports every time you batch? The PRIVACY_BUDGET_EXHAUSTED error is caused by reports with the same shared ID being batched more than once. Each report is assigned a "shared ID" made up of the shared_info fields api, reporting_origin, destination_site, source_registration_time (truncated by day), scheduled_report_time (truncated by hour) and version. This means that multiple reports belong to the same "shared ID" whenever they share the same values for those shared_info fields.
Since batches need to be disjoint, would you be able to check that no two batches overlap on the shared ID? Please make sure that source_registration_time (truncated by day) and scheduled_report_time (truncated by hour) are taken into consideration. We also have some guidelines on batching strategies that you can look into.
For example, take the two shared_info fields below. The api is the same (attribution-reporting), the attribution_destination is the same (https://privacy-sandcastle-dev-shop.web.app), the reporting_origin is the same (https://privacy-sandcastle-dev-dsp.web.app), and the source_registration_time is the same (0). Only scheduled_report_time differs: one report is scheduled for "Tuesday, January 2, 2024 5:19:12 PM" and the other for "Tuesday, January 2, 2024 5:24:22 PM". Truncated to the hour, both become "Tuesday, January 2, 2024 5 PM", which means both reports have the same shared ID and draw on the same privacy budget. So you can have hundreds, thousands, or more reports that together count against a single privacy budget, and all reports with the same shared ID have to go in the same batch.
"shared_info": "{\"api\":\"attribution-reporting\",\"attribution_destination\":\"https://privacy-sandcastle-dev-shop.web.app\",\"debug_mode\":\"enabled\",\"report_id\":\"af0cfc09-18d3-4234-8d02-1e36a189a7c4\",\"reporting_origin\":\"https://privacy-sandcastle-dev-dsp.web.app\",\"scheduled_report_time\":\"1704215952\",\"source_registration_time\":\"0\",\"version\":\"0.1\"}",
"shared_info": "{\"api\":\"attribution-reporting\",\"attribution_destination\":\"https://privacy-sandcastle-dev-shop.web.app\",\"debug_mode\":\"enabled\",\"report_id\":\"1a1b25aa-5e1b-43fc-b80e-9cc9e8ce7658\",\"reporting_origin\":\"https://privacy-sandcastle-dev-dsp.web.app\",\"scheduled_report_time\":\"1704216262\",\"source_registration_time\":\"0\",\"version\":\"0.1\"}",
Hi @maybellineboon, I think I understand now!
During my tests, I certainly triggered the AggregationService more than once within the same 1-hour window. Although each test used new reports, they were created at around the same time and so kept the same "shared ID", which probably caused the PRIVACY_BUDGET_EXHAUSTED error. We will adjust the way we run our tests so that the AggregationService is not triggered more than once within the same hour.
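For anyone hitting the same thing, one way to guard against this in a test harness is to remember which shared-ID keys have already been sent and refuse a batch that reuses one. The sketch below is only an assumption about how such a guard could look: `BatchLedger` is a hypothetical helper, not part of the AggregationService, and the keys are whatever hashable value you use to represent a shared ID (here a tuple of reporting origin, destination, truncated day and truncated hour).

```python
class BatchLedger:
    """Hypothetical in-memory record of shared-ID keys already batched."""

    def __init__(self):
        self._seen = set()

    def overlapping(self, batch_keys):
        """Return the keys in this batch that were already submitted."""
        return set(batch_keys) & self._seen

    def submit(self, batch_keys):
        """Record a batch, or raise if it reuses an earlier shared-ID key."""
        overlap = self.overlapping(batch_keys)
        if overlap:
            raise ValueError(
                f"{len(overlap)} shared-ID key(s) already batched; "
                "a job with this batch would likely exhaust the privacy budget"
            )
        self._seen |= set(batch_keys)


ledger = BatchLedger()

# Two test runs whose reports fall in the same truncated hour.
first_run = [("https://privacy-sandcastle-dev-dsp.web.app",
              "https://privacy-sandcastle-dev-shop.web.app", 0, 1704214800)]
second_run = list(first_run)

ledger.submit(first_run)
try:
    ledger.submit(second_run)
    second_run_rejected = False
except ValueError:
    second_run_rejected = True
```

A real setup would need to persist the seen keys between runs; the in-memory set is just to keep the example short.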
Thank you so much for the help
*AggregationService triggered for the first time at the “new” hour
Hello team, how are you?
Guys, after deploying the AggregationService to the AWS environment, I ran some tests to generate the summary report. During these tests I noticed that the noise was greatly affecting the metric values, so I implemented value scaling and set the epsilon. However, since adding the epsilon setting to /createJob, the AggregationService always returns the PRIVACY_BUDGET_EXHAUSTED error, regardless of whether the reports are new (new .avro files).
I wanted to see if anyone had any tips on how to identify the source of this error and consequently how I could get around it.
CreateJob body Request
AggregationService getJob response
*I was taking a look at NoiseLab, and the way to mitigate the impact of noise appears to be to scale my values and use an epsilon greater than 0
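To make the scaling idea concrete, here is a rough Python sketch of the arithmetic. It rests on assumptions that should be checked against the current Attribution Reporting documentation: that the per-source contribution budget is 65536, and that the noise is drawn from a Laplace distribution with scale 65536 / epsilon. The function names are made up for this example.

```python
import math

# Assumed contribution (L1) budget; verify against the current docs.
L1_BUDGET = 65536


def scaling_factor(max_expected_value: float) -> float:
    """Factor that stretches the largest expected value to the full budget."""
    return L1_BUDGET / max_expected_value


def noise_std(epsilon: float) -> float:
    """Standard deviation of Laplace noise with scale L1_BUDGET / epsilon."""
    return math.sqrt(2) * L1_BUDGET / epsilon


def noise_std_in_original_units(max_expected_value: float, epsilon: float) -> float:
    """Noise standard deviation after dividing the summary value back down."""
    return noise_std(epsilon) / scaling_factor(max_expected_value)


# Example: values of at most 1000, compared at two epsilon settings.
std_eps_1 = noise_std_in_original_units(1000, 1)
std_eps_10 = noise_std_in_original_units(1000, 10)
```

Under these assumptions the noise added per bucket does not depend on the values themselves, so scaling values up toward the budget and raising epsilon both shrink the noise relative to the signal.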
Reports used during testing:
Thanks in advance