Open qascade opened 1 year ago
In general, ZetaSQL allows you to join tables and apply DP on top of the join. However, our sample binary execute_query
accepts only one argument, for a single table defined in a CSV file. To join two tables, you can modify the source code in examples/zetasql/execute_query.cc:
define a second table in C++ using zetasql::MakeTableFromCsvFile,
then mark email as the user id by calling the SetAnonymizationInfo
method on each of the defined tables.
ZetaSQL is written in C++ and uses the C++ DP Lib.
Let's use this issue to gauge whether there is interest in this feature. Join conditions for DP queries could be interesting to try out, since those joins are not straightforward (they need to propagate the column that identifies a user for the DP aggregation).
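To make the point about propagating the user id column concrete, here is a minimal Python sketch. It uses neither ZetaSQL nor the DP library; the function names (join_on_user, bounded_count, anon_count) and the max_rows_per_user parameter are made up for illustration, and the Laplace sampling relies on Python's random module, which is not suitable for real privacy guarantees. The idea is only that the joined rows must keep the email column so that each user's contribution can be bounded before noise is added.

```python
import random
from collections import defaultdict


def join_on_user(table1, table2, key="email"):
    """Inner-join two lists of dict rows on `key`, keeping `key` in every output row."""
    index = defaultdict(list)
    for row in table2:
        index[row[key]].append(row)
    joined = []
    for left in table1:
        for right in index.get(left[key], []):
            # The user id column survives the join, so later steps can group by it.
            joined.append({**right, **left})
    return joined


def bounded_count(rows, key="email", max_rows_per_user=1):
    """Count rows, letting each user contribute at most `max_rows_per_user` rows."""
    per_user = defaultdict(int)
    for row in rows:
        per_user[row[key]] += 1
    return sum(min(count, max_rows_per_user) for count in per_user.values())


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials (illustrative only)."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def anon_count(rows, epsilon, key="email", max_rows_per_user=1, rng=None):
    """Noisy count over joined rows; sensitivity equals the per-user contribution bound."""
    rng = rng or random.Random()
    scale = max_rows_per_user / epsilon
    return bounded_count(rows, key, max_rows_per_user) + laplace_noise(scale, rng)


# Hypothetical data: both tables share an email column.
table1 = [{"email": "a@x"}, {"email": "b@x"}, {"email": "b@x"}, {"email": "c@x"}]
table2 = [{"email": "b@x"}, {"email": "c@x"}, {"email": "c@x"}, {"email": "d@x"}]

joined = join_on_user(table1, table2)
print(len(joined))                                   # raw number of joined rows
print(bounded_count(joined, max_rows_per_user=1))    # count after per-user bounding
print(anon_count(joined, epsilon=1.0, rng=random.Random(0)))
```

Note that a join can multiply the number of rows a single user owns, so the contribution bound chosen here directly trades off bias (rows dropped by bounding) against noise (scale grows with the bound). A real engine would have to make that choice as well.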
I am trying to use the DP library to run SQL queries that inherently support DP. Section 4 of the DP SQL paper discusses aggregation with joins and compares it with previously built DP SQL engines. In general, joins, especially inner joins, are among the most sought-after queries. I think we should have an example of such a join and of how it affects the accuracy of the results.
I wanted to write a ZetaSQL query that joins two tables on a single private column for an ANON_COUNT() query. For example, suppose there are two tables, table1 and table2, that share an email column.
Is it possible to do this? If it can also be done with the Go library, that would be great.