Open yoid2000 opened 5 years ago
@rbh-93 please start on this issue.
Be sure to create a new branch for this. Don't write to master.
Hello,
I have been working through the workflow in the gdaScore class, but you mentioned:
Note that the type of interface is specified in the common/config/master.json config file, as "type". Here is where a database is configured as uber_dp.
Am I supposed to create a new "type" or should I use the existing "postgres" type?
If you don't make it a new type, how would gdaScore know to query uber?
PF
Another question: will _dbWorker() send the parameters (query, epsilon, budget) to the Python simpleServer, which will then write the query to a file so that the UberTool can read it from the file and send back the result? The parameters would be sent to simpleServer.py in the following part of gdaScore:
# Establish connection to database
connStr = str(f"host={d['host']} port={d['port']} dbname={d['dbname']} user={d['user']} password={d['password']}")
if self._vb: print(f" {me}: Connect to DB with DSN '{connStr}'")
conn = psycopg2.connect(connStr)
cur = conn.cursor()
Is that correct?
Right now the UberTool connects to the database like this:
val con_str = "jdbc:postgresql://db001.gda-score.org:5432/" + dbName + "?ssl=true&sslfactory=org.postgresql.ssl.NonValidatingFactory&user=<username>&password=<password>"
So the UberTool is connecting to db001.gda-score.org:5432.
yes, with the caveat that there'd be a third condition (i.e. if aircloak ... elif postgres ... elif uber ....)
PF
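A minimal sketch of that three-way condition, for illustration only. The function name choose_interface and the returned labels are hypothetical; only the "type" key and the three type names (aircloak, postgres, uber_dp) come from this thread.

```python
def choose_interface(d):
    """Pick the connection path for one database entry.

    `d` stands in for the per-database dict built from master.json.
    Only its "type" key is taken from the thread; the returned labels
    are hypothetical names used for illustration.
    """
    db_type = d.get("type")
    if db_type == "aircloak":
        # aircloak and postgres share the same underlying interface
        return "psycopg2"
    elif db_type == "postgres":
        return "psycopg2"
    elif db_type == "uber_dp":
        # hypothetical label for the new uber_dp connection setup
        return "uber_http"
    raise ValueError(f"unknown interface type: {db_type!r}")
```

Failing loudly on an unknown type is a design choice in this sketch; it keeps a misconfigured entry from silently falling through to the postgres path.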
Why do we care how the uber tool connects to the database?
PF
Hi @yoid2000, I am currently working on the uber_interface branch (https://github.com/gda-score/code/tree/uber_interface). I pushed my current state even though it does not work yet. I am facing two issues in trying to test it and get it working, and could use your input.
1) Could you tell me the address of the server where uber_dp is running?
2) Could you give me a hint on what I need to run in order to test my changes in the project? I cannot figure out how I would initialize a process in the code that would run the gdaAttack.
Thank you for your help.
Could you tell me the address of the server where uber_dp is running?
In this directory:
https://github.com/gda-score/anonymization-mechanisms/tree/master/uber/examples
you can find a file config.py that contains the URL of the uber DP service. It is:
https://db001.gda-score.org/ubertool
Could you give me a hint on what I need to run in order to test my changes in the project? I cannot figure out how I would initialize a process in the code that would run the gdaAttack.
This is unfortunately rather complex.
This file:
https://github.com/gda-score/code/blob/master/gdascore/global_config/master.json
is a kind of master configuration. It contains all of the services, databases, and anonymization types.
Up to now, all anonymization types could be reached through two services, postgres and aircloak (https://github.com/gda-score/code/blob/9f4b2d0b600f546baf62874ea75b20d4461977cd/gdascore/global_config/master.json#L2-L13).
You need to add a new service, which could be called uber_dp or something like that. The master config would need to be updated with the new service, and also in the other places where we link the anonymization scheme with the service, etc. I could help you with that.
Then, when you want to run a test, you could do something like you find here:
https://github.com/gda-score/attacks/blob/master/examples/testSinglingOut.py
In that example, you can find a config structure that tells the code what to get from the master config to run the attack (which ultimately generates queries to the service). The config is here:
https://github.com/gda-score/attacks/blob/master/examples/testSinglingOut.py#L25-L33
Once the Uber server is working on db001, I'd like to add an interface to the Uber server to the class gdaAttack(), which can be found in common/gdaScore.py. Currently there are two interfaces, postgres and aircloak. I want to add a third, called uber_dp. This will result in changes deep within gdaAttack(), upon which pretty much everything runs, so we need to be very careful here and validate that all of the examples in common/examples and attacks/examples work after the changes.
Note that the type of interface is specified in the common/config/master.json config file, as "type". Here is where a database is configured as uber_dp.
When gdaAttack() is called, it is handed a dict containing various parameters. An example of the config file for these parameters is, for instance, attacks/dumbList_Infer.py.json. For uber_dp, we need to add two additional parameters, "budget" and "epsilon".
The tricky part will be establishing the connection itself and making the queries. This all happens in a method called _dbWorker(), which runs as a thread. _dbWorker() calls _processQuery(), which is the thing that makes the query and returns the answer. _dbWorker() sets up its connection with this code:
You'll need to add an if/else here like
if d['type'] == 'uber_dp': ..... else:
and put your connection setup there. (Note that both aircloak and postgres use the same underlying interface, which is why there isn't an if/else currently.)
Note also that there is an interface to a cache sqlite database, with handles connInsert, curInsert, connRead, curRead. These will continue to run as is, so the new interface doesn't affect that.
In _processQuery(), the following code executes the query:
You will need to add an if/else to do the uber_dp query instead. Note in particular that the cur.fetchall() call returns a data structure that is a list of lists like this:
[[a1, b1, c1, ...], [a2, b2, c2, ...], ..., [aN, bN, cN, ...]]
where a, b, c, etc. are the columns returned by the query, and 1, 2, ..., N are the rows returned by the query. Your interface must replicate this structure. When the uber server returns an error, or returns an out-of-budget message, this will be encoded in the error type (i.e. reply = dict(error=e.pgerror)).
If this is done right, then all the code running above this should work unchanged.
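To illustrate the shape requirement, here is a minimal sketch of turning an uber_dp reply into the same structure the existing code returns. The function name normalize_reply, the raw reply format (a dict with "columns" and "rows", or with "error"), and the "answer" key are assumptions for illustration. The thread only specifies the list-of-lists result and the dict(error=...) form.

```python
def normalize_reply(raw):
    """Convert a hypothetical uber_dp reply into the shape callers expect.

    Assumed input (not from the source): either
      {"error": "<message>"}                      # includes out-of-budget
    or
      {"columns": ["a", "b"], "rows": [{"a": 1, "b": 2}, ...]}
    """
    if "error" in raw:
        # Mirror the existing convention: reply = dict(error=...)
        return dict(error=raw["error"])
    columns = raw["columns"]
    # One inner list per row, values ordered by column, as with cur.fetchall()
    rows = [[row[col] for col in columns] for row in raw["rows"]]
    return dict(answer=rows)  # "answer" is an assumed key name
```

With this kind of adapter sitting inside the uber_dp branch, code above _processQuery() would not need to know which interface produced the rows.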
Please regard this particular issue as a kind of master issue. You should make specific smaller issues that we can test one at a time as you go. Each of the smaller issues will have an associated push, where the example functions are all verified as running.
Note that there are a number of helper methods that currently work with both postgres and aircloak interfaces. These include getColNamesAndTypes() and getTableNames(). I don't expect these to work with uber_dp, so you don't need to worry about that.
As always, let me know if you have questions.