gda-score / code

Tools for generating General Data Anonymity Scores (www.gda-score.org)
MIT License

Load K-anonymized tables into postgres #30

Closed yoid2000 closed 5 years ago

yoid2000 commented 5 years ago

I want to deploy the K-anonymized tables onto the postgres machine db001.gda-score.org. There will be four databases, one each for census, taxi, scihub, and banking, and in total 10 tables (7 in banking, one each in the other three). Since you are building two anonymized tables for each one (one with K=2 and one with K=5), there will be in total 20 tables.

Please name the four databases as:

k_anon_X_database

where X is either 2 or 5, and database is one of census, taxi, scihub, and banking.
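As a minimal sketch of the naming pattern above: read literally, `k_anon_X_database` gives one name per combination of K value and dataset. The function name `database_names` and the two constants are hypothetical and only serve the illustration.

```python
from itertools import product

# Values taken from the description above.
K_VALUES = (2, 5)
DATASETS = ("census", "taxi", "scihub", "banking")


def database_names():
    """Return every name matching the k_anon_X_database pattern."""
    return [f"k_anon_{k}_{name}" for k, name in product(K_VALUES, DATASETS)]


if __name__ == "__main__":
    for name in database_names():
        print(name)
```

Expanding the pattern this way yields names such as `k_anon_2_census` and `k_anon_5_banking`.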

Each table will have the same columns as its raw equivalent, and will have a row for each row in the raw equivalent. The values are set as follows:

  1. Any unmodified field goes into the table also unmodified.
  2. Any value that is all * symbols goes into the table as NULL.
  3. Any value for a text column that has some characters followed by * characters goes into the table as only the initial non-* characters. In other words, the value M**** goes into the table as just M.
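The three rules above can be sketched as a single per-value function. This is only an illustration, not the loader actually used: the name `clean_value` is hypothetical, it assumes the K-anonymized values arrive as strings (for example, read from a CSV export), and it assumes no raw value legitimately ends in `*`.

```python
from typing import Optional


def clean_value(value: Optional[str]) -> Optional[str]:
    """Map one K-anonymized cell to the value to store in the table.

    None stands for SQL NULL.
    """
    if value is None:
        return None
    if "*" not in value:
        # Rule 1: unmodified values pass through unchanged.
        return value
    prefix = value.rstrip("*")
    if prefix == "":
        # Rule 2: a value made up only of '*' symbols becomes NULL.
        return None
    # Rule 3: keep only the characters before the trailing '*' run.
    return prefix


if __name__ == "__main__":
    for raw in ("Male", "M****", "*****"):
        print(repr(raw), "->", repr(clean_value(raw)))
```

Rule 3 is stated for text columns only, so a loader would presumably apply the prefix step just to those columns and treat a fully masked value in any other column as NULL under rule 2.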

@ku294714 can help you with any questions about how to load the data into the postgres machine.

ku294714 commented 5 years ago

Okay sir. I will get in touch with Resha about this, find out the details, and let you know.

ku294714 commented 5 years ago

Hello sir,

I have not been feeling well for the last few days, so I could not get in touch with Resha.

Just wanted to inform you.

Regards, Ankit

yoid2000 commented 5 years ago

No problem. You can probably wait until Resha gets in touch with you anyway.

Get better.

PF

ku294714 commented 5 years ago

Hello Sir,

We have pushed the scihub data. Please check. Meanwhile, we will try to push the other databases as well.

Regards, Ankit.
