statisticsnorway / dapla-toolbelt-pseudo

Pseudonymization extensions for Dapla Toolbelt
MIT License

500 server error on "sid_fields" - Allow only filling "sid_fields", not demanding "fields" #157

Open aecorn opened 1 year ago

aecorn commented 1 year ago

I have a dataset with a single column, "fnr", which is a sid-field. Calling dpp.pseudonymize(data, sid_fields=["fnr"]).json() raises TypeError: pseudonymize() missing 1 required positional argument: 'fields'

Whenever I specify "sid_fields", I get a 500 server error, for example with dpp.pseudonymize(data, fields=["fnr"], sid_fields=["fnr"]):

HTTPError: 500 Server Error: Internal Server Error for url: http://dapla-pseudo-service.dapla.svc.cluster.local/pseudonymize/file

If I only specify "fnr" in "fields", I get the wrong encryption. The goal is to get FPE (format-preserving encryption) on the "fnr" column.

aecorn commented 1 year ago

[image]

aecorn commented 1 year ago

Same column in fields and sid_fields: [image]

mmwinther commented 1 year ago

@aecorn it looks like there are two separate issues here.

  1. TypeError: pseudonymize() missing 1 required positional argument: 'fields' has been resolved in #126 but has not been released yet.
  2. HTTPError: 500 Server Error: Internal Server Error for url: http://dapla-pseudo-service.dapla.svc.cluster.local/pseudonymize/file is an issue we need to resolve. The real problem is the user interface in the Python package; we're considering combining sid_fields and fields into one argument.
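
As a rough illustration of the interface problem, here is a minimal sketch of client-side argument checking that would turn both failures into a readable error before any request is sent. The helper name resolve_field_args is hypothetical and not part of dapla-toolbelt-pseudo. Rejecting a column that appears in both lists is an assumption based on the 500 error reported above, not confirmed service behaviour.

```python
from typing import List, Optional, Sequence, Tuple


def resolve_field_args(
    fields: Optional[Sequence[str]] = None,
    sid_fields: Optional[Sequence[str]] = None,
) -> Tuple[List[str], List[str]]:
    """Hypothetical helper: normalise and validate the two column lists."""
    # Treat a missing argument as an empty list, so neither one is mandatory.
    fields = list(fields or [])
    sid_fields = list(sid_fields or [])

    # At least one column must be named somewhere.
    if not fields and not sid_fields:
        raise ValueError("Specify at least one of 'fields' or 'sid_fields'.")

    # Assumption: a column should be listed in only one of the two arguments.
    overlap = sorted(set(fields) & set(sid_fields))
    if overlap:
        raise ValueError(
            f"Columns listed in both 'fields' and 'sid_fields': {overlap}"
        )

    return fields, sid_fields
```

With this check, resolve_field_args(sid_fields=["fnr"]) is accepted without "fields", while passing "fnr" in both arguments fails locally with a ValueError instead of a server-side 500.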