GSA / enterprise-data-inventory

The Enterprise Data Inventory is a CKAN-based data management system for private and public data management.

Support redactions for each field #182

Closed philipashlock closed 8 years ago

philipashlock commented 8 years ago

Almost any field could be redacted. The only ones that would never be redacted are: Public Access Level, Rights, and Identifier. This should be presented as a select dropdown with the four exemption reasons as options: https://project-open-data.cio.gov/redactions/

This will make the UI pretty cluttered if we display a dropdown for each field, so maybe these would be hidden for each field and require you to click a link or icon next to each field to display the dropdown.

Redactions should not be allowed for datasets whose Public Access Level is "public".

We should also change the labels for the export buttons to:

philipashlock commented 8 years ago

Since redactions should be kept to a minimum rather than redacting the entire field, we'll also need a syntax for redacting only selected words in a field. Let's propose to do this with opening and closing tags. An example of an opening tag with the B3 Exemption reason would be [[REDACT-EX B3]] and the closing tag would be [[/REDACT]]

When this is exported as a redacted record, these opening and closing tags, as well as everything between them, would be replaced with [[REDACTED-EX B3]] (with B3 or whatever reason matching what was specified in the REDACT tag).
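The tag replacement proposed above could be sketched roughly as follows. This is only an illustration, not the code used in the inventory: the function name `redact_partial` is made up, and the pattern assumes that a reason code is the letter B followed by digits (as in the B3 example).

```python
import re

# Matches an opening tag such as [[REDACT-EX B3]], the text to hide, and the
# closing [[/REDACT]] tag. The reason-code shape (B + digits) is an assumption.
PARTIAL_REDACTION = re.compile(
    r"\[\[REDACT-EX (?P<reason>B\d+)\]\].*?\[\[/REDACT\]\]",
    re.DOTALL,
)


def redact_partial(value):
    """Replace each tagged span with a [[REDACTED-EX <reason>]] marker."""
    return PARTIAL_REDACTION.sub(
        lambda match: "[[REDACTED-EX {}]]".format(match.group("reason")),
        value,
    )


if __name__ == "__main__":
    text = "Contact [[REDACT-EX B3]]Jane Doe[[/REDACT]] for access"
    print(redact_partial(text))
```

The non-greedy `.*?` keeps two tagged spans in the same field from being merged into one, so each span keeps its own reason code.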

Note that this will require some minor updates to the JSON Schema validation (the regular expressions will need to allow just REDACT rather than REDACTED)

vasili4 commented 8 years ago

As a first phase,

1) We add a [[redacted]] icon next to every field on the dataset edit form @inventory, using the usmetadata extension. These icons will not appear if the dataset's Public Access Level is set to "Public".

screenshot 2015-08-17 17 20 50

2) On clicking the [[redacted]] icon, a new <select> dropdown will appear under the current field, asking the publisher to choose a reason for the redaction.

screenshot 2015-08-17 17 28 05 screenshot 2015-08-17 17 28 12

3) If one of the reasons is selected, then at the datajson export step for the [REDACTED] version (the public version, or for CKAN import), only the reason will be exported, replacing the whole input value, e.g. [[REDACTED-EX B4]]

screenshot 2015-08-17 17 31 06

4) For the [NON-REDACTED] type of export (internal, for publishers only), all fields will stay raw, without any [[REDACTED]] notes (just as we do now).

5) We will store the redaction reasons in extras, where extras['key'] = 'redacted_' + field['name']. This way we won't need any database structure changes, and users will be able to add or edit these extras using the API.
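A minimal sketch of how steps 3 to 5 might fit together at export time. It is not the actual usmetadata or datajson code: the function name, the field names (`public_access_level`, `rights`, `unique_id`, `contact_email`), and the use of a plain dict for extras are all assumptions made for illustration. Only the `'redacted_' + field name` key convention and the `[[REDACTED-EX ...]]` output come from the thread.

```python
# Fields that would never be redacted (Public Access Level, Rights, Identifier).
# The internal field names here are hypothetical.
NEVER_REDACTED = {"public_access_level", "rights", "unique_id"}


def export_dataset(fields, extras, redacted=True):
    """Return one dataset's field values as they might be exported.

    fields:   mapping of field name to raw value
    extras:   mapping of extras key to value; a redaction reason for a field
              is assumed to live under 'redacted_' + field name (e.g. "B4")
    redacted: True for the public export, False for the internal one
    """
    # The internal export and public datasets keep every value as entered.
    if not redacted or fields.get("public_access_level") == "public":
        return dict(fields)

    exported = {}
    for name, value in fields.items():
        reason = extras.get("redacted_" + name)
        if reason and name not in NEVER_REDACTED:
            # Only the reason is exported, replacing the whole value.
            exported[name] = "[[REDACTED-EX {}]]".format(reason)
        else:
            exported[name] = value
    return exported


if __name__ == "__main__":
    fields = {
        "title": "Payroll records",
        "contact_email": "owner@example.gov",
        "public_access_level": "non-public",
    }
    extras = {"redacted_contact_email": "B4"}
    print(export_dataset(fields, extras))
    print(export_dataset(fields, extras, redacted=False))
```

Because the reasons live only in extras, the raw values are never overwritten, so the same record can produce both the redacted and the non-redacted export.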

philipashlock commented 8 years ago

This approach looks good to me. The only feedback I have is on the select dropdown box. Let's have the labels read as:

vasili4 commented 8 years ago

That's how it is now. Hope I got it right.

screenshot 2015-08-20 12 35 00 screenshot 2015-08-20 12 35 08 screenshot 2015-08-20 12 35 15

vasili4 commented 8 years ago

alt/title will show full description

screenshot 2015-08-20 12 46 56

ykhadilkar commented 8 years ago

@alex-perfilov-reisys Will it be better if we move "R" to the end?

vasili4 commented 8 years ago

well I don't quite like this idea

screenshot 2015-08-20 14 56 34 screenshot 2015-08-20 15 08 44

kvuppala commented 8 years ago

I like it on the left, because for all fields it will be at a constant position

ykhadilkar commented 8 years ago

@kvuppala good point. My only concern was that the left side is getting a little bit crowded. Also, does "R" provide enough context that it's a button for redacting a field?

kvuppala commented 8 years ago

@philipashlock @ykhadilkar-rei @alex-perfilov-reisys Yes, the icon there is just a wireframe. How about we use this as the permanent icon to represent redactions?

https://thenounproject.com/search/?q=REDACT&i=28201

vasili4 commented 8 years ago

screenshot 2015-08-21 16 43 48

kvuppala commented 8 years ago

@alex-perfilov-reisys The small version of the icon is not that clear and doesn't obviously indicate redaction.

We could try out these two other options: https://thenounproject.com/search/?q=REDACT&i=30486 https://thenounproject.com/search/?q=REDACT&i=66406

vasili4 commented 8 years ago

since we're using fontawesome, I applied icon-eyes (open/closed). Looks good to me

screenshot 2015-08-21 19 41 45

philipashlock commented 8 years ago

Let's just go with the "R" on the left side as the button for now. The only other thing is to change the default text in the select dropdown to "Select FOIA Exemption Reason for Redaction" and don't include the "==" marks.

kvuppala commented 8 years ago

Sure, thanks @philipashlock

kvuppala commented 8 years ago

The redactions feature is now in production. On the main dataset metadata page, UI options are available for Restricted Public and Non-Public datasets: users need to select the R icon to show the dropdown on a field and then select the redaction reason.

Also note that the EDI and PDL buttons have been renamed to "Unredacted Inventory" and "Redacted Inventory", respectively.

The key change in the data.json exported by these buttons is that both now include all datasets, including those with the "non-public" access level. Agencies therefore need to redact the applicable fields using the UI options provided and generate the redacted JSON before publishing it as the agency.gov/data.json file.

Currently the UI options are not available on the Data Resources page; that work is in progress and should be deployed to production soon. As a workaround, the redacted keyword described in the guidance can be entered in the field and exported. For the unredacted version of the JSON, however, the user would need to manually edit the file to include the unredacted URL or values before submitting it to OMB.

inventory_redactions_dropdown inventory_redactions_add_dataset inventory_redactions_buttons

kvuppala commented 8 years ago

Pending items for this feature have been filed as new issues #184 and #185.