Closed philipashlock closed 9 years ago
Since redactions should be kept to a minimum rather than redacting the entire field, we'll also need a syntax for redacting only selected words in a field. Let's propose to do this with opening and closing tags. An example of an opening tag with the B3 exemption reason would be [[REDACT-EX B3]] and the closing tag would be [[/REDACT]].
When this is exported as a redacted record, these opening and closing tags, as well as everything between them, would be replaced with [[REDACTED-EX B3]] (with B3 or whatever reason matches what was specified in the REDACT tag).
Note that this will require some minor updates to the JSON Schema validation (the regular expressions will need to allow just REDACT rather than REDACTED).
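To make the tag proposal concrete, here is a minimal sketch of the replacement step in Python. The pattern, the function name export_redacted, and the assumption that a reason code is a short alphanumeric token such as B3 are all hypothetical; this is not the extension's actual code.

```python
import re

# Hypothetical pattern: opening tag with a reason code, the text to hide,
# then the closing tag. The reason-code character class is an assumption.
PARTIAL_REDACTION = re.compile(
    r"\[\[REDACT-EX ([A-Za-z0-9]+)\]\](.*?)\[\[/REDACT\]\]",
    re.DOTALL,
)


def export_redacted(value: str) -> str:
    """Replace each tagged span (tags included) with a [[REDACTED-EX <reason>]] marker."""
    return PARTIAL_REDACTION.sub(
        lambda m: f"[[REDACTED-EX {m.group(1)}]]",
        value,
    )
```

For example, export_redacted("Contact [[REDACT-EX B3]]John Doe[[/REDACT]] for details") would give "Contact [[REDACTED-EX B3]] for details", and text without tags passes through unchanged.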
As a first phase,
1) We add a [[redacted]] icon near every field on the dataset edit form @inventory using the usmetadata extension. These icons will not appear if the dataset's Public Access Level is set to "Public".
2) On clicking the [[redacted]] icon, a new <select> dropdown will appear under the current field, asking the publisher to choose a reason for the redaction.
3) If one of the reasons is selected, then at the datajson export step, for the [REDACTED] export (public version, or for CKAN import), only the reason will be exported, replacing the whole input value, e.g. [[REDACTED-EX B4]].
4) For the [NON-REDACTED] type of export (internal, for publishers only), all fields will stay raw, without any [[REDACTED]] notes (just as we do now).
5) We will store the redaction reasons in extras, where extras['key'] = 'redacted_' + field['name']. This way we won't need any database structure changes, and users will be able to add or edit these extras using the API.
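A rough sketch of how steps 3 to 5 could fit together at export time, in Python. It assumes extras are available as a flat dict mapping key to reason code (CKAN itself stores extras differently), and the names redaction_key and export_dataset are hypothetical.

```python
def redaction_key(field_name: str) -> str:
    """Build the extras key that holds the redaction reason for a field."""
    return "redacted_" + field_name


def export_dataset(fields: dict, extras: dict, redacted: bool) -> dict:
    """Return the field values for either the redacted or the unredacted export.

    Assumption: extras is a flat {key: reason_code} dict, e.g.
    {"redacted_contactPoint": "B4"}.
    """
    if not redacted:
        # Unredacted export: every field stays raw.
        return dict(fields)

    exported = {}
    for name, value in fields.items():
        reason = extras.get(redaction_key(name))
        # Redacted export: the reason marker replaces the whole value.
        exported[name] = f"[[REDACTED-EX {reason}]]" if reason else value
    return exported
```

Keeping the reason in extras means the original value is never overwritten, so both exports can be generated from the same stored record.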
This approach looks good to me. The only feedback I have is on the select dropdown box. Let's have the labels read as:
that's how it's now. Hope I got it right
alt/title will show full description
@alex-perfilov-reisys Will it be better if we move "R" to the end?
well I don't quite like this idea
I like it on the left, because for all fields it will be at a constant position
@kvuppala good point. My only concern was that the left side is getting a little bit crowded. Also, does "R" provide enough context that it's a button for redacting a field?
@philipashlock @ykhadilkar-rei @alex-perfilov-reisys Yes, the icon there is just a wireframe. How about we use this as the permanent icon to represent redactions?
@alex-perfilov-reisys The small version of the icon does not seem that clear as an indication of redaction.
We could try out these two other options: https://thenounproject.com/search/?q=REDACT&i=30486 https://thenounproject.com/search/?q=REDACT&i=66406
since we're using fontawesome, I applied icon-eyes (open/closed). Looks good to me
Let's just go with the "R" on the left side as the button for now. The only other thing is to change the default text in the select dropdown to "Select FOIA Exemption Reason for Redaction" and don't include the "==" marks.
Sure Thanks @philipashlock
The redactions feature is now in production. On the main dataset metadata page, UI options are available for Restricted Public and Non-Public datasets; users need to select the R icon to show the dropdown on the field and then select the redaction reason.
Also note that the EDI and PDL buttons are renamed to "Unredacted Inventory" and "Redacted Inventory" respectively.
The key change to the data.json exported by these buttons is that both now include all datasets, including those with the "non-public" access level, so agencies need to redact the applicable fields using the UI options provided and generate the redacted JSON before publishing it as the agency.gov/data.json file.
Currently the UI options are not available on the Data Resources page; this is being worked on and should be deployed to production soon. As a workaround, the redaction keyword described in the guidance can be entered in the field and exported. For the unredacted version of the JSON, however, the user would need to manually edit the file to include the unredacted URL or values before submitting to OMB.
Pending items on this feature have been created as new issues #184 and #185.
Almost any field could be redacted. The only ones that would never be redacted are: Public Access Level, Rights, and Identifier. This should be presented as a select dropdown with the four exemption reasons as options: https://project-open-data.cio.gov/redactions/
This will make the UI pretty cluttered if we display a dropdown for each field, so maybe these would be hidden for each field and require you to click a link or icon next to each field to display the dropdown.
Redactions should not be allowed for datasets whose Public Access Level is "public".
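The two rules above (three fields that are never redacted, and no redactions on public datasets) could be checked with something like the following Python sketch. The function name can_redact is hypothetical, and the field names accessLevel, rights, and identifier are assumed to be the data.json keys for Public Access Level, Rights, and Identifier.

```python
# Assumed data.json keys for the fields that must never be redacted.
NEVER_REDACTED = frozenset({"accessLevel", "rights", "identifier"})


def can_redact(field_name: str, access_level: str) -> bool:
    """Decide whether the redaction control should be offered for a field."""
    # Public datasets never get redaction controls.
    if access_level.strip().lower() == "public":
        return False
    return field_name not in NEVER_REDACTED
```

The same check could drive both the UI (whether to show the icon) and any server-side validation of submitted redaction reasons.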
We should also change the labels for the export buttons to: