department-of-veterans-affairs / va.gov-team

Public resources for building on and in support of VA.gov. Visit complete Knowledge Hub:
https://depo-platform-documentation.scrollhelp.site/index.html
282 stars 203 forks source link

Veteran Forms - Sentry PII Incident Summary #63178

Open ndsprinkle opened 1 year ago

ndsprinkle commented 1 year ago

Description

Our team recently leaked PII data on Sentry due to an unexpected error that occurred inside our vets-api module.

Our module: https://github.com/department-of-veterans-affairs/vets-api/tree/master/modules/simple_forms_api The line in our PDF filler that specifically threw the error: https://github.com/department-of-veterans-affairs/vets-api/blob/master/modules/simple_forms_api/app/services/simple_forms_api/pdf_filler.rb#L31

Because that line of code was dealing with the form data when the error happened, it automatically logged the context of the situation, which includes the form data containing the PII data.

Since this incident, our team has taken steps to prevent this from happening again by implementing a few code changes.

Due to our confusions around what is logged to Sentry and how to control it, as it seems like every error in vets-api is logged to Sentry, we implemented a solution that intercepts errors being raised by our module and purges all form data from the error message. In fact, our error message is completely customized and includes no information related to the user beyond the ID of the form. There is a built in PII sanitizer (https://github.com/department-of-veterans-affairs/vets-api/blob/master/lib/sentry/processor/pii_sanitizer.rb) in the vets-api repository, but it didn't sanitize any of our form data. Other theories on why this happened will be further along in this issue.

We also addressed the root of the error, which related to escaped characters not being processed properly by the ERB result method, by performing character replacement on the form data in our vets-website submit transformer prior to sending the form data to our vets-api endpoint. This character replacement removes and replaces all escaped characters with non-escaped counterparts. While vets-website, vets-api, and JSON has no issue parsing data with escaped characters, the ERB module in vets-api that we are using cannot convert the data properly. This solution of replacing characters is an adequate solution, though we'd like to figure out a method of keeping all characters entered by the veteran at some point in the future.

While this whole incident was a complete accident, our team definitely still has some confusions and concerns around the situation.

Most of our confusions come from Sentry. While platform provides two pages of documentation related to Sentry (https://depo-platform-documentation.scrollhelp.site/developer-docs/sentry-tagging-standards) (https://depo-platform-documentation.scrollhelp.site/developer-docs/tracking-application-errors-with-sentry), neither of these really detail or explain what Sentry captures and how to control/prevent what Sentry captures. The documentation primarily explains how to monitor what Sentry captures after it is captured. We also saw the documentation on PII that linked to the sanitizer that I linked above, but again it appears that it didn't work properly in our situation.

As for my theory on why the sanitizer didn't work for us, I believe it's because of the state of the form data inside of the ERB result method. At the point the error threw, the data was in a "stringified" state (not exactly the same as JSON stringify) and therefore the sensitive keys/fields could not be identified. Again, just a theory, it may be related to the fact that it happened inside the ERB module as well which caused some sort of scope issue with the Sentry module? Not totally sure.

To loop back to Sentry, I think it would extremely helpful to have some additional guidance in documentation somewhere that explains what gets logged to Sentry, even if the answer is everything, and what steps we could take to stop things from being logged. Maybe it even makes sense to have applications be disabled from logging to Sentry by default? And it has to be manually enabled later? Because I think it's safe to assume that any app could throw an unexpected error, even with many layers of front and back end tests and validations, like in our situation. Something can always slip through, and when it does, it would be comforting to know it doesn't immediately get logged to Sentry along with the entire context of the error. Side thought: have there been any proposed alternatives to Sentry that are FedRAMPed that could handle PII data? Because that would also make this entire logging process a lot less worrisome.

alyssagallion commented 1 year ago

Checked in with Clint who noted we should send to Joe Tice's team. Seemingly there is a request for documentation but also to answer the following questions:

CC: @bdech @JoeTice

alyssagallion commented 1 year ago

@bdech Hey there! I put this into our sprint work to work with Joe so we don't lose it. @JoeTice If there is an update I missed let me know (i.e. your team is already working on it).

alyssagallion commented 11 months ago

Alyssa to follow up with Joe Tice.

alyssagallion commented 10 months ago

@JoeTice Do you need anything from us on this?

alyssagallion commented 5 days ago

@JoeTice Checking in.