Closed rianfowler closed 4 years ago
@gunsch CSP issue. I think the tools team can keep it in the backlog until we get back to full strength on devops. Would be really interested in your review of our existing rules.
cc @brandonrapp
Couple resources here:
my thoughts on what we should change/evaluate:
unsafe-inline
was the one that scared me the most, but the tool mentions that in modern browsers (CSP2) nonce-*
overrides it / cancels it out. The recommended policy still includes it for backwards-compatibility.http://search.usa.gov
--- can we upgrade to https
?strict-dynamic
(might require being more deliberate about usage of nonce
attributes, i.e. on the script tags where we link out to external scripts, though I'm not 100% on that), then that effectively simplifies the whitelist for modern browsers --- that is, we shouldn't actually drop the whitelist, but it makes everything simpler for browsers that do recognize that directive.unsafe-eval
or can we drop it?Next steps:
Rollout plan:
Meeting notes 24 Sep 2020:
To discuss:
Notes
Tickets:
I'm interested in continuing to track the CSP work, but I see that CSP is still not enabled in prod on www.va.gov.
Is there another ticket tracking overall enablement of CSP, or should this be reopened?
@jkassemi commented on Tue Sep 11 2018
After reporting is gathered from #13053, we will need to evaluate and adjust our content security policies to permit all valid production requests.
Risk if not complete
CSP is like a client-side firewall. We open up a vector for security issues on the client.
@rianfowler commented on Tue Mar 19 2019
bumping this for prioritization @wyattwalter @wunderhund @rtluu
@wunderhund commented on Tue Mar 19 2019
All that needs to be done is to change the
csp_report_only:
parameter here tofalse
: https://github.com/department-of-veterans-affairs/devops/blob/master/ansible/deployment/config/revproxy-vagov/vars/nginx_config_vagov-prod.yml#L9This should be done in dev and staging first to identify any issues. The content security policies that will be applied are located in these files: https://github.com/department-of-veterans-affairs/devops/blob/master/ansible/deployment/config/revproxy-vagov/vars/content_security_policy_vagov-prod.yml https://github.com/department-of-veterans-affairs/devops/blob/master/ansible/deployment/config/revproxy-vagov/vars/content_security_policy_vagov-staging.yml https://github.com/department-of-veterans-affairs/devops/blob/master/ansible/deployment/config/revproxy-vagov/vars/content_security_policy_vagov-dev.yml
@wyattwalter commented on Tue Mar 26 2019
We need to be careful that these headers don't affect TeamSite pages as we don't have a great way to audit outside assets that they may be loading.
@wyattwalter commented on Tue Mar 26 2019
@omgitsbillryan @njdister this would be a good one to take on next if you are blocked/done on other tickets.
@njdister commented on Wed Mar 27 2019
This change looks relatively simple. What would be the appropriate teams/POCs to coordinate with before getting this pushed out in dev + staging @wyattwalter @wunderhund ?
@wunderhund commented on Wed Mar 27 2019
@njdister I'd recommend coordinating with the VA.gov frontend devs like @ncksllvn or @cvalarida; your best bet might be to post in the #vetsgov-fe channel in DSVA Slack.
@ncksllvn commented on Thu Mar 28 2019
One of the complications we had with this at launch was that we had two different third-party survey tools (Foresee and Medallia) loading assets out of a single JS script that appended elements dynamically with JS/img/etc sources in different subdomains. It seemed like every time we got them working, we found another asset being blocked by our CSP and had to register another subdomain. We only have one survey tool now, Foresee, and now that it's been in effect for awhile we may have more luck.
Anything else coming to mind for this? Foresee is the big one coming to mind, but there's probably more.
(Don't get me wrong - we definitely need our CSP enforced)
@rianfowler commented on Thu Mar 28 2019
Foresee documents some CSP headers here: https://developer.foresee.com/docs/foresee-on-premises-deployment/
@ncksllvn commented on Thu Mar 28 2019
Monitoring the network panel for VA.gov now it looks like Foresee loads everything out of
https://gateway.foresee.com
, so maybe it's that simple now and Medallia was where the real complications were.Maybe we could start by enforcing our CSP this in Dev/Staging and see what happens?
@rianfowler commented on Thu Mar 28 2019
We can roll out with a report only header if we have a way to receive CSP reports.
@njdister commented on Thu Mar 28 2019
A report only header should be in place per #13053
@ncksllvn commented on Thu Mar 28 2019
Report-only is enabled on https://dev.va.gov, which is why we have this gorgeous output -
I don't see that in Staging or Prod though.
@rianfowler commented on Thu Mar 28 2019
Ah yeah I remember now. I think Wyatt told me we only assign the report-url on a small percentage of the headers to throttle the amount of reports we receive.
@njdister commented on Thu Mar 28 2019
I did some curl testing and the
Content-Security-Policy-Report-Only
header is present in dev and staging, but not prod.Should we enable report only header in prod first before rolling out the non-report header through the environments?
@rianfowler commented on Fri Mar 29 2019
My understanding is that the report-only header is for testing new rules. You add the new rule to the report only header to ensure it doesn’t break the site before adding it to the header that’s enforced. I don’t think we have updated the CSP enough to have a process around changing them.
The ability to push report-only should be present in prod (maybe optionally pushed if we’re evaluating new rules). Ideally, rules under evaluation are merged with prod rules so it’s easy to tell from the configuration that we have rules under evaluation.
@njdister commented on Tue Apr 02 2019
@wyattwalter and I determined the nginx config needs refactoring before we proceed with rolling this out
@wyattwalter commented on Fri Apr 05 2019
Adding the list of tasks that we made. There's a set of configuration options that are conflicting and causing CSP to be enabled in dev when we think the intent was to leave it off.
@njdister commented on Thu Apr 04 2019
@ncksllvn @rianfowler We need your help going through the staging CSP reports to determine what is expected, what isn't, and where we might need new rules before setting enforcement: http://sentry.vetsgov-internal/vets-gov/website-staging/?query=logger%3Acsp
@rianfowler commented on Mon Jun 17 2019
went through the first page of errors sorted by recent and then sorted by volume. I'm going to try to resolve some of the really old last seens and clean it up some more but here are some things I think we'll need to do:
figure out a way to not have the CSP headers on proxy from EWIS
. should resolve (some of these are not /health):http://sentry.vetsgov-internal/vets-gov/website-staging/issues/49437/?query=logger:csp
Aaron Wieczorek is the point of contact on this and let me know it continue to be up for a bit
resolves: http://sentry.vetsgov-internal/vets-gov/website-staging/issues/59627/?query=logger:csp
img-src
script-src
andimg-src
strict-dynamic
toscript-src
beforeunsafe-inline
(not sure if the order matters)script-src
. this seems like a solid list though so I've included it in the abovestrict-dynamic
to ourscript-src
directive beforeunsafe-inline
unsafe-inline
directive by changing the way we use google analytics and tag manager sincestrict-dynamic
isn't supported on older browsershttp://sentry.vetsgov-internal/vets-gov/website-staging/issues/6143/?query=logger:csp
this came up in the CSP evaluator
connect.src
.img-src
http://sentry.vetsgov-internal/vets-gov/website-staging/issues/45429/?query=logger:csp
[x] http://sentry.vetsgov-internal/vets-gov/website-staging/issues/49575/?query=logger:csp
@njdister commented on Thu Apr 04 2019
From what I can tell the review instances do not go through revproxy, we need to investigate why they are sending CSP reports to the staging site in sentry, and if that can be disabled
@njdister commented on Fri Apr 05 2019
Taylor discovered the review instances are sending CSP reports to the staging website in Sentry (and regular JS reports as well): https://github.com/department-of-veterans-affairs/devops/blob/master/ansible/roles/review-instance-configure/meta/main.yml#L130
@rianfowler @ncksllvn Is there any particular reason for this? Can it be disabled?
@rianfowler commented on Fri Apr 05 2019
@njdister I don't think the review instances should have a CSP. I think using the
report-only
mode in production is the best way to test new rules.@njdister commented on Tue Apr 09 2019
Paths that get proxied to EWIS and review instances have been excluded from CSP headers/report sending. Any old review instances deployed before changes yesterday may still send CSP reports and it may take up to a week for all old instances to get cleaned up.
@ncksllvn commented on Mon Apr 15 2019
On a Staging TeamSite page, seeing the following breaking errors -
Looks like the S3 bucket containing the injected resources is being blocked, minus the injection script itself. I haven't been following this discussion every step so I don't know if this is to be expected, but FYI.
@njdister commented on Mon Apr 15 2019
@ncksllvn I don't believe this was yet addressed or affected by any of the CSP updates to revproxy thus far. Is it intended/correct behavior that the staging site is requesting assets from the prod assets bucket?
@ncksllvn commented on Mon Apr 15 2019
@njdister Hm interesting...It is intended behavior, originally implemented prior to launch as a precaution for heavy traffic so there'd be less weight on our servers.
@rianfowler commented on Tue Apr 16 2019
This should be resolved when we stop adding the CSP to the Teamsite pages through the rev proxy.
@ncksllvn This should work when testing on staging: https://staging.va.gov/health/?targetEnvironment=vagovstaging. The trade off is that it will still run the production polyfill and settings files but the main app is short circuited and the staging assets (including the polyfill, settings, and css) are added by the main app script.
@njdister commented on Tue Apr 16 2019
@ncksllvn @rianfowler Upon closer review, the above errors are caused by CORS not CSP headers. I would suggest a separate ticket be opened for that to avoid confusion.
@wyattwalter commented on Wed May 01 2019
Hey @rianfowler are we good to start enforcing in staging to move this forward? I'm not sure who the decision maker or driver is here, so if there's someone else who should be making that call that's fine too.
@rianfowler commented on Wed May 01 2019
@wyattwalter Yeah we can roll it out.
report-only
CSP and I assume we're rolling out to production the same wayreport-only
and give it a week? before enforcing the new rules.how does this sound?
@rianfowler commented on Wed May 01 2019
I'll be out next week but @jbalboni can help with monitoring or resolving CSP issues this while I'm out.
@wyattwalter commented on Wed May 01 2019
Oh! For some reason I thought that prod already had reporting turned on.
Yeah, let's do that! @njdister this is only a couple options to flip in the revproxy config now, correct?
@njdister commented on Wed May 01 2019
So just to be clear we can proceed with:
?
@wyattwalter commented on Wed May 01 2019
That's my understanding, yeah.
@rianfowler commented on Wed May 01 2019
I think we should just mirror them. The testing process for CSP changes shouldn't fork what's deployed by environment but by what's in
report-only
vs enforced.so
report-only
in both environments.@njdister commented on Fri May 03 2019
@rianfowler Just to be clear, we don't have a way to selectively set report vs enforce per rule. It's all report or all enforce per environment, and it would take significant refactoring to change that.
@rianfowler commented on Wed Jul 03 2019
@njdister missed your last comment
That should work. I think since we don't update these very often, we can test CSP changes in staging.
report-only
as a final validation. I would like to triage the production CSP issues with the new rule set.