raft-tech / TANF-app

Repo for development of a new TANF Data Reporting System

Doc/3199 monitoring adr #3210

Closed andrew-jameson closed 4 weeks ago

andrew-jameson commented 1 month ago

Summary of Changes

Pull request closes #3199

Please also see #3206

codecov[bot] commented 1 month ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 90.65%. Comparing base (25b762b) to head (5b32245). Report is 2 commits behind head on develop.

Additional details and impacted files

```diff
@@           Coverage Diff            @@
##           develop    #3210   +/-   ##
========================================
  Coverage    90.65%   90.65%
========================================
  Files          299      299
  Lines         8490     8490
  Branches       794      794
========================================
  Hits          7697     7697
  Misses         676      676
  Partials       117      117
```

| Flag | Coverage Δ |
|---|---|
| dev-backend | `90.38% <ø> (ø)` |
| dev-frontend | `92.66% <ø> (ø)` |

Flags with carried forward coverage won't be shown.

Full report: https://app.codecov.io/gh/raft-tech/TANF-app/pull/3210

> **Legend**
> `Δ = absolute (impact)`, `ø = not affected`, `? = missing data`
> Last update 78ac4eb...5b32245.

ADPennington commented 1 month ago

@andrew-jameson @lfrohlich @ttran-hub -- below is my feedback on this ADR; most of it is geared toward anticipating questions from our product and security teams:

diagram-related

  • recommend adding the ATO boundary to make it more clear that this solution will be hosted outside of it

Andrew to denote this in diagram

  • where are the other tools (PLG and associated) in this diagram?

PLG is shown under ClamAV in the diagram

  • Does the triangle represent load balancers?

triangle represents the proxies

cost-related

  • Can we lay out the monthly costs associated with hosting Sentry in Cloud.gov compared to using Sentry's SaaS offering?

$26/mo for Sentry SaaS vs. $130/mo per GB of RAM x 8-10GB (roughly $1,040-$1,300/mo) if we hosted Sentry in our cloud.gov environment

reference: https://sentry.io/pricing/

  • Do we have a preliminary sense of what might increase these monthly costs over time (on the Sentry SaaS side)? What additional expenses (e.g., scaling costs) might be incurred as part of this solution?

Could go up to $80/mo (Sentry's Business tier) if we need custom dashboards; it's too early to decide this.

security-related

  • Does self-hosting mean that the other tools (PLG stack and associated) will be hosted in Cloud.gov? If so, what are the associated costs, and how do these compare to other hosting options? If not, where will this be hosted, and do we have visibility into its security compliance standards (e.g., encryption, access controls)?

PLG would be hosted in cloud.gov and requires about 10GB of RAM (x $130/mo per GB), but this estimate is not firm yet.

  • What security and compliance standards does the Sentry SaaS adhere to (e.g., SOC 2, FedRAMP, HIPAA)?

SOC 2 (reference: https://sentry.io/security/#third-party-audit)

  • Will any PII or other sensitive data be stored outside of our ATO boundary? If so, please describe how it will be protected.
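One way to approach the PII question above would be to scrub events inside the boundary before they are sent to Sentry. The following is only a minimal sketch under stated assumptions: the key list in `SENSITIVE_KEYS` is hypothetical (the thread does not say which fields count as sensitive), and the function simply follows the `(event, hint)` shape of the Sentry Python SDK's `before_send` hook without importing the SDK.

```python
# Hypothetical set of keys to redact; the real list would come from the spike.
SENSITIVE_KEYS = {"email", "username", "ip_address", "ssn", "authorization", "cookie"}
REDACTED = "[redacted]"


def _scrub(value):
    """Return a copy of value with any sensitive dict keys redacted, recursively."""
    if isinstance(value, dict):
        return {
            key: REDACTED if str(key).lower() in SENSITIVE_KEYS else _scrub(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [_scrub(item) for item in value]
    return value


def before_send(event, hint):
    """Scrub an outgoing event; same (event, hint) shape as Sentry's before_send hook."""
    return _scrub(event)


# Wiring it up would look roughly like this (requires the sentry-sdk package):
# sentry_sdk.init(dsn="<project DSN>", send_default_pii=False, before_send=before_send)

if __name__ == "__main__":
    sample = {
        "message": "parser failed",
        "user": {"id": 7, "email": "someone@example.gov"},
        "request": {"headers": {"Authorization": "Bearer abc", "Accept": "*/*"}},
    }
    print(before_send(sample, {}))
```

The scrubber builds new dictionaries rather than mutating the event, so the original object is left intact for local logging.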

access control-related

  • Can you describe the types of information to be exchanged between our system’s boundary and Sentry’s SaaS, including what logs, metrics, or other data will be sent or accessed? And what info will not be exchanged?
  • How will sys admins access this information, and will it require accessing data outside of our system’s boundary diagram?

see above

  • If information flows back into our boundary diagram, how are we ensuring that this process doesn’t create a new vector for potential attacks or data leaks?

mostly outbound communication (not sure about inbound yet - TBD spike)

  • What access controls and monitoring mechanisms will be in-place for these external systems to detect and prevent unauthorized access?

Traffic must go through nginx to reach our backend. If Sentry is compromised, Sentry's incident response applies (https://sentry.io/security/#corporate-security); we may need our own IR plan.

@lfrohlich TDP IPT (external) is next Wednesday -- dev plans to present this. cc: @andrew-jameson @vlasse86 @ttran-hub

andrew-jameson commented 1 month ago

The additional info above is helpful but @andrew-jameson could you summarize what the expected cost would be, or the range of costs with all components considered?

@lfrohlich On low end, total would be $806/mo.

Per @elipe17, the range for PLG is approx 6-10GB of RAM total to cover all 3 spaces; above quote is for 6GB. Sentry would be $26/mo flat fee.

We need to actually calculate memory requirements for PLG. My assumption is Loki: 2-4GB, Prometheus: 2-4GB (calculator), Grafana: 1GB, 3 PG exporters: 3 x 24MB, 6 Backend Promtails: 6 x 64MB.
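As a rough check on the 6-10GB range, the assumed per-component figures can be summed. This is a sketch only: it assumes each PG exporter uses 24MB, each backend Promtail uses 64MB, and 1GB = 1024MB; the component names are illustrative labels, not configuration keys.

```python
import math

MB_PER_GB = 1024

# name: (low MB per instance, high MB per instance, instance count)
# Figures are the assumptions quoted in this comment, not measurements.
COMPONENTS = {
    "loki": (2 * MB_PER_GB, 4 * MB_PER_GB, 1),
    "prometheus": (2 * MB_PER_GB, 4 * MB_PER_GB, 1),
    "grafana": (1 * MB_PER_GB, 1 * MB_PER_GB, 1),
    "pg_exporter": (24, 24, 3),
    "promtail": (64, 64, 6),
}


def total_memory_gb(components):
    """Return (low, high) total memory in GB across all component instances."""
    low_mb = sum(low * count for low, _high, count in components.values())
    high_mb = sum(high * count for _low, high, count in components.values())
    return low_mb / MB_PER_GB, high_mb / MB_PER_GB


if __name__ == "__main__":
    low_gb, high_gb = total_memory_gb(COMPONENTS)
    print(f"low: {low_gb:.2f} GB, rounded up to {math.ceil(low_gb)} GB")
    print(f"high: {high_gb:.2f} GB, rounded up to {math.ceil(high_gb)} GB")
```

Under these assumptions the totals come to about 5.45GB and 9.45GB, which round up to the 6GB and 10GB endpoints quoted above.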

elipe17 commented 1 month ago

Please see my updated memory requirements here.

andrew-jameson commented 1 month ago

Thank you @elipe17! New low-end cost would be $416/mo @lfrohlich

ADPennington commented 1 month ago

low end: 3GB x $130 (RAM cost per GB) + $26 (Sentry) = $416/mo

high end (doubling GB estimate for budgeting purposes): 6GB x $130 + $26 = $806/mo

@lfrohlich
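The budgeting arithmetic in this thread can be written as a small helper for re-running the estimate when the memory figure changes. This is a sketch using only the numbers quoted in the discussion ($130/mo per GB of RAM, $26/mo Sentry flat fee, $80/mo for Sentry's Business tier); the function and constant names are made up for illustration.

```python
RAM_COST_PER_GB = 130   # USD per month per GB of RAM, as quoted in the thread
SENTRY_TEAM_FEE = 26    # USD per month, Sentry SaaS flat fee quoted in the thread
SENTRY_BUSINESS_FEE = 80  # USD per month, possible upgrade mentioned in the thread


def monthly_cost(ram_gb, sentry_fee=SENTRY_TEAM_FEE, ram_cost_per_gb=RAM_COST_PER_GB):
    """Return the estimated monthly cost in USD for PLG memory plus Sentry SaaS."""
    return ram_gb * ram_cost_per_gb + sentry_fee


if __name__ == "__main__":
    print("low end:", monthly_cost(3))    # 3 GB of RAM
    print("high end:", monthly_cost(6))   # doubled estimate for budgeting
    print("low end, Business tier:", monthly_cost(3, sentry_fee=SENTRY_BUSINESS_FEE))
```

With 3GB and 6GB this gives the $416/mo and $806/mo figures above; the Business tier case is included only to show how the estimate would shift if custom dashboards were needed.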