medic / cht-core

The CHT Core Framework makes it faster to build responsive, offline-first digital health apps that equip health workers to provide better care in their communities. It is a central resource of the Community Health Toolkit.
https://communityhealthtoolkit.org
GNU Affero General Public License v3.0

Haproxy logs - serving as audit trail - truncate logged body when encoding is chunked #8182

Open dianabarsan opened 1 year ago

dianabarsan commented 1 year ago

Describe the bug We are using haproxy logs as an auditing tool. However, haproxy has a severe limitation: it does not log the full body of a long POST request.

To Reproduce Steps to reproduce the behavior:

  1. Load up a 4.x instance (All containers) with seed data from scalability test.
  2. Login as an offline user.
  3. Set yourself as offline.
  4. Use the bulk delete feature to delete 20 reports (the number is arbitrary, but make sure the payload is sizeable). The bulk deletion only serves as a quick way of creating a large-ish payload to send to the server. You could equally create or edit many docs; the endpoints used will be the same.
  5. Wait until the process is complete.
  6. Start watching haproxy logs.
  7. Reload the webapp page, set as offline and sync.
  8. Inspect the haproxy logs for _bulk_docs requests. Look at the body that is logged. Copy the body and load it up in a JSON parser.
  9. See that JSON is incomplete.
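
The check in steps 8 and 9 can be scripted. This is a minimal sketch, assuming the body has already been copied out of the log line by hand; the function name `is_complete_json` and the sample payloads are hypothetical and not taken from real haproxy logs.

```python
import json


def is_complete_json(body: str) -> bool:
    """Return True if `body` parses as a complete JSON document."""
    try:
        json.loads(body)
    except json.JSONDecodeError:
        return False
    return True


# Hypothetical payloads for illustration only; not taken from real logs.
full_body = '{"docs": [{"_id": "report-1", "_deleted": true}]}'
truncated_body = full_body[:30]  # simulate a body cut off mid-document

print(is_complete_json(full_body))
print(is_complete_json(truncated_body))
```

A body that fails this check was cut off somewhere before its closing brackets, which is the symptom described in step 9.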

Expected behavior We rely on haproxy logs as an audit trail. This means that the whole body of the write request should be logged. It is not.

Logs Example of logged haproxy _bulk_docs that truncates body: https://gist.githubusercontent.com/dianabarsan/205d23ef1761812ef880c4a6990ecdd6/raw/f1f3e644791d2977aa4ff94597838094a9dead51/gistfile1.txt

Environment

Additional context This is related to haproxy logging when the request is received, not when the request is complete. If the request body is chunked, only the first chunk will be logged.

https://docs.haproxy.org/2.0/configuration.html#7.3.6-req.body

This returns the HTTP request's available body as a block of data. It requires that the request body has been buffered made available using "option http-buffer-request". In case of chunked-encoded body, currently only the first chunk is analyzed.
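
To illustrate why seeing only the first chunk yields unparseable JSON, here is a rough sketch of HTTP/1.1 chunked transfer encoding. It does not reproduce haproxy's internals. The helper names `encode_chunks` and `decode_chunks`, the payload, and the split point are all assumptions made for the example.

```python
import json
from typing import List


def encode_chunks(parts: List[bytes]) -> bytes:
    """Encode data parts as an HTTP/1.1 chunked body (no trailers)."""
    body = b"".join(b"%x\r\n%s\r\n" % (len(p), p) for p in parts)
    return body + b"0\r\n\r\n"  # a zero-size chunk terminates the body


def decode_chunks(raw: bytes) -> List[bytes]:
    """Split a chunked body into its data chunks.

    Each chunk is "<hex size>\\r\\n<data>\\r\\n". Chunk extensions are
    skipped and trailers are ignored in this sketch.
    """
    chunks = []
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        size = int(raw[pos:line_end].split(b";")[0], 16)
        if size == 0:
            break
        start = line_end + 2
        chunks.append(raw[start:start + size])
        pos = start + size + 2  # skip the CRLF that follows the chunk data
    return chunks


# Hypothetical _bulk_docs-style payload, split into two chunks at an
# arbitrary byte offset, as a client or proxy might do.
payload = b'{"docs": [{"_id": "a", "_deleted": true}, {"_id": "b", "_deleted": true}]}'
raw = encode_chunks([payload[:20], payload[20:]])
chunks = decode_chunks(raw)

json.loads(b"".join(chunks))  # the reassembled body parses fine

try:
    json.loads(chunks[0])  # the first chunk alone is a fragment
    first_chunk_parses = True
except json.JSONDecodeError:
    first_chunk_parses = False

print(first_chunk_parses)
```

Chunk boundaries fall at arbitrary byte offsets, so a log entry built from the first chunk alone stops mid-document.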

dianabarsan commented 1 year ago

I've looked for a solution to this, but haven't found anything that would have haproxy log after the full request is received, and maybe rightly so? We may need to reevaluate which tool we use for auditing. I recently needed to inspect a suspicious doc change, where scanning the audit logs would have been helpful. Unfortunately, the body content was truncated and I could not see the change I was interested in.