darold / pgbadger

A fast PostgreSQL Log Analyzer
http://pgbadger.darold.net/
PostgreSQL License
3.49k stars 349 forks source link

JSON format file can't be parsed correctly #757

Closed sszhao1020 closed 1 year ago

sszhao1020 commented 1 year ago

we exported the Postgres logs from Azure Log Analytics Workspaces to JSON format, then it can't be parsed correctly. 1.json.zip

❯ ./pgbadger -f jsonlog 1.json -o 1.html [========================>] Parsed 299891 bytes of 299891 (100.00%), queries: 0, events: 0 LOG: Ok, generating html report... ~/pgbadger master !1 ?8 ❯

If we export the logs to CSV format, then it could work well. However, CSV has some limitation. So is there any method to resolve the JSON format parsing issue?

darold commented 1 year ago

Some people at Microsoft could have the good idea to sponsor pgbadger which could help their users to use it with their products, but ...

robinportigliatti commented 1 year ago

Hello,

A workaroud I found. The tricky part is to get the DETAIL part of the LOG. They managed to make it ugly, you'll see.

With this log_line_prefix '%m [%p]: [%l-1] xact=%x,user=%u,db=%d,client=%h, app=%a' and by splitting with LOG: I managed to do someting:

Json file:

{
   "properties":{
      "timestamp":"2023-08-17 10:00:08.450 UTC",
      "processId":28218,
      "errorLevel":"LOG",
      "sqlerrcode":"00000",
      "message":"2023-08-17 10:00:08.448 UTC [28218]: [61567-1] xact=0,user=robin,db=dalibo,client=[local], app=[unknown]LOG:  duration: 1.611 ms  bind <unnamed>: SELECT * FROM human WHERE firstname = '$1' AND lastname = '$2'",
      "detail":"parameters: $1 = 'FR', $2 = '0201752211'"
   },
   "time":"2023-08-17T10:00:08.452Z",
   "resourceId":"XXX",
   "location":"XXX",
   "category":"PostgreSQLLogs",
   "operationName":"LogEvent"
}

And by executing something like this:

jq -r 'select(.properties? and .properties.message?) | (.properties.message | split("LOG:")[0]) as $prefix | if $prefix then $prefix + "LOG:" + (.properties.message | split("LOG:")[1]) else "" end, if .properties.detail? then $prefix + "DETAIL: " + .properties.detail else "" end' PT1H.json > PT1H.log

You get the following output:

2023-08-17 10:00:08.448 UTC [28218]: [61567-1] xact=0,user=robin,db=dalibo,client=[local], app=[unknown]LOG:  duration: 1.611 ms  bind <unnamed>: SELECT * FROM human WHERE firstname = '$1' AND lastname = '$2'
2023-08-17 10:00:08.448 UTC [28218]: [61567-1] xact=0,user=robin,db=dalibo,client=[local], app=[unknown]DETAIL: parameters: $1 = 'FR', $2 = '0201752211'

The downside is there is no counter %l-2.

And now we can execute pgBdager:

pgbadger --jobs=4 -I --prefix="%m [%p]: [%l-1] xact=%x,user=%u,db=%d,client=%h, app=%a" PT1H.log -O /tmp/pgbadger_rapport

But pgBadger doesn't matter and now we have details for each of our queries.

Hope that helps anyone who was struggling as hard as I was.

Robin,