sipcapture / homer7-docker

HOMER 7 Docker Images
92 stars 84 forks source link

ERROR: invalid input syntax for type json #66

Closed husam212 closed 4 years ago

husam212 commented 4 years ago

I keep getting the following error:

postgres    | 2020-03-17 11:15:27.832 UTC [66] ERROR:  invalid input syntax for type json
postgres    | 2020-03-17 11:15:27.832 UTC [66] DETAIL:  Token "bmGhEfDoNeJ0m" is invalid.
postgres    | 2020-03-17 11:15:27.832 UTC [66] CONTEXT:  JSON data, line 1: ...","correlation_id":"M_eP0431313170w"bmGhEfDoNeJ0m...
postgres    |   COPY hep_proto_1_call, line 80, column protocol_header: "{"protocolFamily":2,"protocol":17,"srcIp":"80.90.160.221","dstIp":"159.65.125.17","srcPort":5060,"ds..."
postgres    | 2020-03-17 11:15:27.832 UTC [66] STATEMENT:  COPY hep_proto_1_call(sid,create_date,protocol_header,data_header,raw) FROM STDIN
heplify     | 2020/03/17 11:15:27.833508 postgres.go:284: ERR pq: invalid input syntax for type json
heplify     | 2020/03/17 11:15:27.834590 postgres.go:292: ERR pq: Could not complete operation in a failed transaction

I'm using a simple Homer stack with heplify version 1.56, following is the compose:

version: "2.1"

services:
  homer:
    container_name: homer
    image: sipcapture/webapp:latest
    ports:
      - "80:80/tcp"
    environment:
      - "DB_HOST=postgres"
      - "DB_USER=root"
      - "DB_PASS=homerSeven"
    restart: unless-stopped
    depends_on:
      - heplify

  heplify:
    container_name: heplify
    image: sipcapture/heplify-server:latest
    ports:
      - "9060:9060/tcp"
      - "9060:9060/udp"
    command:
      - "./heplify-server"
    environment:
      - "HEPLIFYSERVER_HEPADDR=0.0.0.0:9060"
      - "HEPLIFYSERVER_HEPTLSADDR=0.0.0.0:9060"
      - "HEPLIFYSERVER_DBSHEMA=homer7"
      - "HEPLIFYSERVER_DBDRIVER=postgres"
      - "HEPLIFYSERVER_DBADDR=postgres:5432"
      - "HEPLIFYSERVER_DBUSER=root"
      - "HEPLIFYSERVER_DBPASS=homerSeven"
      - "HEPLIFYSERVER_DBDATATABLE=homer_data"
      - "HEPLIFYSERVER_DBCONFTABLE=homer_config"
      - "HEPLIFYSERVER_DBDROPDAYS=30"
      - "HEPLIFYSERVER_LOGLVL=info"
      - "HEPLIFYSERVER_LOGSTD=true"
      - "HEPLIFYSERVER_DEDUP=true"
    restart: unless-stopped
    depends_on:
      - postgres

  postgres:
    container_name: postgres
    image: postgres:11-alpine
    restart: always
    environment:
      POSTGRES_PASSWORD: homerSeven
      POSTGRES_USER: root
    expose:
      - 5432
    restart: unless-stopped
    volumes:
      - ./postgres-init.sh:/docker-entrypoint-initdb.d/init.sh
      - ./postgres-data:/var/lib/postgresql/data
lmangani commented 4 years ago

Sounds like some unescaped character producing invalid JSON objects. We need the full debug output from postgres to understand - please attach full log lines if possible.

husam212 commented 4 years ago

I tried to log to file using logging_collector=on command option and setting verbosity to verbose in postgresql.conf, I got the following in the log file, it is also trimmed, but it seems the double quotations character isn't escaped.

2020-03-17 12:27:30.578 UTC [39] ERROR:  22P02: invalid input syntax for type json
2020-03-17 12:27:30.578 UTC [39] DETAIL:  Expected "," or "}", but found ""bU5132714170C7biGhEfCdFeE0h@BC00.HQBC01.MSS.ZAIN.JO"".
2020-03-17 12:27:30.578 UTC [39] CONTEXT:  JSON data, line 1: ...132714170C7biGhEfCdFeE0h@BC00.HQBC01.MSS.ZAIN.JO"...
    COPY hep_proto_5_default, line 37, column protocol_header: "{"protocolFamily":2,"protocol":17,"srcIp":"159.65.125.17","dstIp":"80.90.160.221","srcPort":14629,"d..."
2020-03-17 12:27:30.578 UTC [39] LOCATION:  report_parse_error, json.c:1206
2020-03-17 12:27:30.578 UTC [39] STATEMENT:  COPY hep_proto_5_default(sid,create_date,protocol_header,data_header,raw) FROM STDIN

Is there a way to configure PostgreSQL to log the full JSON body?

lmangani commented 4 years ago

DETAIL: Expected "," or "}", but found ""bU5132714170C7biGhEfCdFeE0h@BC00.HQBC01.MSS.ZAIN.JO"".

So there are doublequotes in the call-id apparently?

husam212 commented 4 years ago

@lmangani I think that what's causing the issue, is this a bug in heplify or heplify-server? Or neither?

lmangani commented 4 years ago

@husam212 hard to tell without seeing the originals.... check the logs for heplify and heplify-server both also see if they print any useful error - also if this is frequent and about the same IPs perhaps you can try tcpdump it and show us the message behind it.

husam212 commented 4 years ago

I tried using hepop instead of heplify-server, I get the same error but hepop logs the full query, uploaded here (EDIT: Link removed).

heplify side seems to be working fine, no errors there.

I'm trying to capture at lower level an analyze with wireshark.

lmangani commented 4 years ago

Really nasty stuff in there for callid fields:

(`aP0004916170/vbdGhEfFhGmJ0m@BC00.HQBC01.MSS.ZAIN.JO"
"?(aT4024916170ClblGhEfDrKmF0i@BC00.HQBC01.MSS.ZAIN.JO"

What are we looking at exactly? What version of heplify are you using?

husam212 commented 4 years ago

I'm using version 1.56, I tried going back to 1.47 just for testing, same issue happened.

lmangani commented 4 years ago

What is generating this traffic, precisely? My bet is on x.x.125.17 which is always in the to/from path of broken ids, or whoever generates those Call-IDs - If you think this is due to heplify you can give captagent a try to see if it makes any difference or we'll need to see the raw traffic to/from x.x.125.17 or any other host shelling out those weird prefix characters breaking line.

husam212 commented 4 years ago

I gave captagent a try, same issue.

By looking at the raw traffic I found the weird characters in the raw Call-IDs, so I don't think it is heplify or captagent bug. The weird characters aren't that weird too, all of them are valid UTF-8 ones, so I think the problem at heplify-server side not escaping doublequotes?

I noticed that all the errors in protocol_header, looking at the code (I don't know Go) I found this line which seems to be adding the Call-ID to JSON body but without escaping it, while in makeSIPDataHeader it does escape it , could that be the problem?

lmangani commented 4 years ago

@husam212 please attach a raw example and we'll try analyze it

husam212 commented 4 years ago

Here (EDIT: Link removed) is the raw example, exported from Wireshark as text and hex.

husam212 commented 4 years ago

I'm more confident that the reason of the issue is escaping correlation ID, I tried to override the default correlation ID header using HEPLIFYSERVER_ALEGIDS environment variable and set it to a different header containing alphanumeric random string (no doublequotes or line breaks), the error stopped showing in the log.

I started another issue at https://github.com/sipcapture/heplify-server/issues/346 since this issue is related to heplify-server.