gflewis / sndml3

ServiceNow Data Mart Loader: an application to load SQL databases from ServiceNow
MIT License

JsonParseException Unexpected character ('}' (code 125)): was expecting a colon to separate field name and value #37

Closed jonathangull closed 2 years ago

jonathangull commented 2 years ago

Came across this while copying the "task" table.

```
06:21:08 INFO  RestTableAPI [DPRUN0001011] DPRUN0001011 PROCESS: getStats count=173211 query=
06:21:08 INFO  AppProgressLogger [DPRUN0001011] DPRUN0001011 INIT: logStart 173211
06:21:08 INFO  RestTableReader [DPRUN0001011] DPRUN0001011 INIT: Starting (173211 rows)
06:22:09 ERROR AppJobRunner [DPRUN0001011] DPRUN0001011 ERROR: sndml.daemon.AppJobRunner.call: com.fasterxml.jackson.core.JsonParseException
com.fasterxml.jackson.core.JsonParseException: Unexpected character ('}' (code 125)): was expecting a colon to separate field name and value
 at [Source: (StringReader); line: 1, column: 12623778]
	at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:2337) ~[sndml-3.4.6-mssql.jar:?]
	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:710) ~[sndml-3.4.6-mssql.jar:?]
	at com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:635) ~[sndml-3.4.6-mssql.jar:?]
	at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._skipColon2(ReaderBasedJsonParser.java:2255) ~[sndml-3.4.6-mssql.jar:?]
	at com.fasterxml.jackson.core.json.ReaderBasedJsonParser._skipColon(ReaderBasedJsonParser.java:2186) ~[sndml-3.4.6-mssql.jar:?]
	at com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextFieldName(ReaderBasedJsonParser.java:944) ~[sndml-3.4.6-mssql.jar:?]
	at com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:269) ~[sndml-3.4.6-mssql.jar:?]
	at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:69) ~[sndml-3.4.6-mssql.jar:?]
	at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:16) ~[sndml-3.4.6-mssql.jar:?]
	at com.fasterxml.jackson.databind.deser.DefaultDeserializationContext.readRootValue(DefaultDeserializationContext.java:322) ~[sndml-3.4.6-mssql.jar:?]
	at com.fasterxml.jackson.databind.ObjectMapper._readTreeAndClose(ObjectMapper.java:4635) ~[sndml-3.4.6-mssql.jar:?]
	at com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:3042) ~[sndml-3.4.6-mssql.jar:?]
	at sndml.servicenow.JsonRequest.execute(JsonRequest.java:46) ~[sndml-3.4.6-mssql.jar:?]
	at sndml.servicenow.RestTableAPI.getRecords(RestTableAPI.java:113) ~[sndml-3.4.6-mssql.jar:?]
	at sndml.servicenow.RestTableReader.call(RestTableReader.java:76) ~[sndml-3.4.6-mssql.jar:?]
	at sndml.datamart.JobRunner.runLoad(JobRunner.java:223) ~[sndml-3.4.6-mssql.jar:?]
	at sndml.datamart.JobRunner.call(JobRunner.java:99) ~[sndml-3.4.6-mssql.jar:?]
	at sndml.daemon.AppJobRunner.call(AppJobRunner.java:85) [sndml-3.4.6-mssql.jar:?]
	at sndml.daemon.AppJobRunner.run(AppJobRunner.java:60) [sndml-3.4.6-mssql.jar:?]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:833) [?:?]
06:22:09 ERROR AppStatusLogger [DPRUN0001011] DPRUN0001011 PROCESS: logError JsonParseException
06:22:09 ERROR AgentDaemon [DPRUN0001011] GLOBAL FINISH: Aborting the daemon
```

gflewis commented 2 years ago

I have occasionally come across errors of this sort while exporting the task table. These errors can be particularly troublesome to diagnose. The problem could be due to a JSON payload that was too large, and got truncated. Notice the error: "line: 1, column: 12623778". The parser was more than 12 million characters into the REST response when it encountered a parsing error. Unfortunately, SNDML is unable to display the sys_id of the problematic record, because it was unable to parse the response.

A couple of suggestions:

- Sometimes the only way to find the problematic task record is to use successively smaller CREATED date ranges, until you can narrow in on the record.
- It may be helpful to use a very small PAGESIZE, and/or to enable more granular logging as described here: https://github.com/gflewis/sndml3/wiki/Log4j2.
- Usually, when you find the sys_id of the record that is causing the problem, you will discover that it is corrupted in some manner. Once you know the sys_id of the problematic record (or records), you can use a FILTER to exclude it (or them).
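As a rough sketch, a narrowing job in the SNDML3 YAML config might look like the following. The option names (`source`, `created`, `pagesize`) follow the style of the project's wiki examples as I recall them; check them against your SNDML3 version, and the date range is of course a placeholder:

```yaml
tables:
- name: task_narrow
  source: task
  target: task
  created: [2022-01-01, 2022-01-08]   # shrink this window to bisect toward the bad record
  pagesize: 10                        # small pages localize the failing GET
```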

jonathangull commented 2 years ago

Thanks @gflewis. Is it possible for the datapump to skip the problematic record?

gflewis commented 2 years ago

If you can identify the problematic record, then you can use a FILTER to skip it. However, it is not possible for SNDML3 to automatically skip the record.
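A sketch of such a FILTER, again following the wiki-style YAML config (the sys_id below is a made-up placeholder, and option names may differ in your version):

```yaml
tables:
- source: task
  # Encoded-query filter excluding the corrupt record by sys_id (placeholder value)
  filter: sys_id!=0123456789abcdef0123456789abcdef
```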

The SNDML3 processing sequence is

  1. GET a page (REST API call)
  2. Parse the page (Jackson JSON parser library)
  3. Process each record in the page
  4. Commit the changes to the database
  5. Repeat

Pages are processed in sys_id order. When SNDML3 constructs each GET, the query contains `ORDERBYsys_id` and `sys_id>value` clauses, where *value* is the highest sys_id from the previous page.

In this situation the code is failing in the Parse step. As a result, the code does not know the sys_id values in the page that failed. It cannot continue processing because it cannot construct a query for the next page. There is no alternative other than to abort.
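The abort logic just described can be sketched in a few lines of Python. This is an illustration of the keyset-pagination idea, not the actual SNDML3 code:

```python
import json

def load_pages(pages):
    """Illustrative keyset pagination: pages arrive in sys_id order,
    and the next GET needs the highest sys_id seen so far.
    (Not the actual SNDML3 implementation.)"""
    last_sys_id = ""   # becomes the sys_id>value clause of the next query
    loaded = []
    for raw in pages:  # each `raw` stands in for one REST response body
        try:
            records = json.loads(raw)["result"]
        except json.JSONDecodeError as err:
            # The page is unparseable, so the highest sys_id in it is
            # unknown: there is no way to build the next query. Abort.
            raise RuntimeError(
                "abort: cannot resume past sys_id>" + repr(last_sys_id)
            ) from err
        loaded.extend(records)
        if records:
            last_sys_id = records[-1]["sys_id"]  # keyset for the next page
    return loaded
```

Note that a retry would not help here: re-fetching the same page yields the same unparseable payload, which is why the daemon aborts instead.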

In the past, when I have encountered this issue, I have used background scripts to find the problematic record (or records). With SNDML3 you may be able to narrow down the search and establish that sys_created_on falls within a particular range, or that sys_id is greater than a particular value. If you use PARTITION, it may help narrow down when the problematic record was created.

Once you have narrowed down the search area, try using a background script to look for a record that appears to be corrupt. The issue is probably in a text field such as description. A good place to start is to look for records where description is longer than 8000 characters. These corrupt records are generally created via email (i.e. an Inbound Action). For example, you may discover that the record was created by a SPAM email.
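As a starting point for the long-description check, a small script over exported rows (e.g. records you already pulled via the REST API) might look like this. The 8000-character cutoff follows the suggestion above; the function name is just for illustration:

```python
def find_suspect_records(records, max_len=8000):
    """Return (sys_id, field, length) for every string field longer than
    max_len -- a rough proxy for the kind of corruption described above."""
    suspects = []
    for rec in records:
        for field, value in rec.items():
            if isinstance(value, str) and len(value) > max_len:
                suspects.append((rec.get("sys_id"), field, len(value)))
    return suspects
```

Inside ServiceNow itself, the equivalent check would be a background script looping over GlideRecord, but the idea is the same: flag records whose text fields are implausibly long, then inspect them by sys_id.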

jonathangull commented 2 years ago

Thank you @gflewis, this explanation is very helpful. Please close this issue.