pystorm / streamparse

Run Python in Apache Storm topologies. Pythonic API, CLI tooling, and a topology DSL.
http://streamparse.readthedocs.io/
Apache License 2.0

Serialize error java.lang.NumberFormatException with long type #368

Closed fedelemantuano closed 7 years ago

fedelemantuano commented 7 years ago

Hi,

I have a serialization issue in my application. I don't think it is a streamparse issue; it seems to come from org.apache.storm.multilang.JsonSerializer in Apache Storm. Is there a way to work around it?

In my application I get a report from a source that returns a dict. In Python I can serialize it with the json library without any problem, but in my application I don't serialize the object myself; I emit it into the topology as is. In this case it contains a long:

In [26]: report["data"][0]["ssl"]["cert"]["serial"]
Out[26]: 16501247490175997961L
In [27]: type(report["data"][0]["ssl"]["cert"]["serial"])
Out[27]: long

and with this long I get the following error:

2017-05-20 20:39:50.483 o.a.s.t.ShellBolt Thread-45 [ERROR] Halting process: ShellBolt died. Command: [streamparse_run, -s json bolts.network.Network], ProcessInfo pid:9898, name:network exitCode:-1, errorString:
java.lang.NumberFormatException: For input string: "16501247490175997961"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[?:1.8.0_131]
        at java.lang.Long.parseLong(Long.java:592) ~[?:1.8.0_131]
        at java.lang.Long.valueOf(Long.java:803) ~[?:1.8.0_131]
        at org.apache.storm.shade.org.json.simple.parser.Yylex.yylex(Unknown Source) ~[storm-core-1.1.0.jar:1.1.0]
        at org.apache.storm.shade.org.json.simple.parser.JSONParser.nextToken(Unknown Source) ~[storm-core-1.1.0.jar:1.1.0]
        at org.apache.storm.shade.org.json.simple.parser.JSONParser.parse(Unknown Source) ~[storm-core-1.1.0.jar:1.1.0]
        at org.apache.storm.shade.org.json.simple.parser.JSONParser.parse(Unknown Source) ~[storm-core-1.1.0.jar:1.1.0]
        at org.apache.storm.shade.org.json.simple.parser.JSONParser.parse(Unknown Source) ~[storm-core-1.1.0.jar:1.1.0]
        at org.apache.storm.shade.org.json.simple.JSONValue.parseWithException(Unknown Source) ~[storm-core-1.1.0.jar:1.1.0]
        at org.apache.storm.multilang.JsonSerializer.readMessage(JsonSerializer.java:170) ~[storm-core-1.1.0.jar:1.1.0]
        at org.apache.storm.multilang.JsonSerializer.readShellMsg(JsonSerializer.java:104) ~[storm-core-1.1.0.jar:1.1.0]
        at org.apache.storm.utils.ShellProcess.readShellMsg(ShellProcess.java:125) ~[storm-core-1.1.0.jar:1.1.0]
        at org.apache.storm.task.ShellBolt$BoltReaderRunnable.run(ShellBolt.java:352) [storm-core-1.1.0.jar:1.1.0]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]

Is there a way to bypass it?

Thanks a lot

dan-blanchard commented 7 years ago

@fedelemantuano, in Java the largest value a Long can hold is 9223372036854775807 (2^63 - 1), and your serial number, 16501247490175997961, is larger than that. The simplest workaround is to convert these values to strings before you emit them and then convert them back to long/int where you need them in the next component in your topology.
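
A minimal sketch of that workaround, assuming the report is a nested structure of dicts and lists. The helper name `stringify_big_ints` and the two range constants are hypothetical, not part of streamparse or Storm. It is written for Python 3; under Python 2 the integer check would need to cover `(int, long)`.

```python
# Bounds of a signed 64-bit Java Long.
JAVA_LONG_MIN = -2**63
JAVA_LONG_MAX = 2**63 - 1  # 9223372036854775807


def stringify_big_ints(value):
    """Return a copy of value with out-of-range ints replaced by decimal strings.

    Hypothetical helper: walks dicts, lists and tuples recursively and leaves
    every other type untouched. Tuples come back as lists, as they would after
    a JSON round trip anyway.
    """
    if isinstance(value, bool):
        # bool is a subclass of int; keep it as a boolean.
        return value
    if isinstance(value, int):
        if JAVA_LONG_MIN <= value <= JAVA_LONG_MAX:
            return value
        return str(value)
    if isinstance(value, dict):
        return {key: stringify_big_ints(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [stringify_big_ints(item) for item in value]
    return value


# Same shape as the report from the original question.
report = {"data": [{"ssl": {"cert": {"serial": 16501247490175997961}}}]}

# In the emitting component: sanitize before the tuple is emitted.
safe = stringify_big_ints(report)
serial = safe["data"][0]["ssl"]["cert"]["serial"]

# In the receiving component: convert back when the number is needed.
restored = int(serial)
```

Values that already fit in a Java Long pass through unchanged, so only the fields that would trigger the NumberFormatException become strings. The receiving component has to know which fields may arrive as strings.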

fedelemantuano commented 7 years ago

OK. Thanks a lot for your suggestion.