robertdavidgraham / masscan

TCP port scanner, spews SYN packets asynchronously, scanning entire Internet in under 5 minutes.
GNU Affero General Public License v3.0

something different about -oJ json format #314

Open JohnsonYan opened 6 years ago

JohnsonYan commented 6 years ago

Hello. In the JSON output format (-oJ), the last line ends with a ','. When I parse the JSON file, I get an error message.

joydragon commented 6 years ago

+1

schrodyn commented 6 years ago

I have this issue too, rather annoying when trying to parse JSON reports. I have had to create a tool to fix each output file.
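A fixer along those lines could be sketched roughly as follows. This is only a minimal sketch, assuming the sole defect is a single trailing comma before the final "]"; the function name load_masscan_json and the sample records are hypothetical and not part of masscan.

```python
import json
import re

# Assumption: the only problem is a "," sitting right before the final "]"
# (ignoring whitespace). Anything else wrong with the file is not handled.
_TRAILING_COMMA = re.compile(r",\s*\]\s*$")


def load_masscan_json(text):
    """Parse -oJ output, tolerating one trailing comma before the last ']'."""
    try:
        return json.loads(text)  # already valid: nothing to fix
    except json.JSONDecodeError:
        # Replace the trailing ",]" with "]" and try once more.
        return json.loads(_TRAILING_COMMA.sub("]", text))


if __name__ == "__main__":
    # Illustrative records only, not real scan output.
    sample = '[\n{"ip": "10.0.0.1", "ports": [{"port": 80}]},\n]\n'
    print(load_masscan_json(sample))
```

Unlike stripping all whitespace from the file, this leaves string values such as banners untouched, because only the very end of the text is rewritten.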

zeepi222 commented 6 years ago

+1. Having the same issue with the trailing ',' in the JSON output. https://jsonlint.com also considers the output invalid JSON.
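The same check can be done locally with Python's standard json module instead of an online validator. A small sketch (check_json is a hypothetical helper name; the exact error message and reported position vary between Python versions):

```python
import json


def check_json(text):
    """Return None if text is valid JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno, colno and msg are attributes of json.JSONDecodeError.
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


if __name__ == "__main__":
    print(check_json('[{"a": 1}]'))      # valid input
    print(check_json('[{"a": 1},\n]'))   # trailing comma before "]"
```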

BBerastegui commented 6 years ago

+1 here. The json output is not properly formatted.

I made this ugly hack in my python post-processing for cleaning that last trailing comma:

import json

with open("/tmp/masscan_output.json") as f:
    raw_data = f.read()

# Remove all whitespace and the trailing ",]", then restore the "]".
# Caveat: this also removes whitespace inside string values (e.g. banners).
json_data = json.loads("".join(raw_data.split()).rstrip(",]") + "]")

mzpqnxow commented 6 years ago

Agree that the trailing comma needs to be fixed. It’s extremely annoying, and I am among the many who have submitted PRs to fix it.

That said, I’d like to suggest that the format go back to JSON-Lines, as it originally was early on in development, rather than one JSON blob. JSON-Lines avoids the memory pressure and general performance problems of loading a huge scan in a single json.load() operation, for example when your JSON file is 100MB.

Basically, I mean the original format that everyone complained about as being invalid JSON, which was actually JSON-Lines. Sorry, it’s off topic, but worth mentioning. In my experience, processing one line at a time is surprisingly faster for huge scans, though I suppose it depends on your JSON implementation and language.

This format is what masscan originally used, in case anyone doesn’t know/recall.
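For illustration, line-at-a-time processing of JSON-Lines input can be sketched like this. It assumes one complete JSON object per line, as described above; iter_json_lines and the file name scan.jsonl are hypothetical.

```python
import json


def iter_json_lines(lines):
    """Yield one parsed record per non-empty line of JSON-Lines input."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        yield json.loads(line)


if __name__ == "__main__":
    # "scan.jsonl" is a placeholder path for a JSON-Lines scan file.
    with open("scan.jsonl") as f:
        for record in iter_json_lines(f):
            print(record)
```

Because the generator holds only one record at a time, memory use stays roughly constant regardless of file size, whereas json.load() builds the entire list in memory first.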

ravenium commented 5 years ago

Fixed in 1.0.5+, per release notes.

superorc commented 5 years ago

Still getting invalid JSON:

[root@4b1da7dd8a65 masscan]# file all-scan8443.json 
all-scan8443.json: ASCII text, with very long lines
[root@4b1da7dd8a65 masscan]# bin/masscan --version

Masscan version 1.0.6 ( https://github.com/robertdavidgraham/masscan )
Compiled on: Sep 13 2019 04:35:19
Compiler: gcc 4.8.5 20150623 (Red Hat 4.8.5-36)
OS: Linux
CPU: unknown (64 bits)
GIT version: 1.0.5-86-ga025970
[root@4b1da7dd8a65 masscan]# bin/masscan 0.0.0.0/0 -p8443 --banners --rate 10000000 -oJ ./all-scan8443.json --exclude 255.255.255.255

paalbra commented 3 years ago

I'm confused when it comes to the versioning. Is this issue fixed in some release/commit or not? :S

On CentOS 8:

[root@8d6b7c21b06f ~]# dnf list --installed masscan
Installed Packages
masscan.x86_64                         1.0.5-2.el8                         @epel
[root@8d6b7c21b06f ~]# masscan --version

Masscan version 1.0.4 ( https://github.com/robertdavidgraham/masscan )
Compiled on: Jan 13 2020 14:16:23
Compiler: gcc 8.3.1 20190507 (Red Hat 8.3.1-4)
OS: Linux
CPU: unknown (64 bits)
GIT version: unknown

@superorc's comment above mentions both Masscan version 1.0.6 and GIT version: 1.0.5-86-ga025970.

paalbra commented 3 years ago

Looks like it's resolved on a Fedora machine with:

$ dnf list --installed masscan
Installed Packages
masscan.x86_64                        1.3.1-2.fc33                        @updates
$ masscan --version

Masscan version 1.3.1 ( https://github.com/robertdavidgraham/masscan )
Compiled on: Jan 27 2021 00:00:00
Compiler: gcc 10.2.1 20201125 (Red Hat 10.2.1-9)
OS: Linux
CPU: unknown (64 bits)
GIT version: unknown

EDIT: I did need to run ln -s /usr/lib64/libpcap.so.1 /usr/lib64/libpcap.so, though. That's another issue, I guess (#479).