constellation-app / constellation

A graph-focused data visualisation and interactive analysis application.
https://constellation-app.com
Apache License 2.0

Support importing PCAP files #747

Closed forestfairy closed 3 years ago

forestfairy commented 4 years ago

Prerequisites

Description

v2.0.0-rc2: Unable to import .pcap files directly from Wireshark.

Steps to Reproduce

  1. Capture traffic in Wireshark and save it as a .pcap file
  2. Use File>Import to attempt to import the .pcap file
  3. There is no PCAP option in File>Import
  4. If the file is opened through File>Open instead, it is not read correctly

Expected behaviour: The file opens much the same as a .csv would.

Actual behaviour: If an attempt is made to open the file through the File>Open path, the following is displayed:

image

If the user selects proceed, the following is displayed:

image

Reproduces how often: 100%

Additional Information

Any additional information, configuration or data that might be necessary to reproduce the issue.

arcturus2 commented 4 years ago

Thanks for your interest @forestfairy. Presently Constellation does not have a parser for importing PCAP files though this sounds like an interesting idea and worth exploring further. I noticed you created #748 and #749 which looked to be related to PCAP files too. Are you able to provide an example of what you expect the graph to look like after importing a PCAP file?

This website has sample pcap files for download. If you could pick a few and perhaps draw a graph (via paint, gimp or Constellation) of what you want the plugin to generate for you, that would be great so we can get a sense of what you want to see.

Or would the ability to bring up the Delimited File Importer be more appropriate?

forestfairy commented 4 years ago

Hey @arcturus2, I think you may overestimate my drawing skills. I'm not sure how different it would look to the capability that Constellation already has with .csv files, and I envision something very similar to that. It's just the usability of being able to pull those files straight in. When I was doing this initially, I was analysing the data, looking at things like:

I mean, this is above and beyond for Constellation, given that you can already export the WS files as .csv. It's just the luxury of making it friendly with other tools. Having something like the delimited file import (similar to the tool in xl) would be an extra luxury, as it would give you the opportunity to do some data cleansing on that last column in those files as you import, to extract the information there and then go on to mapping it in your analysis.

The files in #748 and #749 were originally .pcap but were exported as .csv . I left in that information as I wondered if it may have some bearing on the error...

serpens24 commented 4 years ago

Hi @forestfairy - I've spoken to @arcturus2 and I'm going to start looking at adding this feature. I'm going to do a bit of background research on PCAP parsing first, but may be in touch in the near future to get a feel for your use cases, as they may influence my design.

serpens24 commented 4 years ago

Attached is a sample (default) Wireshark view of some sample data found in the link provided by @arcturus2.

image

An immediate use of this data that I can see would be setting to/from as to/from nodes, and using timestamp/protocol/length/packet content to characterise transactions between these IPs over time. I'd suggest this leads to a possible wider Constellation enhancement that allows default column mappings when importing data (probably through the 'Delimited File Importer').

And below is a sample of some of this network traffic imported into Constellation (I converted to CSV as an intermediate step). This data hasn't had a timestamp applied, but if it did, it would most likely be pretty easy to replay traffic between IPs sequentially to get a feel for when exchanges occurred and what prompted them. image

serpens24 commented 4 years ago

I have started by prototyping with the pkts library (https://github.com/aboutsip/pkts), which produced the following results with the first sample dataset I downloaded, fuzz-2006-06-26-2594.pcap: image

However, I have found some other sample files throwing exceptions, which potentially relates to packets not identified as TCP or UDP in Wireshark, although I need to confirm this.

There are other libraries worth trying, such as https://sourceforge.net/projects/jnetpcap/ which I intend to try before moving on to diagnose any pkts errors.

https://stackoverflow.com/questions/26978618/java-pcap-file-parser-library describes a few of the packages.

I tried including jnetpcap. It doesn't reside in either of the currently configured repos (central.maven.org or repo.osgeo.org), but in https://clojars.org/repo/. I tried configuring this repo in ivysettings.xml and ivy.xml, but I get errors indicating the required package "" can't be found. I'm unsure why, as manually entering the URL indicated in the NetBeans install log does resolve to valid files.

serpens24 commented 4 years ago

Some examples of errors thrown in pkts library are shown below:

serpens24 commented 4 years ago

Note to self: the underlying io.pkts code occasionally prints stack traces from its own code, e.g. SipInitialLine.java:101 contains "e.printStackTrace();;"

I'm not sure we can suppress this.

serpens24 commented 4 years ago

@arcturus2, @forestfairy - Branch https://github.com/constellation-app/constellation/tree/feature/issue747-PCAP-import has been pushed providing a high level proof of concept of PCAP import.

This uses the io.pkts library. It does basic parsing of PCAPs to extract src/dest ip/port/type, and identifies the TCP/UDP protocol (but doesn't dig deeper than that at this point; I suspect this requires some bespoke logic).
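To make the extraction described above concrete, here is a minimal, hedged sketch of pulling src/dest IP and port plus TCP/UDP out of a raw frame by hand. It deliberately does not use io.pkts (whose API is not shown in this thread) and is not the code on the branch. `FlowKey` and `fromEthernetFrame` are hypothetical names, and the sketch assumes an untagged Ethernet II frame carrying unfragmented IPv4.

```java
/** Hypothetical value class for illustration; not an io.pkts or Constellation type. */
final class FlowKey {
    final String sourceIp;
    final int sourcePort;
    final String destinationIp;
    final int destinationPort;
    final String protocol; // "TCP" or "UDP"

    private FlowKey(String sourceIp, int sourcePort,
                    String destinationIp, int destinationPort, String protocol) {
        this.sourceIp = sourceIp;
        this.sourcePort = sourcePort;
        this.destinationIp = destinationIp;
        this.destinationPort = destinationPort;
        this.protocol = protocol;
    }

    /**
     * Returns the flow fields of an untagged Ethernet II frame carrying IPv4 TCP or UDP,
     * or null for anything else (ARP, IPv6, ICMP, truncated frames and so on).
     * Assumption: IP fragmentation and VLAN tags are ignored in this sketch.
     */
    static FlowKey fromEthernetFrame(byte[] frame) {
        final int ipOffset = 14; // length of the Ethernet II header
        if (frame == null || frame.length < ipOffset + 20) {
            return null;
        }
        if (u16(frame, 12) != 0x0800) { // EtherType 0x0800 is IPv4
            return null;
        }
        int versionAndIhl = frame[ipOffset] & 0xFF;
        int headerLength = (versionAndIhl & 0x0F) * 4; // IHL counts 32-bit words
        if ((versionAndIhl >> 4) != 4 || headerLength < 20) {
            return null;
        }
        int protocolNumber = frame[ipOffset + 9] & 0xFF;
        String protocol;
        if (protocolNumber == 6) {
            protocol = "TCP";
        } else if (protocolNumber == 17) {
            protocol = "UDP";
        } else {
            return null; // neither TCP nor UDP
        }
        int transportOffset = ipOffset + headerLength;
        if (frame.length < transportOffset + 4) {
            return null;
        }
        // TCP and UDP both start with source port then destination port.
        return new FlowKey(
                ipv4(frame, ipOffset + 12), u16(frame, transportOffset),
                ipv4(frame, ipOffset + 16), u16(frame, transportOffset + 2),
                protocol);
    }

    /** Reads a big-endian unsigned 16-bit value. */
    private static int u16(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF);
    }

    /** Formats four bytes as a dotted-quad IPv4 address. */
    private static String ipv4(byte[] data, int offset) {
        return (data[offset] & 0xFF) + "." + (data[offset + 1] & 0xFF) + "."
                + (data[offset + 2] & 0xFF) + "." + (data[offset + 3] & 0xFF);
    }
}
```

Returning null for anything that is not IPv4 TCP/UDP is one possible way to skip unsupported packets rather than throw; a real parser might prefer to report them.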

I suspect that, if we are happy with the columns that are generated, we may be able to modify the file importer to automatically assign columns to nodes/edges without the user needing to drag headings across.
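As a rough illustration of what automatic column assignment could look like, the sketch below maps known column headers to graph roles. Everything in it is an assumption: `DefaultColumnMapping` and `Role` are hypothetical names, the headers are the default Wireshark CSV export columns rather than the columns the branch generates, and Constellation's real importer classes are not used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; Constellation's real importer classes are not used here. */
final class DefaultColumnMapping {
    /** Where a column would land on the graph. */
    enum Role { SOURCE_NODE, DESTINATION_NODE, TRANSACTION, UNMAPPED }

    // Assumed defaults, keyed by the default Wireshark CSV export headers.
    private static final Map<String, Role> DEFAULTS = new LinkedHashMap<>();
    static {
        DEFAULTS.put("Source", Role.SOURCE_NODE);
        DEFAULTS.put("Destination", Role.DESTINATION_NODE);
        DEFAULTS.put("Time", Role.TRANSACTION);
        DEFAULTS.put("Protocol", Role.TRANSACTION);
        DEFAULTS.put("Length", Role.TRANSACTION);
        DEFAULTS.put("Info", Role.TRANSACTION);
    }

    /** Assigns a role to each header, leaving unknown headers for the user to map. */
    static Map<String, Role> assign(String[] headers) {
        Map<String, Role> assigned = new LinkedHashMap<>();
        for (String header : headers) {
            Role role = DEFAULTS.get(header.trim());
            assigned.put(header, role == null ? Role.UNMAPPED : role);
        }
        return assigned;
    }
}
```

Unknown headers fall back to `UNMAPPED`, so the user could still drag those across manually.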

This may be a useful feature in general @arcturus2 if we ever need to set up 'hard coded' imports from known file types.

I also noted the library was detecting PCAP frames/packets which it wasn't completely happy with, i.e. it was identifying suspected packet corruption in the sample data. Maybe this is because the sample data is 15 years old. I intend to throw some new data at it.

@arcturus2 - if you want to have a look, grab the branch and have a play. The only changes are to ivy.xml and the addition of a new PCAPImportFileParser.

NOTE: I don't know if Constellation allows it, but I reckon an animation playing data "over time", i.e. running through the timeline from start to finish and showing traffic, would have some use somewhere.

serpens24 commented 4 years ago

A couple additional sample files to download and throw at constellation: https://s3.amazonaws.com/tcpreplay-pcap-files/smallFlows.pcap https://s3.amazonaws.com/tcpreplay-pcap-files/bigFlows.pcap

Attached is a screenshot of bigFlows.pcap containing 3410 nodes and a tad over half a million transactions. image

serpens24 commented 4 years ago

image

The io.pkts package isn't perfect at extracting all fields, so some data has to be extracted using raw byte checks. The above image documents the standard Ethernet II structure, showing that the first 14 bytes contain the destination MAC address, source MAC address, and EtherType. These can be extracted manually from the packet payload.
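A minimal sketch of that manual extraction, assuming the raw frame is available as a byte array and is an untagged Ethernet II frame. `EthernetHeader` is a hypothetical helper written for illustration, not part of io.pkts or Constellation; the three named EtherType values are the standard ones for IPv4, ARP and IPv6.

```java
/** Hypothetical helper for illustration; not part of io.pkts or Constellation. */
final class EthernetHeader {
    static final int HEADER_LENGTH = 14;

    final String destinationMac;
    final String sourceMac;
    final int etherType;

    private EthernetHeader(String destinationMac, String sourceMac, int etherType) {
        this.destinationMac = destinationMac;
        this.sourceMac = sourceMac;
        this.etherType = etherType;
    }

    /** Parses the first 14 bytes of an untagged Ethernet II frame. */
    static EthernetHeader parse(byte[] frame) {
        if (frame == null || frame.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("Frame shorter than 14 bytes");
        }
        String destination = formatMac(frame, 0); // bytes 0-5
        String source = formatMac(frame, 6);      // bytes 6-11
        // Bytes 12-13 hold the EtherType as a big-endian unsigned 16-bit value.
        int etherType = ((frame[12] & 0xFF) << 8) | (frame[13] & 0xFF);
        return new EthernetHeader(destination, source, etherType);
    }

    /** Formats six bytes as a lowercase colon-separated MAC address. */
    private static String formatMac(byte[] data, int offset) {
        StringBuilder sb = new StringBuilder(17);
        for (int i = 0; i < 6; i++) {
            if (i > 0) {
                sb.append(':');
            }
            sb.append(String.format("%02x", data[offset + i] & 0xFF));
        }
        return sb.toString();
    }

    /** Names a few common EtherType values; others are shown as hex. */
    static String etherTypeName(int etherType) {
        switch (etherType) {
            case 0x0800: return "IPv4";
            case 0x0806: return "ARP";
            case 0x86DD: return "IPv6";
            default: return String.format("0x%04x", etherType);
        }
    }
}
```

For example, `EthernetHeader.parse(frame).etherType` could be passed to `etherTypeName` to label a transaction. Frames carrying an 802.1Q VLAN tag shift the real EtherType by four bytes, which this sketch does not handle.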

serpens24 commented 4 years ago

https://en.wikipedia.org/wiki/EtherType - gives a summary of some etherTypes.

arcturus2 commented 3 years ago

Again, thank you for this work @serpens24, and for also contributing it to the ACSC Cyber repo (https://github.com/AustralianCyberSecurityCentre/constellation_cyber_plugins/pull/4). I'll look to create a new version of Constellation and the Cyber version soon.