jeremyandrews / netgrasp

Passive network observation tool

Perform local MAC lookups #8

Open jeremyandrews opened 7 years ago

jeremyandrews commented 7 years ago

Currently netgrasp makes a network query for each MAC it sees (api.macvendors.com), to match it with a vendor. I believe it would be preferable to download publicly available MAC-data and perform this lookup locally, allowing netgrasp to function quicker and without an Internet connection.

For example: http://standards-oui.ieee.org/oui.txt

natej commented 7 years ago

Would you be ok with adding requests as a dependency so we could download your example link http://standards-oui.ieee.org/oui.txt and parse it into a table for local lookups?

I haven't checked their site to see if they're ok with downloading oui.txt. Or should we pull the data from api.macvendors.com instead? Maybe add a MAC "load" command?
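
A rough sketch of what parsing oui.txt into a lookup table could look like. It assumes the `(hex)` lines in the file have the shape `00-00-0C   (hex)		Cisco Systems, Inc`; the function names and the sample lines are illustrative only, not netgrasp code.

```python
import re

# Assumed line shape in oui.txt: "00-00-0C   (hex)\t\tCisco Systems, Inc"
OUI_LINE = re.compile(
    r"^\s*([0-9A-Fa-f]{2})-([0-9A-Fa-f]{2})-([0-9A-Fa-f]{2})\s+\(hex\)\s+(.+?)\s*$"
)


def parse_oui(lines):
    """Build a {"00000C": "vendor"} table from an iterable of oui.txt lines."""
    table = {}
    for line in lines:
        match = OUI_LINE.match(line)
        if match:
            prefix = "".join(match.group(1, 2, 3)).upper()
            table[prefix] = match.group(4)
    return table


def lookup_vendor(table, mac):
    """Look up a MAC such as "00:00:0c:12:34:56" by its first three octets."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac).upper()
    return table.get(digits[:6])


# Hand-written sample in the assumed format; lines that do not match are skipped.
sample = [
    "OUI/MA-L                                                    Organization",
    "00-00-0C   (hex)\t\tCisco Systems, Inc",
    "00000C     (base 16)\t\tCisco Systems, Inc",
    "\t\t\t\t170 West Tasman Drive",
]
table = parse_oui(sample)
print(lookup_vendor(table, "00:00:0c:12:34:56"))
```

Only the `(hex)` lines are used here; the `(base 16)` and address lines carry the same vendor or extra detail and are ignored.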

jeremyandrews commented 7 years ago

Correct, the plan is:

  1. Provide a command allowing users to optionally download/parse oui.txt, at which point it would be used for all MAC lookups;
  2. Otherwise, fall back to the existing API that's currently being used
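
The two-step plan above might be sketched like this. `make_lookup`, `local_table` and `remote_lookup` are hypothetical names; `remote_lookup` stands in for whatever call netgrasp currently makes to the API.

```python
def make_lookup(local_table, remote_lookup):
    """Return a lookup function that uses local data if present, else the API.

    local_table: dict of 6-hex-digit prefix -> vendor, or None if the user
                 never downloaded the data (hypothetical representation).
    remote_lookup: callable taking a MAC string; stands in for the existing
                   API-based lookup.
    """
    def lookup(mac):
        if local_table is not None:
            # Local data was downloaded, so it answers every lookup.
            prefix = mac.replace(":", "").replace("-", "").upper()[:6]
            return local_table.get(prefix)
        # No local data: fall back to the existing remote lookup.
        return remote_lookup(mac)

    return lookup


calls = []


def fake_remote(mac):
    """Test double that records calls instead of touching the network."""
    calls.append(mac)
    return "Remote Vendor"


with_local = make_lookup({"00000C": "Cisco Systems, Inc"}, fake_remote)
without_local = make_lookup(None, fake_remote)
print(with_local("00:00:0c:aa:bb:cc"))
print(without_local("00:00:0c:aa:bb:cc"))
```

Whether an unknown prefix in the local table should also fall through to the API is a design choice the plan leaves open; this sketch does not do so.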

natej commented 7 years ago

So I looked at the ieee.org site and ran some Google searches:

1) The ieee.org site's FAQ: http://standards.ieee.org/faqs/regauth.html#8

"How can I obtain the names and assignments of those companies who have registered an assignment?"

"Please see the public listings available on our web site to view public directories for each active registry."

Leads to: https://regauth.standards.ieee.org/standards-ra-web/pub/view.html#registries

Which has download links for "MAC Address Block Large (MA-L)", "MAC Address Block Medium (MA-M)" and "MAC Address Block Small (MA-S)" in CSV format.

So we'd need to download/combine those 3 files?
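
If the three files were combined, a merge plus longest-prefix lookup could look roughly like this. It assumes each CSV has `Assignment` and `Organization Name` columns, with assignments of 6, 7 and 9 hex digits for MA-L, MA-M and MA-S respectively; the sample rows below are invented placeholders, not real registry entries.

```python
import csv
import io


def load_registry(csv_text, table):
    """Add one registry CSV (assumed column names) to a shared prefix table."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        table[row["Assignment"].strip().upper()] = row["Organization Name"].strip()
    return table


def lookup(table, mac):
    """Try the longest prefix first: MA-S (36 bits), MA-M (28), MA-L (24)."""
    digits = "".join(ch for ch in mac if ch in "0123456789abcdefABCDEF").upper()
    for length in (9, 7, 6):
        vendor = table.get(digits[:length])
        if vendor is not None:
            return vendor
    return None


# Invented sample rows in the assumed column layout.
HEADER = "Registry,Assignment,Organization Name,Organization Address\n"
ma_l = HEADER + 'MA-L,AABBCC,Example Large Org,"1 Example St"\n'
ma_m = HEADER + 'MA-M,AABBCC1,Example Medium Org,"2 Example St"\n'
ma_s = HEADER + 'MA-S,AABBCC123,Example Small Org,"3 Example St"\n'

combined = {}
for text in (ma_l, ma_m, ma_s):
    load_registry(text, combined)

print(lookup(combined, "aa:bb:cc:12:34:56"))
print(lookup(combined, "aa:bb:cc:1f:00:00"))
print(lookup(combined, "aa:bb:cc:99:00:00"))
```

Because the prefixes have different lengths, they can share one dictionary without colliding, and checking the longest length first picks the most specific assignment.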

2) Wireshark has a public OUI lookup tool here: https://www.wireshark.org/tools/oui-lookup.html

Which has a link to their "Wireshark manufacturer database": https://code.wireshark.org/review/gitweb?p=wireshark.git;a=blob_plain;f=manuf

Reading the comments in that file leads me to think using the Wireshark manuf file would be more complete given its history and it being maintained by the Wireshark devs. It appears it would be easier to integrate too, notwithstanding any licensing issues. I don't know what the licensing restrictions would mean for us if we used their manuf file.

jeremyandrews commented 7 years ago

Interesting, indeed they started with the 3 IEEE files, but have added other data as well. My only concern is licensing: that project is under the GPL, netgrasp is under the 2-clause BSD.

natej commented 7 years ago

FYI: I'm working on a patch for this.

jeremyandrews commented 7 years ago

Which source are you planning to use? As the file won't be committed to our code-base, I think it should be fine to use the Wireshark source (it'll only be optionally downloaded by people who choose to download it), so the license shouldn't affect the netgrasp codebase.

As for using requests, I'm curious why we'd need to? Why can't we just use the built-in httplib, rather than adding another external dependency? If you look in mac_lookup you can see where we're currently using httplib with good results.
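
For reference, a download with only the standard library could be sketched as below. This is not the code in mac_lookup; it uses `http.client`, which is the Python 3 name for `httplib`, and the helper names, timeout and User-Agent string are assumptions. It does not follow redirects.

```python
import http.client
from urllib.parse import urlsplit


def open_connection(url, timeout=10):
    """Create (but do not yet open) a connection and return it with the path."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return conn, path


def download(url, timeout=10):
    """Fetch a URL and return the response body as bytes."""
    conn, path = open_connection(url, timeout)
    try:
        # The User-Agent value is a placeholder chosen for this example.
        conn.request("GET", path, headers={"User-Agent": "netgrasp-example"})
        response = conn.getresponse()
        if response.status != 200:
            raise IOError("unexpected HTTP status %d" % response.status)
        return response.read()
    finally:
        conn.close()
```

Calling `download("http://standards-oui.ieee.org/oui.txt")` would return the raw file, provided the server answers with a plain 200 response.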

natej commented 7 years ago

That's awesome feedback. There must be some telepathy going on. :)

That's exactly where I'm at right now and had the same thoughts. Since we're not distributing it, I was going to just pull the Wireshark file and print a message about their project/license for attribution. Seems like the least we could do for all of their work. And this might be more proper from a legal perspective too.

I don't want to bring in any unnecessary dependencies with requests. I had found and looked at the exact code you mentioned and will re-use it. Provided their server isn't doing anything "special" with the headers/content, it shouldn't take any more time and should be straightforward.

I really appreciate your thoughts and guidance on this. It's a big help. I wasn't sure how you'd feel about using the Wireshark file with regards to the licensing. Thanks for clearing that up.

natej commented 7 years ago

I wish I'd looked at the db tables more closely earlier. There's a state table I could've used to store metadata. Not that it's a huge deal, but I wanted the flexibility to write the last update timestamp along with the current config ("local" or "remote", the default is "remote") the user has chosen... and things I hadn't thought of yet. So I created a mac config file directive in the config file that by default points to ~/.netgrasp.mac. And I use that to store the current mac lookup config and metadata.

Another detour I took was maybe using the config file to store the mac config, but then we'd need write access and related api/machinery for that. And ConfigParser doesn't support preserving the comments from the default template config, at least directly. It would take some time to deal with that. Not really a good option.
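
A small illustration of the comment problem, using Python 3's `configparser` (the renamed `ConfigParser` module). The `[Mac]` section and `lookup` option are made-up names for the example, not netgrasp's actual configuration keys.

```python
import configparser
import io

# Hypothetical config fragment with a comment above the option.
original = """\
[Mac]
# where to look up vendors: "local" or "remote"
lookup = remote
"""

parser = configparser.ConfigParser()
parser.read_string(original)
parser.set("Mac", "lookup", "local")

# Writing the parsed config back out drops the comment line.
buffer = io.StringIO()
parser.write(buffer)
rewritten = buffer.getvalue()
print(rewritten)
```

The rewritten text keeps the section and the updated value but not the comment, which is why writing the user's config file from the program would need extra machinery.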

Anyway, I don't like the fact that it's in its own separate mac config file. I know the db is the obvious place. But the code is now mostly done except for parsing/writing it to the db table. And I already have a lot of time in it since I refactored some related files and fixed some unrelated bugs. I keep running into "oh yeah, I'll need to do that" and I need a break. :)

Just thought I'd let you know where I was with things.

Oh yeah, we'll need to parse the MAC prefix and take wildcard bits into account when matching. See the top of the Wireshark data file for details.
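
One way the wildcard matching could work, as a sketch: treat each entry as a 48-bit value plus a mask, and prefer the match with the most mask bits. It assumes prefixes are written either as bare octets (`00:00:0C`) or as a full address with a bit count (`00:1B:C5:00:00:00/36`); the vendor names are placeholders.

```python
def parse_prefix(text):
    """Turn "00:1B:C5:00:00:00/36" or "00:00:0C" into (value, mask) over 48 bits."""
    address, _, bits = text.partition("/")
    octets = address.replace("-", ":").split(":")
    # Without an explicit "/bits", every listed octet is significant.
    mask_bits = int(bits) if bits else len(octets) * 8
    value = int("".join(octets), 16) << (48 - len(octets) * 8)
    mask = ((1 << mask_bits) - 1) << (48 - mask_bits)
    return value & mask, mask


def mac_to_int(mac):
    """Convert "00:1b:c5:00:0a:bc" to a 48-bit integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)


def match(entries, mac):
    """Return the vendor whose prefix matches with the most mask bits, or None."""
    address = mac_to_int(mac)
    best = None
    for value, mask, vendor in entries:
        if (address & mask) == value and (best is None or mask > best[0]):
            best = (mask, vendor)
    return best[1] if best else None


# Placeholder entries in the assumed "prefix<TAB>name" layout.
raw = ["00:00:0C\tVendorA", "00:1B:C5:00:00:00/36\tVendorB", "00:1B:C5\tVendorC"]
entries = []
for line in raw:
    prefix, vendor = line.split("\t")
    value, mask = parse_prefix(prefix)
    entries.append((value, mask, vendor))

print(match(entries, "00:1b:c5:00:0a:bc"))
print(match(entries, "00:1b:c5:01:00:00"))
```

A linear scan is fine for illustration; a real table with tens of thousands of entries would probably want a dictionary keyed by mask length instead.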

jeremyandrews commented 7 years ago

I'm happy to review a PR at any point, even before you're ready for it to be committed. I'll point you toward existing patterns/helpers in the code where they exist already.

All configuration should live in the existing configuration file; I do not want to edit it from the daemon but a future @TODO is the ability to manually edit it and update configuration w/o restarting. It's a very low priority however, as restarting isn't harmful in any way.

As you've noted, state lives in the DB in the state table. There's a helper for storing timestamps.
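
As a generic illustration of keeping this kind of metadata in a key/value state table: the schema, key names and helper functions below are hypothetical and are not netgrasp's actual state table or timestamp helper.

```python
import sqlite3
import time


def set_state(db, key, value):
    """Store a value under a key, replacing any previous value."""
    db.execute(
        "INSERT OR REPLACE INTO state (key, value) VALUES (?, ?)",
        (key, str(value)),
    )
    db.commit()


def get_state(db, key, default=None):
    """Fetch a stored value, or the default if the key is absent."""
    row = db.execute("SELECT value FROM state WHERE key = ?", (key,)).fetchone()
    return row[0] if row else default


# In-memory database with a hypothetical two-column schema.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE state (key TEXT PRIMARY KEY, value TEXT)")

set_state(db, "mac_lookup_source", "local")
set_state(db, "mac_data_updated", int(time.time()))
print(get_state(db, "mac_lookup_source"), get_state(db, "mac_data_updated"))
```

Storing the last-update timestamp this way would remove the need for a separate ~/.netgrasp.mac file.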

Looking forward to seeing your code!

natej commented 7 years ago

Thanks for the feedback and guidance. There will probably be things that need to be added, e.g. mac matching.

I'll get it as close as I can and submit a PR.