mentalisttraceur / python-macaddress

BSD Zero Clause License
19 stars 5 forks source link

add nic (network interface card) property #26

Open danmar2000 opened 1 year ago

danmar2000 commented 1 year ago

class _StartsWithOUI contains oui property, I added nic property.

mentalisttraceur commented 1 year ago

Thanks for the contribution!

I like the idea of making this easy and better for the situations that need it, and it helps to have an initial design+implementation to get things rolling. Now I just gotta deeply think about the design/approach.

Do you mind sharing how you're using the NIC information, why you need to handle it separately, and how it helps you to have it as a distinctly-typed object? That would help me think about it better.


In the meantime, here are some initial thoughts/reactions (you don't have to do any code changes based on these thoughts yet - I might "change my mind" multiple times as we talk+think it over):

  1. How do we expect this to work for any address size other than 48-bit EUI/MAC?

    Since you have size = 24 in the NIC class, what happens with the current implementation if you do EUI64('01-23-45-67-89-ab-cd-ef').nic or CDI32('01-23-45-67').nic? Off the top of my head, it would be an error in the first case since 0x6789ABCDEF is too large for a 24-bit NIC, and it would produce the misleadingly/wrongly -sized NIC('00-00-67') in the second case.

    My current intuition is that each concrete HWAddress subclass should have its own NIC class. The OUI is universally sized, but the identifier part varies by address (and besides having a different size, it's also semantically not interchangeable - two device identifiers are not the same type of thing if they came from two different address types). A single class which somehow knows the class/size of the full address it goes with would be fine if the use-case for actually type-checking them is low.

  2. I will probably change the name "NIC" to something more pedantically precise... Off the top of my head, I think "device identifier" or something. (I'm going to look up the terminology they use in the formal standards, and probably base the name on that.)

  3. This has some big-picture abstract parallels with #8 (there, a lot of my thoughts were basically treating these address classes as fixed-size integers which were unconventionally MSB-aligned - here, the device-identifier is basically a more typical LSB-aligned fixed-width integer) and with #7 (in that case, I keep thinking about an address range class which also needs to know the size/class of the full address it corresponds to).

danmar2000 commented 1 year ago

Thank you for your feedback!

I'm currently working on a network analyzer, and one of its key features is to analyze the devices involved in the captured data. When creating a Device object, I have a particular interest in the manufacturer information, which I can easily obtain from the OUI. Additionally, I also require the device's identifier. While the complete physical address could serve as a unique identifier, it feels redundant to use it when the OUI has already been extracted

  1. Yes, I made a mistake, this implementation won't be suitable for physical addresses that don't consist of 48 bits. I think that implementing a NIC per each HWAddress is a good idea. In this way, the property nic can be generic.

  2. I agree, "nic" may not be the most appropriate name. After reviewing the EUI48/EUI64 and CDI32/CDI40 standards, I noticed they use the term "EI" (Extension Identifier). Perhaps using "EI" as the name would be more suitable in this context.

  3. I find this really interesting and would love to dive into the code and read your comments. If you're open to the idea, I would also like to add some suggestions and provide feedback on your own suggestion.

mentalisttraceur commented 1 year ago

Suggestions and feedback are totally welcome! That's a big part of why I put those thoughts in public GitHub issues.

Thanks for the description of your use case! For whatever it's worth -and I'm not saying this to dismiss adding this feature, just sharing design thoughts- in that situation I would usually just use the MAC address until it became a problem, especially in Python, although it might depend on the details. [Click to expand if you want to read why.] I agree that it seems wasteful/redundant, in the sense that it's repeating information. And in something like C, if I had a struct which needed to store the whole MAC address but needed the three bytes of OUI as a separate member, I would think similarly, and consider a separate field for just the extension identifier bytes - this is assuming of course that it really helped clarity or performance to have those bytes pre-separated... in most situations I'd just store all the MAC address bytes in one integer and do the one bit shift or mask when I needed them separate (and that shift/mask has good odds of being free depending on what the CPU instructions support, how the CPU is pipelined, and what optimizations the compiler can figure out). In Python, things are a bit different. I want to discourage focusing on performance first, but it's worth noting that the MAC address object is already allocated and initialized. Assigning it as an attribute just copies the pointer/address/reference and increments a reference counter. So `my_device.identifier = the_mac_address` is strictly less overhead than `my_device.identifier = the_mac_address.nic`, since the latter requires another object to be allocated and initialized at least (and each Python object is... small, but each one takes up a few integers' worth of memory). But steering away from performance and to semantics and generality - I kinda like things that remain correct in more situations, and if the device identifier is the full MAC address, then that's one less thing that might need to change if the code ever evolves in a way that causes devices with different OUIs to cross paths. And in that bigger-picture sense it's not redundancy, it's adding value: resilience against code churn and evolving requirements, reusability in more situations.
Re: "extension identifier" - yeah that's what I found too, and it seems like the right direction. I don't want to shorten it all the way to `.ei` though, because it's not really widespread for people to say or write "EI" the way it is with "OUI" - a person with only a passing exposure to the problem space can *recognize* what `.oui` means, but if I had seen `.ei` in code prior to this discussion I would have to *think* or look up stuff to figure out what it means. I think if I had to pick a name right now, I'd go with `.identifier`. [Click to expand you want to read why.] There's a part of me that wants to err hard on the side of self-describing unabbreviated naming: `.extension_identifier`. But I do understand that would be annoyingly long and verbose in some contexts, and it's inconsistent with `.oui` which would trip people up sometimes. One thought I'm having is supporting both super short and fully verbose names - so both `.extension_identifier` and `.ei` would be valid. For `.oui` I could really lean into it and add `.organizationally_unique_identifier` as an alias. In Python that's super easy because once we have a function named `extension_identifier` decorated with `@property`, we can just do `ei = extension_identifier` in the class definition - there's an analog in CLI programs' long and short options - the short version is good for familiar users quickly writing and trying things (for example in the Python REPL, one-off scripts, etc), the long version is often better for people who have to read and maintain the code. But I'd rather find a solution that doesn't want for both a long and a short name - in particular, I notice that this becomes a non-issue if we dropped the attribute-access way in favor of the ideas I had been entertaining in #8 . Another thought I'm having is that unless I had read the standard terminology, I would not think "extension" means "the rest/most of the address - everything that's not part of the OUI". When I see extension I think, well, *extension* - some extra, an addition, auxillary function, etc. I can sorta bend my mind to make "extension" make sense here, but it feels awkward. So in that sense, `.identifier` seems better. I also was really into naming it `.non_oui` or `.not_oui` for a few minutes, but I think it's a little too ambiguous for optimal self-explanation. Names like that are sometimes booleans describing the object they're on, for example.
mentalisttraceur commented 1 year ago

This is still live in my head, I just haven't gotten around to giving it much thought.

@danmar2000 You mentioned having thoughts on the linked issue/discussion? I'd still love to hear those.

mentalisttraceur commented 1 year ago

Thinking about it again, .ei is probably fine.

I think I was giving too much weight to "I don't know what this is and I'm now frustrated/slowed!" abbreviation downsides. It's probably worth the benefits of consistency (.oui and .ei make sense together) and so on.