netbox-community / netbox

The premier source of truth powering network automation. Open source under Apache 2. Public demo: https://demo.netbox.dev
http://netboxlabs.com/oss/netbox/
Apache License 2.0
15.68k stars 2.53k forks source link

Automatically create direct assignments of IP addresses to their parent prefixes #7845

Open jeremystretch opened 2 years ago

jeremystretch commented 2 years ago

NetBox version

v3.0.10

Feature type

Change to existing functionality

Proposed functionality

Currently, NetBox determines the relationships between prefixes and IP addresses dynamically, leveraging PostgreSQL query logic to find individual IPs which exist "within" each prefix (filtered by VRF): there is no direct relationship in the database from an IP address to its parent prefix.

This issue proposes effecting a direct ForeignKey field named prefix on the IPAddress model relating to the Prefix model. This will be for internal use only, and not editable via any form or API. The relationship will be set automatically whenever a prefix or IP address is created, modified, or deleted. (Django signals will be leveraged to detect these events.) It should be possible to perform these updates via a single bulk query for each object being modified.

For example, when creating the prefix 192.0.2.0/24 in VRF A, a bulk query will automatically update the prefix field of any IP addresses within this prefix and VRF to reference the new prefix. Similarly, when a new IP address is created, NetBox will assign it to the closest matching prefix.

It may make sense to extend this concept to the Prefix model as well, automatically defining the parent prefix for each child prefix.

(This proposal is related to #7451 but more involved. If this proposal is accepted, #7451 will likely be closed.)

Use case

The introduction of a ForeignKey relationship from IPAddress to Prefix will make querying child IPs and calculating prefix utilization much more efficient.

Database changes

Add a prefix ForeignKey field to the IPAddress model relating to Prefix.

External dependencies

No response

IdRatherStand commented 2 years ago

Would this allow us to reference an IP's prefix (and related prefix values) within the IP Address permissions constraints? As an example at the moment we are unable to grant IP Permissions based on the parent prefix, an example of the constraint we would look to use (on IPAM | IP Address) could be:

{"prefixcustom_field_datafieldA":"XXX"}

At present we can't find a way to do this so if the answer is yes a big thumbs up from me.

DanSheps commented 2 years ago

It most likely would, yes

jeremystretch commented 2 years ago

Yes, but you'd have to be extremely careful about it. For example, if you create 192.168.0.0/16 and 192.168.0.0/24, the former will be assigned as the parent of the later. However, if someone later creates 192.168.0.0/20, it will become the new parent for 192.168.0.0/24.

At present we can't find a way to do this so if the answer is yes a big thumbs up from me.

There are various net_* lookups that you can use to query by parent prefix. It's off-topic for this FR, however I invite you to start a separate discussion with the details of what you're trying to do.

jeremystretch commented 2 years ago

The introduction of a ForeignKey relationship from IPAddress to Prefix will make querying child IPs and calculating prefix utilization much more efficient.

After some experimentation with prefix parent assignment, I'm not sure this really buys us much. Querying child objects by foreign key relation doesn't seem to be significantly faster than querying by IP scope. I'll do some more testing, but it's looking like we might need to cache the utilization stats directly on the prefix objects instead.

DanSheps commented 2 years ago

I do think this allows us to tweak things in the future with regards to assignments and it reduces a bit of the logic burden. Maybe not as much for IP -> Prefix but for Prefix -> Prefix.

candlerb commented 2 years ago

I think that an explicit parent link from IP to prefix, and from prefix to parent prefix, would be a very hard invariant to maintain. There are a whole bunch of cases to consider:

(It's somewhat simpler if you're only considering IP addresses linked to prefixes, not prefixes to prefixes)

A large number of objects may need to be updated within the same transaction. However, what worries me more is races between overlapping transactions. Consider:

I am pretty sure that the IP address created in transaction A will end up with the wrong parent. I don't see a solution, short of doing global table locks for all such transactions. (A partial solution would be to forbid changing the address and VRF on any existing address or prefix object, forcing you to delete and recreate. But this doesn't solve the problem of a new prefix being inserted).

In general: my view is that if an invariant cannot be enforced in the database itself, then sooner or later it is going to break. People have observed this already (#6079)

DanSheps commented 2 years ago

We do have that link, it is just more of a soft link, and sometimes it breaks down when it comes to specific hierarchy (Global Prefix, parent, sub prefix in VRF doesn't display as it should on the prefix list when not filtered (should be listed as a child and is not).

This soes simplify things like permissions as you have that direct link now between prefix and IP or prefix and parent prefix. This makes plugin development, script/report development easier for the end user.

candlerb commented 2 years ago

Are we just alluding to limitations in the Django ORM? In principle, primary_ip__prefix__tenant: "foo" is a valid constraint, even if Django doesn't allow it (e.g. because it doesn't know how to turn it into SQL)

This first option which might come to mind is to add a property prefix on the IPAddress object, which returns the nearest enclosing prefix (with the longest prefix length). You'd be right that this is inefficient for certain uses, e.g. when you are building a table and you have to touch the database separately for each row. However, I think it would be possible instead to create a Postfix view with the correct join built in, in which case, the database would give the right answer every time, and the query optimiser would sort out the rest. (Also, the number of prefixes is relatively low and likely to remain cached).

Another point to raise is that Netbox currently duplicates the information about the prefix on each IP address object anyway. An "IP address" in Netbox isn't 192.168.1.1, it's 192.168.1.1/24. Using that information, you can calculate the enclosing prefix (e.g. 192.168.1.0/24), so there's no need for a foreign key reference anyway.

Personally, I hate this duplication of information. Forcing people to enter a prefix length means that instead of "1.2.3.4" they'll always enter "1.2.3.4/24"; if the actual enclosing prefix is a /26 then this information is wrong. So I have to run a report periodically to highlight all the errors, and then fix them. But this is a battle I lost long with Jeremy a long time ago :-)

DanSheps commented 2 years ago

Are we just alluding to limitations in the Django ORM? In principle, primary_ip__prefix__tenant: "foo" is a valid constraint, even if Django doesn't allow it (e.g. because it doesn't know how to turn it into SQL)

If it can't SQL it, you can't use it in places like permissions. Just because of the way the object based permissions are evaluated.

This first option which might come to mind is to add a property prefix on the IPAddress object, which returns the nearest enclosing prefix (with the longest prefix length). You'd be right that this is inefficient for certain uses, e.g. when you are building a table and you have to touch the database separately for each row. However, I think it would be possible instead to create a Postfix view with the correct join built in, in which case, the database would give the right answer every time, and the query optimiser would sort out the rest. (Also, the number of prefixes is relatively low and likely to remain cached).

Again, properties don't allow for use in object based permissions as one problem with this. I don't know if the ORM can consume views to use as part of a model.

Another point to raise is that Netbox currently duplicates the information about the prefix on each IP address object anyway. An "IP address" in Netbox isn't 192.168.1.1, it's 192.168.1.1/24. Using that information, you can calculate the enclosing prefix (e.g. 192.168.1.0/24), so there's no need for a foreign key reference anyway.

Parent prefix doesn't always match the address prefix length.

Personally, I hate this duplication of information. Forcing people to enter a prefix length means that instead of "1.2.3.4" they'll always enter "1.2.3.4/24"; if the actual enclosing prefix is a /26 then this information is wrong. So I have to run a report periodically to highlight all the errors, and then fix them. But this is a battle I lost long with Jeremy a long time ago :-)

No argument about duplication

candlerb commented 2 years ago

FWIW, there is a radically different way to link IP addresses to prefixes:

e.g. if the prefix linked to is 192.168.0.0/23, and the offset is 258, the IP address is 192.168.1.2/23 (which isn't stored anywhere)

This model is nice if you are frequently renumbering networks.

Searching for an IP address becomes a little trickier internally, but not massively so. If you want to search for 192.168.1.50 then you find all prefixes which contain 192.168.1.50 (which Postgres can do efficiently) then search for the relevant (prefixID, offset) pairs.

Downside: every IP address must be associated with a prefix. A random individual IP address needs either a standalone /32 or /128 prefix, or to be recorded as a huge offset from 0.0.0.0/0 or ::

Anyway, I don't expect it to be done, but thought I'd mention it as an example of how the explicit link from IP to Prefix can be useful.

jeremystretch commented 2 years ago

Bumping this from v3.2 as I don't have the resources to work on this right now.

Domoninic commented 2 years ago

Would this allow the case in plugin forms you could have :

    vrf = DynamicModelChoiceField(
        queryset=VRF.objects.all(),
        null_option="Global",
        required=False,
    )
    prefix = DynamicModelChoiceField(
        queryset=Prefix.objects.all(),
        query_params={"vrf_id": "$vrf"},
    )
    selecetd_ip = DynamicModelChoiceField(
        queryset=IPAddress.objects.all(),
        query_params={"prefix_id": "$prefix"},
    )

Or in scripts you could have:

    vrf = ObjectVar(
        model=VRF,
        null_option="Global",
        required=False,
    )
    prefix = ObjectVar(
        model=Prefix,
        query_params={"vrf_id": "$vrf"},
    )
    selecetd_ip = ObjectVar(
        model=IPAddress,
        query_params={"prefix_id": "$prefix"},
    )

So the selected-ip field is restricted to you selected prefix in the same way the prefix list is restricted to the selected VRF ?

Two cases where I think this would be welcome as there seems to be no straight forward way to do this at present.

DanSheps commented 1 year ago

@jeremystretch I had an idea awhile back.

While the "automagic" management of the prefix assignment might be useful, I think it might be more beneficial to instead allow manual management of the prefix assignment. I also think we should be including prefix<>prefix in this FR as well.

This would eventually allow deprecation of the "VRF" field on the IP address model itself.

Pros:

Cons:

I would be willing to undertake this, even in 3.4.

candlerb commented 1 year ago

This would eventually allow deprecation of the "VRF" field on the IP address model itself.

I'd be very interested to see specifically what you have in mind, in terms of what the final models would look like.

I'd suggest that if an IP address gets its VRF membership from its enclosing parent prefix, then it should also get its prefix length from it as well. That is, an "IP address" becomes 1.2.3.4 instead of 1.2.3.4/24, and is contained by prefix 1.2.3.0/24. However I've argued for that in the past, and it was shot down.

elipsion commented 1 year ago

I'd suggest that if an IP address gets its VRF membership from its enclosing parent prefix, then it should also get its prefix length from it as well. That is, an "IP address" becomes 1.2.3.4 instead of 1.2.3.4/24, and is contained by prefix 1.2.3.0/24.

This would break if you use sub-prefixes to group addresses together (rather than ranges). For example: My routable network is 192.168.0.0/24 (gateway at .1, broadcast at .255). I then split the network into /28 ranges to group hosts together for whatever reason (perhaps firewall rules?). The addresses "should" then be /24, even though their smallest containing prefix is /28.

elipsion commented 1 year ago

While the "automagic" management of the prefix assignment might be useful, I think it might be more beneficial to instead allow manual management of the prefix assignment.

This presents a quite divisive design decision to be made:

  1. Allow the user to set a parent prefix manually and calculate parent if they don't.
  2. Require the user to set a parent prefix and leave the parent as null if they don't.
  3. Store two fields, one calculated and the other user specified

Choosing 1 will require also storing if the parent was specified manually, and then referencing that bit in all database cleaning calculations, such as removing, adding, and resizing a prefix. It will also require some careful UI design to make sure it's clear when a parent is defined vs. calculated. Choosing 2 is going to require a lot of work from the user to keep their data in a usable state. Choosing 3 will put a lot of cognitive burden on someone who is writing queries against the data model, making them always have to consider if they want to prefer the calculated or the assigned prefix, or if they want to prefer the assigned and look at the calculated as a backup. Recursive queries will be even messier.

Data validation is also going to be an absolute dog. What happens if a user wants to move prefixes between VRFs? Should all child data tag along? Should we put the database in an inconsistent state if the user wants to move some of the children over, but not all of them? Do we clear the parent data for the children and have the user manually assign parents again if they wanted it to move also?

DanSheps commented 1 year ago

Just want to chime in here now that this is on the milestone:

Here is what I would envision as a data model for this:

class Prefix: parent = ForeignKey(Prefix) vrf = ForeignKey(VRF) prefixes = ForeignKeyRelation(Prefix(parent)) # Relation to the "parent" attribute on this class ip_addresses = ForeignKeyRelation(IPAddress(prefix)) # Relation to the "prefix" attribute on the IPAddress class

class IPAddress: prefix = ForeignKey(Prefix) vrf # Deleted as Prefix will carry the VRF going forward. Allows for a much "harder" link between the two.

Thoughts?

As far as changing IPAddress numbers/Prefix numbers, we have a couple of options:

For when Prefix is changed, we would have a couple of options options:

candlerb commented 1 year ago

Here is what I would envision as a data model for this

This looks clean and simple to me.

A couple of questions:

  1. Is the actual address stored inside class IPAddress going to change to a single address (e.g. 1.2.3.4) instead of address plus prefix length as today (e.g. 1.2.3.4/24)? I really hope so. It has been a big pain point in Netbox having to duplicate this information on every IP address.
  2. Will IPAddress.prefix be nullable? I would find it useful to have a "floating" IP address which does not belong to a prefix: for example, a public IP address hosted at a cloud service provider, like an EC2 public IP, where the prefix length or netmask is irrelevant or unknowable, and the address is outside of the prefixes in your own network. Letting this be null would be like setting the parent to 0.0.0.0/0 or ::/0. I wouldn't want to have to create a separate /32 prefix for every cloud IP address to have as its parent.

A similar point applies to Prefixes too. Clearly there must be one or more "root" prefixes: either they have a null parent, or you could have explicit objects for 0.0.0.0/0 and ::/0 which point to themselves. (EDIT: each VRF will form a separate tree, so it will be much easier to allow null parent for the that VRF's root prefix, than to have to create separate 0.0.0.0/0 or ::/0 for every VRF)

As far as changing IPAddress numbers/Prefix numbers, we have a couple of options

There is another case to consider, which is when a new prefix is inserted between existing prefix and its IP addresses: e.g. say 192.0.2.65 is linked to 192.0.2.0/24, and then prefix 192.0.2.64/26 is created; the addresses within the range 192.0.2.64/26 (but not already within a more specific prefix) have to be found and reparented. Then there is the mirror situation where a prefix is removed, and the child IP addresses have to be re-linked to the nearest enclosing prefix. I don't think this is a fundamental problem: even if it makes inserting and deleting prefixes expensive, this is a relatively uncommon operation. Making it race-safe without locking the whole ipaddresses table is a bit harder though.

Changing an IP Address should be straightforward (the model searches for the new parent), although the point I raised before about transaction safety and races still applies. There might need to be a manage.py fixup operation to clean up any broken parent relationships between IPAddress and Prefix, and between Prefix and Prefix.

I like the move of VRF into Prefix.

There is an orthogonal discussion around where there should be a distinction between "VRFs" and "network namespaces": that is, a network namespace is a zone of uniqueness for IP addresses, which can contain multiple VRFs (representing RDs and routing tables). It's not directly related to this discussion, but I only mention it because if there's going to be a fundamental rework of the IPAddress/Prefix/VRF model, then maybe it would be a good time to do that as well.

elipsion commented 1 year ago

There is actually another way to deal with the case of prefix lengths; store two parents.

candlerb commented 1 year ago

@elipsion: I presume you are referring to Netbox's current concept of "address" which is really "an IP address plus a prefix length", is that correct? For example, are you saying IPAddress "192.0.2.65/24" relates to both Prefix "192.0.2.64/26" and Prefix "192.0.2.0/24" (if it exists)?

I don't see the value in this complexity. Why not just have "192.0.2.65" as the IP address, and "192.0.2.64/26" as the parent ("direct parent" as you call it)? What use cases would this fail to support? Are there any examples of other IPAM software that stores a different prefix length along with each IP address?

The only use cases I can think of are these:

  1. A loopback address which comes from a container prefix (i.e. a block reserved for loopbacks). But you already know it's a loopback (either by its role, or by the interface it's configured on), so you know the mask must be /32 or /128.
  2. A container prefix which is being used inside a real prefix to allocate space for certain uses - e.g. inside the prefix 192.0.2.0/24, we reserve 192.0.2.64/26 for printers. I think "Ranges" are a much better fit for that use case: Prefixes are less flexible because they're forced to be on power-of-two boundaries. But if still you want to use Prefixes in this way, you could simply make the parent of an IP address be its enclosing "real" prefix, ignoring any "container" prefixes. (The UI for Ranges definitely needs to be improved, but that's a separate issue).

I've heard it argued before that not including the mask length in the IP address is bad for query efficiency. But if there's now going to be a direct foreign key relationship between IP address and Prefix, then you can easily do a "select_related" join to pick up the prefix. If VRF is moving to Prefix, then that join is going to be necessary anyway.

sdktr commented 1 year ago

If prefix relations are to be stored instead of calculated-on-read, I'd like to link #6825 to this for reevaluation (no more performance constraints for read actions).

kkthxbye-code commented 1 year ago

I'm not sure how this is going to work in environments with a lot of IP's. Deleting or resizing a prefix which contains 10k IP's would generate 10k ipaddress changlog entries (which are very expensive currently because of serialization) and I think would also trigger search indexing for each object. That would take several minutes probably. It can't really be delegated to the background either and race conditions are very likely, so it would almost be necessary to lock the entire ipaddress table, which would need to be gracefully handled in the code too.

jeremystretch commented 1 year ago

@kkthxbye-code I don't intend to expose the parent relationship explicitly; it would be tracked using internal fields (similar to what we do with Prefix._depth currently), so change logging isn't a concern.

trrunde commented 1 year ago

will this change make the function "next available prefix" make netbox work as it used to do when it comes to prefixes in vrf`s ? We have a container prefix with no vrf assigned, and from this we are using the api to get next available prefix that we use for linknets to many different vrfs. The child prefix will have vrf assigned to them, before this change this was working perfectly https://github.com/netbox-community/netbox/issues/10109

But after that issue where closed then next available prefix function is giving us the same prefix every time. Hoping that the old behaviour will come back if the prefix have a foreign key to parent prefix.

kkthxbye-code commented 1 year ago

@trrunde - Not seeing how that would have anything to do with this issue no. If you are experiencing what you believe is a bug you should create an issue.

ryanmerolle commented 1 year ago

@jeremystretch I think we can push to 3.6 if you agree

jantari commented 1 year ago

Yes, but you'd have to be extremely careful about it. For example, if you create 192.168.0.0/16 and 192.168.0.0/24, the former will be assigned as the parent of the later. However, if someone later creates 192.168.0.0/20, it will become the new parent for 192.168.0.0/24.

At present we can't find a way to do this so if the answer is yes a big thumbs up from me.

There are various net_* lookups that you can use to query by parent prefix. It's off-topic for this FR, however I invite you to start a separate discussion with the details of what you're trying to do.

We would need the ability to set permission constraints like {"_parent_prefix__tenant__slug":"something"}.

In that case any IP Address that is in-scope for a parent prefix with that tenant should be editable/viewable for the users with the permission assigned. Even if there is a sub-prefix defined in between, e.g.:

Prefixes:
192.168.0.0/16 (no tenant set)
\_ 192.168.200.0/24 (tenant: contoso)
    \_ 192.168.200.128/25 (no tenant set)

User "allen" with view/edit/change/delete permissions on IPAddress objects with {"_parent_prefix__tenant__slug":"contoso"} should now be able to create / manage IP Address objects inside 192.168.200.128/25 imo. This usecase oughta be very common. Thoughts?

elipsion commented 1 year ago

Yes, but you'd have to be extremely careful about it. For example, if you create 192.168.0.0/16 and 192.168.0.0/24, the former will be assigned as the parent of the later. However, if someone later creates 192.168.0.0/20, it will become the new parent for 192.168.0.0/24.

At present we can't find a way to do this so if the answer is yes a big thumbs up from me.

There are various net_* lookups that you can use to query by parent prefix. It's off-topic for this FR, however I invite you to start a separate discussion with the details of what you're trying to do.

We would need the ability to set permission constraints like {"_parent_prefix__tenant__slug":"something"}.

In that case any IP Address that is in-scope for a parent prefix with that tenant should be editable/viewable for the users with the permission assigned. Even if there is a sub-prefix defined in between, e.g.:

Prefixes:
192.168.0.0/16 (no tenant set)
\_ 192.168.200.0/24 (tenant: contoso)
    \_ 192.168.200.128/25 (no tenant set)

User "allen" with view/edit/change/delete permissions on IPAddress objects with {"_parent_prefix__tenant__slug":"contoso"} should now be able to create / manage IP Address objects inside 192.168.200.128/25 imo. This usecase oughta be very common. Thoughts?

This sounds un-doable, due to how relational databases work. It also introduces lots of room for unnecessarily opinionated default functionality; how many levels up should be searched for tenants? Is shadowing a thing? How can the user detect if the parent prefix does not have a tenant set? And so on.

Your best bet is to leave all this to the one implementing an ACL themselves, which in your case would probably be a stack of ACLs checking if the parent prefix tenant is blank, and the one above that has the appropriate slug set.