peeringdb / peeringdb

Server code for https://www.peeringdb.com/
BSD 2-Clause "Simplified" License

Data Ownership TF recommendation - 1: netixlan dependency conflict resolution #676

Closed koalafil closed 3 years ago

koalafil commented 4 years ago

This is one of the two issues I am creating corresponding to the two main recommendations from Data Ownership Task Force as noted in the PeeringDB Data Ownership Policy Document.

TF requests that PeeringDB is modified in order to prevent operational disruption relating to handling of "ix", "ixlan", and "ixpfx" data elements.

Relevant sections of the policy document are: ...

6.1) netixlan

A conflict may arise in which the IP assignment data publicly exported by an Internet Exchange does not match data provided by a Network. Alternatively, an Internet Exchange may reach out to PeeringDB to dispute the IP assignment data provided by a Network.

Since this data can have an impact on Internet operations, this document specifies a principle of minimal disruption. This principle means that conflicted data which is already published shall not be taken down unless done so by the Network or by a resolution process mediated by the Admin Committee.

Similarly, new data from an IX-F Member Import or which is entered by a Network, which conflicts with existing data, shall not be published until a resolution process has been mediated by the Admin Committee and/or the conflict is resolved due to updated data from the Internet Exchange or the Network.

The Task Force recommends PeeringDB employ user interface methods and email notifications to encourage data harmony between a Network and an Internet Exchange, as a means of expediting resolution and decreasing the burdens on the Admin Committee.

It is understood that an IX-F Member Import may be incomplete, such as due to an information embargo requirement. If a conflict arises due to new data provided by a Network, the above conflict resolution recommendations are appropriate.
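A minimal sketch of how the 6.1 principle of minimal disruption could be read, assuming a simplified data model. The names `Assignment`, `Decision`, and `review` are hypothetical and are not part of the PeeringDB code base; the sketch only covers the conflict case and leaves the actual resolution to the Network, the Internet Exchange, or the Admin Committee.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    # Hypothetical outcomes, named for illustration only.
    NO_CONFLICT = "no conflict, normal rules apply"
    HOLD = "keep published data, hold new data, notify both parties"


@dataclass(frozen=True)
class Assignment:
    """Simplified, hypothetical view of a netixlan IP assignment."""
    asn: int
    ipaddr4: str


def review(published: Optional[Assignment], incoming: Assignment) -> Decision:
    """Apply the minimal-disruption rule to one incoming record.

    Published data is never taken down here. Conflicting new data is
    held back instead of being published.
    """
    if published is None or published == incoming:
        return Decision.NO_CONFLICT
    return Decision.HOLD
```

For example, `review(Assignment(64500, "192.0.2.10"), Assignment(64500, "192.0.2.11"))` returns `Decision.HOLD`: the published address stays visible and the new one waits for resolution.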

...

7.1) "netixlan" dependency on "ix", "ixlan", and "ixpfx"

In order to prevent operational disruption, this Task Force recommends that PeeringDB be modified to prevent deletion or updates of "ix", "ixlan", and "ixpfx" data elements from having a disruptive effect on dependent "netixlan" data elements, when data exists that would be disrupted. When needed, the removal of dependent data elements should be coordinated by the Admin Committee. ...
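The 7.1 recommendation could be sketched roughly as a pre-delete check, assuming netixlan rows are plain mappings carrying an `ixlan_id`. The function `check_can_delete_ixlan` and the exception `DependentDataError` are hypothetical names for illustration, not the project's implementation.

```python
from typing import Iterable, Mapping


class DependentDataError(Exception):
    """Raised when a delete would disrupt dependent netixlan entries."""


def check_can_delete_ixlan(
    ixlan_id: int, netixlans: Iterable[Mapping[str, int]]
) -> None:
    """Refuse to delete an ixlan while netixlan entries still point at it."""
    dependents = [n for n in netixlans if n["ixlan_id"] == ixlan_id]
    if dependents:
        raise DependentDataError(
            f"ixlan {ixlan_id} still has {len(dependents)} dependent "
            "netixlan entries; removal should be coordinated by the "
            "Admin Committee"
        )
```

The same shape of check would presumably apply to "ix" and "ixpfx" elements, and to updates that would invalidate dependent entries.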

In other words, quoting @ccaputo, following is the recommended behaviour:

Also note the following as previous discussions:

  • https://github.com/peeringdb/peeringdb/issues/627 (closed as this issue is created)
  • https://github.com/peeringdb/peeringdb/issues/585 (I've closed it but pls re-open if PC decides it is still needed)
  • https://github.com/peeringdb/peeringdb/issues/539

koalafil commented 4 years ago

Coupled with https://github.com/peeringdb/peeringdb/issues/677

grizz commented 4 years ago

@koalafil @ccaputo What did the task force decide about #391? These issues seem to only deal with conflicting data.

ccaputo commented 4 years ago

@koalafil @ccaputo What did the task force decide about #391? These issues seem to only deal with conflicting data.

5.2.1) netixlan
[...]
• In order for "netixlan" data elements to be created by a Network or through automation
enabled by a Network, an Internet Exchange must first create related "ix", "ixlan", and
"ixpfx" data elements. Due to this dependency and a desire to prevent operational
disruption, this Task Force recommends in section 7 that "ix", "ixlan", and "ixpfx" data
elements not be able to be deleted or modified in such a way that would impact
dependent "netixlan" data elements.

• The "netixlan" data element includes a "net_id", which points to an "net" data element,
which includes an "org_id" with associated privileges. In addition, Networks may take
advantage of automated mechanisms that PeeringDB offers, which utilize data publicly
exported by Internet Exchanges.

Thus unless a network has 'Allow IXP Update' explicitly enabled, the Task Force Policy Document specifies that missing netixlan entries in the IX-F JSON export would not be created.
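As a rough illustration of that rule, assuming members are identified by ASN and the per-network 'Allow IXP Update' setting is available as a simple mapping (the function name and data shapes are hypothetical):

```python
def entries_to_create(
    ixf_members: set[int],
    pdb_members: set[int],
    allow_ixp_update: dict[int, bool],
) -> set[int]:
    """Return ASNs whose missing netixlan entries may be created.

    An entry is only created from the IX-F export when the network has
    explicitly opted in; networks with no recorded setting count as
    opted out.
    """
    missing = ixf_members - pdb_members
    return {asn for asn in missing if allow_ixp_update.get(asn, False)}
```

With `ixf_members={64500, 64501, 64502}`, `pdb_members={64500}` and `allow_ixp_update={64501: True, 64502: False}`, only AS64501 would get a new entry.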

durkovic commented 4 years ago

@ccaputo Does it mean that the task force will do nothing about the problem described in #391? Is everyone happy with the fact that typically 1/3 of entries are missing in PeeringDB and thus the overall information PeeringDB provides is unreliable?

arnoldnipper commented 4 years ago

@ccaputo Does it mean that the task force will do nothing about the problem described in #391? Is everyone happy with the fact that typically 1/3 of entries are missing in PeeringDB and thus the overall information PeeringDB provides is unreliable?

One of the paradigms of PeeringDB is that owners decide whether to put in data or not. This principle was also acknowledged by the Data Ownership Policy.

PeeringDB data may not be complete. However, that does not make the data unreliable. We put a lot of effort into continuously validating data. And of course, it's all about educating the community. And if we look at data from the larger IXPs in particular, we see that well over 90% of the connections (netixlan) are registered with PeeringDB. From an operational point of view, this is fantastic. See the attached file for detailed information.

20200411_trustworthiness.txt

ccaputo commented 4 years ago

@ccaputo Does it mean that the task force will do nothing about the problem described in #391? Is everyone happy with the fact that typically 1/3 of entries are missing in PeeringDB and thus the overall information PeeringDB provides is unreliable?

@durkovic: I sorted Arnold's data by the ratio column to produce 20200411_trustworthiness_sort.txt. A majority have well over 50% completion in PeeringDB.

Arnold is correct that "One of the paradigms of PeeringDB is that owners decide whether to put in data or not." and the Task Force affirmed this.

Arnold's table is fascinating and I wonder if it might make sense to keep that updated somewhere as part of some posted stats somewhere.

I took a look at the Seattle Internet Exchange (SIX) since I am very involved with that organization:

# ID    IXPDB   PDB     Ratio
13      331     324     0.979

I believe the high ratio of 97.9% PeeringDB participation is due to the following, and other IXPs may want to adopt it: 1) We encourage participants to update PeeringDB when they receive their IP assignments. 2) We require PeeringDB participation in order to be able to use the route servers, since it is from PeeringDB where the SIX derives max-prefix-count and as-set information which informs configuration.
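For reference, the Ratio column appears to be the PDB count divided by the IXPDB count, which matches the SIX row (324 / 331 ≈ 0.979). A small sketch under that assumption, with a hypothetical function name:

```python
def participation_ratio(ixpdb_count: int, pdb_count: int) -> float:
    """Share of IXPDB-listed participants that also appear in PeeringDB.

    Assumes Ratio = PDB / IXPDB, rounded to three decimals as in the
    table. A value above 1.0 means PeeringDB lists more connections than
    the IXP's own export.
    """
    if ixpdb_count <= 0:
        raise ValueError("ixpdb_count must be positive")
    return round(pdb_count / ixpdb_count, 3)


print(participation_ratio(331, 324))  # SIX row from the table: 0.979
```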

arnoldnipper commented 4 years ago

Arnold's table is fascinating and I wonder if it might make sense to keep that updated somewhere as part of some posted stats somewhere.

Maybe we could ask Gianluca Mazzini to add this information to his tables. Currently, Gianluca only pulls information from PeeringDB. IXPDB also stores information about the IX-F JSON URL from an IXP publicly. If an IXP has that information IXPDB's participant_count is taken from that URL.

Of course, PeeringDB also stores the URL for an IX-F JSON import. But not publicly visible. However, taking net_count_ixf from it and perhaps even adding to the API would make sense. Compiling stats from that would be a no-brainer then.

durkovic commented 4 years ago

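And, Arnold's table clearly shows the missed opportunity. It's evident that PeeringDB:

  • lacks many entries that are present in IX-F exports
  • contains entries that are no longer valid (ratios like 1.042)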

Most of that could be fixed by decent processing of IX-F exports (with manual conflict resolution).

Problem described in #391 won't disappear, since many networks simply don't care.

So the main question is: do we really want two separate, independent sources of peering data (PeeringDB from networks, IXPDB from IXPs) which will never synchronize - and thus everyone will need to consider both and perform his own housekeeping? Or would it be possible to modify the old paradigms and find a solution how to decently merge both sources?

arnoldnipper commented 4 years ago

  • contains entries, that are no longer valid (ratios like 1.042) ... Most of that could be fixed by decent processing of IX-F exports (with manual conflict resolution).

That's what we intend to do with a completely new algorithm for how connections to an IX are added. The more IXPs provide IX-F JSON exports, the better the data gets. And with the help of the IXPs, networks will be motivated to join PeeringDB. I still like PeeringDB's approach of user-maintained data. Interested networks will keep their data up to date at a level which no one else could achieve. There is way more in a network's data than just its connections to IXPs.

ccaputo commented 4 years ago

And, Arnold's table clearly shows the missed opportunity. It's evident, that PeeringDB:

  • lacks many entries that are present in IX-F exports
  • contains entries, that are no longer valid (ratios like 1.042)

Most of that could be fixed by decent processing of IX-F exports (with manual conflict resolution).

Problem described in #391 won't disappear, since many networks simply don't care.

So the main question is: do we really want two separate, independent sources of peering data (PeeringDB from networks, IXPDB from IXPs) which will never synchronize - and thus everyone will need to consider both and perform his own housekeeping? Or would it be possible to modify the old paradigms and find a solution how to decently merge both sources?

@durkovic: I have created https://github.com/peeringdb/peeringdb/issues/681 to address this, but for the networks that simply don't care, forcing data onto their PeeringDB record without their permission is not a solution.

durkovic commented 4 years ago

@durkovic: ... but for the networks that simply don't care, forcing data onto their PeeringDB record without their permission is not a solution.

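What's the exact problem here?

  • network has signed a contract with IXP and agreed its presence will be published on IXP website (and in IXP's json export)
  • there are already several websites pulling all IXP exports and providing summarized data views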

So what prevents PeeringDB from joining the club, pulling the IXP export and presenting the data (maybe noting the source is IXP xy)? And if a network doesn't want to be listed, it can join PeeringDB and override (remove) the records asap.

ccaputo commented 4 years ago

@durkovic: ... but for the networks that simply don't care, forcing data onto their PeeringDB record without their permission is not a solution.

What's the exact problem here?

  • network has signed a contract with IXP and agreed its presence will be published on IXP website (and in IXP's json export)
  • there are already several websites pulling all IXP exports and providing summarized data views

So what prevents PeeringDB from joining the club, pulling the IXP export and presenting the data (maybe noting the source is IXP xy)? And if a network doesn't want to be listed, it can join PeeringDB and override (remove) the records asap.

As far as I understand things, PeeringDB has always been opt-in and that has contributed to it becoming a trusted resource. Changing to opt-out would violate that trust.

It is not for us to know why a Network chooses to not be listed at an IX, but to respect that choice. They could be departing. The IX could be in error, either intentionally or unintentionally. Etc.

grizz commented 4 years ago

Thus unless a network has 'Allow IXP Update' explicitly enabled, the Task Force Policy Document specifies that missing netixlan entries in the IX-F JSON export would not be created.

@ccaputo oh right, sorry. We'll spec and make issues from the full document.

@peeringdb/pc @koalafil We can probably close these two issues.

arnoldnipper commented 4 years ago

@peeringdb/pc @koalafil We can probably close these two issues.

@grizz, I'm about to create two new issues referencing this issue and #677 which are more specific. One that deals with all aspects of the creation, altering and deletion of a netixlan object. And a second one which deals with locking ix, fac, and ixpfx against deletion as long as there are other objects referring to them. So let's keep both issues open until we are done. Does that make sense, @peeringdb/pc?

grizz commented 4 years ago

Sounds good, makes sense to me, thanks!

mcmanuss8 commented 4 years ago

Will be implemented in #697