publicsuffix / list

The Public Suffix List
https://publicsuffix.org/
Mozilla Public License 2.0

Automation of PRIVATE section removals #1119

Open dnsguru opened 3 years ago

dnsguru commented 3 years ago

This project has evolved a fairly strong ingress process for adding entries.

Once folks get their entry/ies added, it is typically a set and forget activity until they want to amend or modify them.

Few if any submitters come back to remove their entries once set or functioning.

We have put in place a requirement on new submissions to the PRIVATE section that 2+ years of service before expiry are required, in anticipation of some form of culling automation using the expiry date being a data point used to determine the domain is still 'on the air'. Another data point might be the ongoing presence of a _psl txt record where possible.

Without being prescriptive of the sensors and triggers for removals, there seems a need to have some automated deletion process that can scan and use some agreed logic for parsing and removing entries.

Long version: The PSL gets adds and updates, but typically few or no deletions. One exception is the ICANN contracting json update automation for adding and removing TLDs in the ICANN section of the file, but an area that is growing in size which would benefit from attention is the PRIVATE section.
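
For anyone prototyping a scan, a minimal sketch of isolating the PRIVATE section entries follows. It assumes the list file delimits that section with the `// ===BEGIN PRIVATE DOMAINS===` and `// ===END PRIVATE DOMAINS===` comment markers; the function name and the sample data are hypothetical and only illustrate the idea.

```python
BEGIN_PRIVATE = "// ===BEGIN PRIVATE DOMAINS==="
END_PRIVATE = "// ===END PRIVATE DOMAINS==="


def private_section_rules(psl_text):
    """Return the rule lines found between the PRIVATE section markers."""
    rules = []
    in_private = False
    for raw_line in psl_text.splitlines():
        line = raw_line.strip()
        if line == BEGIN_PRIVATE:
            in_private = True
            continue
        if line == END_PRIVATE:
            break
        # Skip blank lines and comments; everything else is treated as a rule.
        if in_private and line and not line.startswith("//"):
            rules.append(line)
    return rules


# Hypothetical miniature list, not real PSL content.
sample = """\
// ===BEGIN ICANN DOMAINS===
com
// ===END ICANN DOMAINS===
// ===BEGIN PRIVATE DOMAINS===
// Example Org : https://example.org
example.org
*.platform.example
// ===END PRIVATE DOMAINS===
"""
print(private_section_rules(sample))
```

Each returned rule could then be handed to whatever checks end up being agreed on.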

The file size has grown, and continues to grow while the PSL is leveraged as a fast-fail on inclusion or other derivative use cases that seem to require a PSL entry to get some special handling or status.

The volunteers suspect that there are numerous domains that are in the list which have not been renewed or have expired and been re-registered/drop-caught or have been acquired in the secondary market prior to their expiry by third parties.

This project is staffed by volunteers, none of whom are looking to give up more volunteer time than the current process of adds/modifies already takes. That process has been greatly aided by automation with the test scripts and the root zone deltas.

Would like to discuss.

Is there consensus that we need deletion automation within the PRIVATE section, and:

dnsguru commented 3 years ago

So far, there are a number of ccTLDs that do not present the expiration date in automated lookups.

Whatever automation we build may have to treat those domains differently, both for validating that they still exist and for how often they are flagged or purged.

dnsguru commented 3 years ago

I am going to run a manual sweep on the private section to determine expired domains for removal using RDAP response expiry dates on gTLDs, and will report the results in aggregate here.
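
A rough sketch of the expiry check such a sweep might perform on each RDAP response is below. It assumes the response has already been fetched and decoded from JSON, and that the expiry appears in the standard `events` array with `eventAction` set to `expiration`. The function names are hypothetical, and a `None` result stands for registries that publish no expiry date.

```python
from datetime import datetime, timezone


def rdap_expiry(rdap_response):
    """Return the expiration datetime from an RDAP domain response, or None."""
    for event in rdap_response.get("events", []):
        if event.get("eventAction") == "expiration":
            # Normalise a trailing "Z" so older Pythons can parse the stamp.
            stamp = event["eventDate"].replace("Z", "+00:00")
            expiry = datetime.fromisoformat(stamp)
            if expiry.tzinfo is None:
                # Assumption: a stamp without an offset is treated as UTC.
                expiry = expiry.replace(tzinfo=timezone.utc)
            return expiry
    return None


def is_expired(rdap_response, now):
    """True if expired, False if not, None if no expiry date is published."""
    expiry = rdap_expiry(rdap_response)
    if expiry is None:
        return None
    return expiry < now


# Hypothetical response fragment for illustration only.
example = {
    "events": [
        {"eventAction": "registration", "eventDate": "2015-01-01T00:00:00Z"},
        {"eventAction": "expiration", "eventDate": "2020-01-01T00:00:00Z"},
    ]
}
print(is_expired(example, datetime(2020, 12, 1, tzinfo=timezone.utc)))
```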

fsodic commented 3 years ago

@dnsguru Is this a discussion? If so, there may be better solutions for forming a healthier "private section", such as looking for tools that can "dig" the _psl record automatically for the domains in this private section.

dnsguru commented 3 years ago

Fajar it is absolutely a discussion about how to get this done - and appreciate your input!

Lookups for the presence of a persistent _psl TXT record could be a strong way to do this review faster...

That is because the existence or non-existence of the domain in the registry says nothing about whether it is still registered by the same submitting party, or even the same registrant, or whether it is still connected in DNS in the manner or with the intent described in the PR or rationale for inclusion in the PSL.

Doing this in DNS would work well, actually. Most folks seem to move on to whatever else they were doing after requesting inclusion, and the PSL entry becomes 'set and forget'. Because of the validation step where a _psl txt record gets added in order to qualify an entry, one could reckon that just leaving that in place is simpler, as it takes another step to remove the entry from DNS. Then, we leave the entry while the _psl txt record remains.

Interesting to consider.
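
To make the _psl idea concrete, here is a small sketch of the checking side only. It assumes the TXT values have already been fetched (for example with dig) and that a valid record holds the URL of the submitter's pull request on github.com/publicsuffix/list. The helper names are hypothetical, and checking wildcard rules at their base name is an assumption rather than project policy.

```python
import re

# Assumed record format: the TXT value is the pull request URL.
PR_URL = re.compile(r"https://github\.com/publicsuffix/list/pull/(\d+)/?")


def psl_record_name(suffix):
    """DNS name to query for a PSL entry, e.g. '_psl.example.com'."""
    # Assumption: a wildcard rule such as '*.example.com' is checked at
    # its base name, so leading '*' and '.' characters are dropped.
    base = suffix.lstrip("*.").rstrip(".")
    return "_psl." + base


def has_psl_txt(txt_values, expected_pr=None):
    """True if any TXT value is a PSL pull request URL (optionally a given PR)."""
    for value in txt_values:
        match = PR_URL.fullmatch(value.strip().strip('"'))
        if match and (expected_pr is None or int(match.group(1)) == expected_pr):
            return True
    return False


print(psl_record_name("*.example.com"))
print(has_psl_txt(['"https://github.com/publicsuffix/list/pull/1234"'], 1234))
```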

fsodic commented 3 years ago

It might be a problem when a domain lacks its _psl record only because of a coinciding DNS error: dig would then read the missing record as a sign that the domain is inactive. So a better solution is to add 2 or 3 files that hold DNS data for the existing domains, separated into 3 different levels:

  • Orange or Yellow: a light warning flag.
  • Red: an intermediate deletion warning.
  • Black: the last warning; if the final crawl still fails, the entry is permanently deleted.

An additional measure is to schedule the _psl check weekly, so that submitters get about 1 month to fix their _psl DNS record.

In addition, whenever a domain falls into one of these categories, a reminder email could be sent to the submitter of the PSL entry, at the contact address they gave for the person in charge.
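
The warning levels proposed here could be sketched as a small state machine. The mapping below is an assumption for illustration: each consecutive weekly check that fails moves an entry one level up, a successful check resets it, and four misses in a row (roughly one month) mark it for removal. The level names follow the colours above; the function names are hypothetical.

```python
# Escalation order; "ok" and "remove" are added as hypothetical end states.
LEVELS = ["ok", "yellow", "red", "black", "remove"]


def next_level(current, record_present):
    """Advance one level on a failed check, reset to 'ok' on a good one."""
    if record_present:
        return "ok"
    index = LEVELS.index(current)
    return LEVELS[min(index + 1, len(LEVELS) - 1)]


def run_checks(results, start="ok"):
    """Fold a sequence of weekly check results into a final level."""
    level = start
    for present in results:
        level = next_level(level, present)
    return level


# Two missed weekly checks in a row leave the entry at the red level.
print(run_checks([False, False]))
# A recovered record clears earlier warnings.
print(run_checks([False, False, True]))
```

A reminder email could be triggered whenever `next_level` returns a level different from the current one.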

fsodic commented 3 years ago

And the auto removal at <1 year isn't needed anymore while the _psl record is still active. With every approved merge, make sure the submitter keeps the _psl record in their domain's DNS.

I'm sure our PSL will be better with this system model.

frknakk commented 3 years ago

We have put in place a requirement on new submissions to the PRIVATE section that 2+ years of service before expiry are required

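I ran into a few problems with this requirement that I'd like to mention here:

  • There are TLDs where the expiry date is not publicly retrievable (e.g. German .de domains).
  • Providers that are paid by invoice often don't have an option to specify a domain expiration date. The domain is simply extended every year by 1 year until the contract gets cancelled by the customer.

Of course you can avoid both problems by switching to other TLDs and other providers. I just wanted to mention that it could restrict many people :)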

dnsguru commented 3 years ago

We have seen a number of TLDs that do not provide the expiration date in public WHOIS/RDAP. In addition to .de, I have seen this with .je/.gg, .ru, .nl, and .pl, and I am certain we will find others. The invoice payment scenario, and others we have not yet heard of, in which a 2+ year term is not possible will certainly exist.

We can make a catch-all exception, but that doesn't allow automation to really work... The solution seems to be that the submitting party acknowledges, in their PR, that their entries can be subject to removal if not maintained, and commits to keeping the submitted domain name(s) renewed.
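
One way automation could still work around the missing expiry dates is to route each domain to a different check based on its TLD. The sketch below is only an illustration: the set contains just the TLDs named above and is certainly incomplete, and the strategy labels and function name are hypothetical.

```python
# TLDs mentioned in this thread as not publishing expiry dates;
# an assumption-laden starting point, not an authoritative list.
NO_PUBLIC_EXPIRY = {"de", "je", "gg", "ru", "nl", "pl"}


def validation_strategy(domain):
    """Pick a (hypothetical) check strategy based on the domain's TLD."""
    tld = domain.rstrip(".").rsplit(".", 1)[-1].lower()
    if tld in NO_PUBLIC_EXPIRY:
        # No expiry date to read, so rely on the _psl TXT record alone.
        return "dns_txt_only"
    return "rdap_expiry_and_dns_txt"


print(validation_strategy("example.de"))
print(validation_strategy("example.com"))
```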

On Wed, Dec 16, 2020 at 6:21 PM Furkan A. notifications@github.com wrote:

We have put in place a requirement on new submissions to the PRIVATE section that 2+ years of service before expiry are required

I ran into a few problems with this requirement that I'd like to mention here:

  • There are TLDs where the expiry date is not publicly retrievable (e.g. German .de domains).
  • Providers that are paid by invoice often don't have an option to specify a domain expiration date. The domain is simply extended every year by 1 year until the contract gets cancelled by the customer.


pereceh commented 3 years ago

I think the requirements on the wiki are pretty good.

Just use the DNS _psl records as the benchmark for mass deletion, then allow some time, in case a user's DNS server has changed, for them to re-create those _psl DNS records (using their old values).