publicsuffix / list

The Public Suffix List
https://publicsuffix.org/
Mozilla Public License 2.0

Add `hatenablog.com`, etc. #1948

Closed NanimonoDemonai closed 1 month ago

NanimonoDemonai commented 5 months ago

Public Suffix List (PSL) Pull Request (PR) Template

Each PSL PR needs to have a description, rationale, indication of DNS validation and syntax checking, as well as a number of acknowledgements from the submitter. This template must be included with each PR, and the submitting party MUST provide responses to all of the elements in order to be considered.

Checklist of required steps

Submitter affirms the following:


For Private section requests that are submitting entries for domains that match their organization website's primary domain, please understand that this can have impacts that may not match the desired outcome and take a long time to rollback, if at all.

To ensure that requested changes are entirely intentional, make sure that you read the affectation and propagation expectations, that you understand them, and confirm this understanding.

PR Rollbacks have lower priority, and the volunteers are unable to control when or if browsers or other parties using the PSL will refresh or update.

(Link: about propagation/expectations)

Description of Organization

Hatena Co., Ltd. is the provider of the blog hosting service Hatena Blog. Users of Hatena Blog can create their blogs using subdomains provided by Hatena.

I am an engineer at Hatena.

Organization Website: https://hatena.co.jp/

Reason for PSL Inclusion

Number of users this request is being made to serve: Hatena has 12.3 million[^1] registered users who can create blogs.

We serve users' blogs on the following domains (e.g. example.hatenablog.com): hatenablog.com, hatenadiary.com, hateblo.jp, hatenablog.jp, hatenadiary.jp, and hatenadiary.org.

We want our users' blogs to be isolated from each other.

DNS Verification via dig

dig TXT _psl.hatenablog.com +short
"https://github.com/publicsuffix/list/pull/1948"
dig TXT _psl.hatenadiary.com +short
"https://github.com/publicsuffix/list/pull/1948"
dig TXT _psl.hateblo.jp +short
"https://github.com/publicsuffix/list/pull/1948"
dig TXT _psl.hatenablog.jp +short
"https://github.com/publicsuffix/list/pull/1948"
dig TXT _psl.hatenadiary.jp +short
"https://github.com/publicsuffix/list/pull/1948"
dig TXT _psl.hatenadiary.org +short
"https://github.com/publicsuffix/list/pull/1948"
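
The check above can also be scripted. The following is a minimal sketch, not part of the PSL tooling: the helper names (`psl_record_name`, `parse_dig_short`, `txt_matches`, `check_domain`) are hypothetical, and it assumes `dig` is installed and that each TXT record is a single quoted string, as in the output shown above.

```python
import subprocess

# PR URL and domains copied from the dig output above.
PR_URL = "https://github.com/publicsuffix/list/pull/1948"
DOMAINS = [
    "hatenablog.com", "hatenadiary.com", "hateblo.jp",
    "hatenablog.jp", "hatenadiary.jp", "hatenadiary.org",
]


def psl_record_name(domain):
    """Return the name of the validation record for a domain."""
    return "_psl." + domain


def parse_dig_short(output):
    """Parse `dig +short` TXT output, assuming one quoted string per line."""
    values = []
    for line in output.splitlines():
        line = line.strip()
        if line:
            values.append(line.strip('"'))
    return values


def txt_matches(output, expected=PR_URL):
    """True if any returned TXT value equals the expected PR URL."""
    return expected in parse_dig_short(output)


def check_domain(domain):
    """Run dig for one domain (needs network access and the dig binary)."""
    result = subprocess.run(
        ["dig", "TXT", psl_record_name(domain), "+short"],
        capture_output=True, text=True, check=True,
    )
    return txt_matches(result.stdout)
```

Calling `check_domain(d)` for each entry in `DOMAINS` would repeat the six lookups above; the parsing helpers can be exercised offline on saved output.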

Results of Syntax Checker (make test)

============================================================================
Testsuite summary for libpsl 0.21.5
============================================================================
# TOTAL: 5
# PASS:  5
# SKIP:  0
# XFAIL: 0
# FAIL:  0
# XPASS: 0
# ERROR: 0
============================================================================

[^1]: The basis for the number of users is Hatena's IR material: Financial Results Briefing Materials for the Second Quarter of the Fiscal Year Ending July 2024 (決算説明資料 2024年7月期 第2四半期決算説明会資料).

simon-friedberger commented 3 months ago

What is this ads.txt that you are referring to?

Can you please provide the number of users who actually have such a blog? What is the number of active URLs?

simon-friedberger commented 3 months ago
groundcat commented 3 months ago

.jp domains can be checked here: JPRS WHOIS (https://whois.jprs.jp/)

Expiration (Note: Must STAY >2y at all times)

groundcat commented 3 months ago

> What is this ads.txt that you are referring to?

I believe the ads.txt file the requester mentioned is connected to addressing certain Google AdSense limitations, based on the post below, though I'm not completely certain.

https://bunsho-de-kasegu.com/archives/adsenseadstxt.html

NanimonoDemonai commented 3 months ago

We have extended the domain expiration dates.

The expiration dates for the .jp domains have also been extended, but due to the limitation of the .jp domain registry (JPRS), the expiration date on WHOIS is only updated annually. Therefore, the updated expiration dates cannot be confirmed via WHOIS at this time.

> Can you please provide the number of users who actually have such a blog? What is the number of active URLs?

As of May 2024, the number of active subdomains is approximately 300,000 to 3,000,000 per domain. For business reasons, we cannot provide the exact numbers.

> What is this ads.txt that you are referring to?

ads.txt refers to https://iabtechlab.com/ads-txt/. It is a file distributed under the eTLD+1 domain of the site, and the PSL is referenced during this process. Although I mentioned it as reference information, the primary reason for our request to add the domains to the PSL is to isolate cookies for each site.
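
To illustrate the isolation effect, here is a rough sketch of how an eTLD+1 lookup changes once a domain is on the PSL. It is a simplification and not how any particular browser or library implements it: it handles exact-match rules only (no wildcard or exception rules), the rule sets are tiny stand-ins for the real list, and the hostnames `alice` and `bob` and the function names are hypothetical.

```python
def public_suffix(host, rules):
    """Return the longest suffix of host found in rules (exact rules only)."""
    labels = host.lower().rstrip(".").split(".")
    # Candidates run from the full name down to the last label,
    # so the first hit is the longest matching rule.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in rules:
            return candidate
    return labels[-1]  # default rule: the last label alone


def registrable_domain(host, rules):
    """Return the eTLD+1 of host, or None if host is itself a public suffix."""
    labels = host.lower().rstrip(".").split(".")
    suffix_len = len(public_suffix(host, rules).split("."))
    if len(labels) <= suffix_len:
        return None
    return ".".join(labels[-(suffix_len + 1):])


def same_site(a, b, rules):
    """True if both hosts share a registrable domain (can share cookies)."""
    ra, rb = registrable_domain(a, rules), registrable_domain(b, rules)
    return ra is not None and ra == rb


# Stand-in rule sets: before and after adding hatenablog.com.
BEFORE = {"com", "jp", "org"}
AFTER = BEFORE | {"hatenablog.com"}

print(registrable_domain("alice.hatenablog.com", BEFORE))
print(registrable_domain("alice.hatenablog.com", AFTER))
print(same_site("alice.hatenablog.com", "bob.hatenablog.com", BEFORE))
print(same_site("alice.hatenablog.com", "bob.hatenablog.com", AFTER))
```

With the `BEFORE` rules, both blogs fall under the single registrable domain hatenablog.com; with the `AFTER` rules, each blog subdomain becomes its own registrable domain, which is the cookie isolation we are asking for.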

Please review again @simon-friedberger

groundcat commented 1 month ago

All three .com and .org domains expire more than two years from now.

For the three .jp domains, according to the current registry WHOIS record:

However, the .jp registry uses an asynchronous approach of managing the expiry dates in its WHOIS records, as the submitter mentioned, so this might not reflect the actual expiry dates.


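The following explains when WHOIS information changes after a domain is renewed.

WHOIS information relating to the domain expiration date is reflected according to a JPRS-specific specification. Because of this, the WHOIS information is updated only after the actual expiration date has passed.

Specifically, the information for a domain that expires on August 31, 2023 is updated on September 1, 2023, even if the renewal procedure was completed before the expiration date.

Therefore, even if the renewal procedure is completed before the expiration date, the information in WHOIS is not updated while the expiration date has not yet passed; the update happens automatically on the day after the expiration date.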

This means that the WHOIS database updates the expiration date information the day after the current expiration date passes.
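
As a rough illustration of the date arithmetic involved, the sketch below computes when a displayed .jp expiry would refresh under the behavior quoted above, and whether a given expiry satisfies the ">2y" guideline. The function names are hypothetical, and "two years" is taken to mean the same calendar day two years later, which is an assumption rather than a documented PSL definition.

```python
from datetime import date, timedelta


def whois_refresh_date(displayed_expiry):
    """Day on which the displayed expiry is updated: the day after it passes."""
    return displayed_expiry + timedelta(days=1)


def add_years(d, years):
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def meets_two_year_rule(expiry, today):
    """True if expiry is more than two years after today (assumed definition)."""
    return expiry > add_years(today, 2)


# Example from the quoted explanation: expiry shown as 2023-08-31.
print(whois_refresh_date(date(2023, 8, 31)))
```

Under this reading, a renewed .jp domain can fail `meets_two_year_rule` on its displayed WHOIS date while its actual expiry already satisfies it.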

A quick Google `site:` search across each of these 6 domains shows a significant number of user sites. I spot-checked a few of them, and they appear to belong to different users (personal blogs or organizational sites), far more than tens of thousands of sites. This aligns with the description under "Reason for PSL Inclusion", so I do see the security implications and the need to include them in the PSL for cookie isolation, etc. (The ads.txt third-party limit might not be the submitter's primary intention.)

simon-friedberger commented 1 month ago

TY @groundcat !