Closed pirxthepilot closed 2 years ago
From initial digging, looks like both dyndns.org and go.dyndns.org are in Mozilla's public suffix list file. Having discovered that list I'm now not sure why there are a lot of entries in the list that don't look like TLDs. Am I missing something?
Hi @pirxthepilot - Thanks for reaching out!
You're right - dyndns.org as well as go.dyndns.org are on the Mozilla public suffix list. This list is community curated and tracks effective TLDs, meaning domains under which multiple parties that are unaffiliated with the operator of the domain may register subdomains [1].
In this specific case, it seems like dyn allows their customers to register their own subdomains. They act as the registrar for that domain.
So in the context of the list, the result seems correct. Of course, the IANA list, which is used by default, would extract org as ut_tld. However, the IANA list only contains TLDs with no dots (e.g. co.uk is not included). But in a technical sense, there seems to be no inherent difference between the registrar of co.uk and dyndns.org - they both allow registrations below their domain.
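To make the difference concrete, here is a minimal Python sketch of longest-suffix matching. It is not the app's actual code: `split_host`, `Parts`, and the two tiny hand-picked suffix sets are hypothetical stand-ins for the IANA and Mozilla lists, and wildcard or exception rules are ignored.

```python
from typing import NamedTuple, Optional, Set


class Parts(NamedTuple):
    tld: str
    domain: Optional[str]
    subdomain: Optional[str]


# Hypothetical, hand-picked excerpts; the real lists are far larger.
IANA_LIKE = {"com", "org", "uk"}
MOZILLA_LIKE = IANA_LIKE | {"co.uk", "dyndns.org", "go.dyndns.org"}


def split_host(host: str, suffixes: Set[str]) -> Parts:
    """Split a hostname using the longest suffix present in `suffixes`."""
    labels = host.lower().rstrip(".").split(".")
    # Fall back to the last label if nothing in the list matches.
    match = len(labels) - 1
    # Walk from the longest candidate to the shortest, so that
    # "go.dyndns.org" wins over "dyndns.org", which wins over "org".
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            match = i
            break
    tld = ".".join(labels[match:])
    # The domain is the suffix plus exactly one more label, if there is one.
    domain = ".".join(labels[match - 1:]) if match > 0 else None
    # Everything left of the domain is the subdomain.
    subdomain = (".".join(labels[:match - 1]) or None) if match > 1 else None
    return Parts(tld, domain, subdomain)


for host in ("www.bbc.co.uk", "foo.bar.dyndns.org", "foo.go.dyndns.org"):
    print(host)
    print("  one-level list:", split_host(host, IANA_LIKE))
    print("  suffix list:   ", split_host(host, MOZILLA_LIKE))
```

With the one-level set, foo.bar.dyndns.org splits into org / dyndns.org / foo.bar. With the larger set, the same host splits into dyndns.org / bar.dyndns.org / foo, which is the behavior reported in this issue.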
Does this make sense? Please let me know your thoughts. Of course, you could take advantage of the custom list feature and provide your own curated list.
[1] https://www.icann.org/en/system/files/files/octo-011-18may20-en.pdf
Hi @dfederschmidt, thanks for providing context, much appreciated! I think I understand now. Having thought about it more I think the Mozilla public list vs the "intuitive" behavior (e.g. dyndns.org as a domain) are both valid, depending on the context of the search.
That said, it can be confusing to a Splunk user who is not aware of effective TLDs and so writes their searches incorrectly (at least for the domains included in the Mozilla list). I think (but could be wrong) most folks would expect similar results for www.wikipedia.org and www.dyndns.org, in that ut_tld is org and ut_domain is wikipedia.org and dyndns.org respectively. (The IANA list would have been nice, but like you said, it only supports one level.)
Curious to know what you think as well! Would it be worth having a separate list that does not include effective TLDs?
Also, if we were to use our own list, do we need to fork this app and install our own fork, or can the list be maintained in Splunk e.g. with a lookup table?
Thanks!
Hi @pirxthepilot
I understand that this behaviour may look unintuitive. But actually, the default for ut_parse_extended is the IANA list, which contains only one level of actual TLDs and should be "sane". The app's documentation for this command describes the difference and the implications of switching to the Mozilla list.
"Curious to know what you think as well! Would it be worth having a separate list that does not include effective TLDs?"
The IANA list is actually the list without effective TLDs. Domains like gov.uk or co.uk are effective TLDs. Technically, .uk is the proper country-code TLD, even though most people probably want co.uk extracted.
Using your own list would mean forking the app and editing /bin/suffix_list_custom.dat. I agree that this is not ideal, especially in cloud environments, and a configuration mechanism via lookup would be more comfortable. I'll add this as a potential enhancement in a dedicated issue.
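As a rough illustration of what curating such a list could look like, here is a Python sketch that reads suffix rules and drops the entries you don't want. It assumes the custom file uses the same plain-text layout as the Mozilla public suffix list (one rule per line, `//` comments), which has not been verified for this app. `load_suffixes`, `drop_suffixes`, and the sample text are hypothetical, and wildcard (`*.`) and exception (`!`) rules are simply skipped.

```python
from typing import Iterable, Set


def load_suffixes(lines: Iterable[str]) -> Set[str]:
    """Collect plain suffix rules from public-suffix-list style text."""
    suffixes = set()
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and // comments.
        if not line or line.startswith("//"):
            continue
        # A rule ends at the first whitespace character.
        rule = line.split()[0].lower()
        # Wildcard and exception rules are out of scope for this sketch.
        if rule.startswith(("*", "!")):
            continue
        suffixes.add(rule)
    return suffixes


def drop_suffixes(suffixes: Set[str], unwanted: Set[str]) -> Set[str]:
    """Remove each unwanted suffix and every entry below it."""
    return {
        s for s in suffixes
        if not any(s == u or s.endswith("." + u) for u in unwanted)
    }


# Hypothetical excerpt, not the real contents of any shipped file.
SAMPLE = """\
// generic entries
org
uk
co.uk

// Dyn entries
dyndns.org
go.dyndns.org
"""

custom = drop_suffixes(load_suffixes(SAMPLE.splitlines()), {"dyndns.org"})
print(sorted(custom))
```

Starting from the full list and removing only the entries that surprise you keeps multi-level suffixes such as co.uk while treating dyndns.org as an ordinary domain.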
Added #5 - feel free to add your thoughts to the issue.
@dfederschmidt sorry for the delayed response. Your explanation made a lot of sense, and I think it's just a matter of expectation and use case as to what the user wants to accomplish. For that matter, having an easily configurable custom list would be really useful, so thanks for opening #5!
EDIT: Pulled a request to include a new custom list that ships with utbox. I think this will just cause additional confusion and we're better off having custom lists through #5. Will just go ahead and close this issue. Thanks!
Seeing some weird behavior when parsing dyndns.org FQDNs with ut_parse_extended and the Mozilla list.
Expected Behavior:
With an FQDN like foo.bar.google.com, it correctly shows the tld, domain, subdomain and number of subdomain elements.
Issue:
Parsing foo.bar.dyndns.org, ut seems to think that the TLD is dyndns.org and the domain is bar.dyndns.org.
Even weirder, with foo.go.dyndns.org, ut parses go.dyndns.org as the TLD, and foo.go.dyndns.org as the domain.