meeb / whoisit

A Python library to RDAP WHOIS-like services for internet resources such as ASNs, IPs, CIDRs and domains
BSD 3-Clause "New" or "Revised" License

notice returns as dict #32

Open askkemp opened 2 weeks ago

askkemp commented 2 weeks ago

With v2.7.7, a query against the domain below returns notices as a dict instead of the expected list, which causes line 118 of parser.py to raise an exception.

whoisit.domain('achimmuenster.info', allow_insecure_ssl=False, raw=False)
...
File /opt/conda/lib/python3.11/site-packages/whoisit/parser.py:118, in Parser.extract_notices(self)
    116 self.parsed['copyright_notice'] = ''
    117 for notice in self.raw_data.get('notices', []):
--> 118     title = clean(notice.get('title', '')).lower()
    119     if title in (
    120             'terms of service', 'terms of use', 'terms and conditions'
    121     ):
    122         links = notice.get('links', [])

AttributeError: 'str' object has no attribute 'get'
...
    ],
    "notices": {
        "title": "401 Unauthorized",
        "description": "Path 'domain': Quota exceeded (31/30) for <ip4 address redacted>",
    },
...

What is unusual is that even though the notice received says the quota is exceeded, it still returns the requested information.
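The `'str' object has no attribute 'get'` message follows from how Python iterates a dict: the loop yields the keys (strings) rather than notice objects. Below is a minimal sketch of how a caller or parser could tolerate both shapes. `normalize_notices` is a hypothetical helper written for illustration and is not part of whoisit.

```python
def normalize_notices(notices):
    """Return notices as a list of dicts, whatever shape was received.

    Hypothetical helper for illustration only; not part of whoisit.
    """
    if isinstance(notices, dict):
        # A single notice object: wrap it so callers can iterate uniformly.
        return [notices]
    if isinstance(notices, list):
        # Keep only entries that are dicts and therefore support .get().
        return [n for n in notices if isinstance(n, dict)]
    # Missing or unexpected value: treat as no notices.
    return []


# The shape reported in this issue: "notices" is a dict, not a list.
raw_data = {
    "notices": {
        "title": "401 Unauthorized",
        "description": "Path 'domain': Quota exceeded (31/30) for <ip4 address redacted>",
    }
}

# Iterating the dict directly yields its keys, which are plain strings.
keys = [item for item in raw_data["notices"]]

# After normalizing, each item is a dict and .get() works as intended.
titles = [n.get("title", "") for n in normalize_notices(raw_data["notices"])]
```

With the payload above, `keys` holds the strings `title` and `description`, which is why `notice.get(...)` fails on line 118.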

meeb commented 2 weeks ago

Thanks for the issue. This appears to be because, as of the latest release, if the primary result from the TLD RDAP endpoint includes a referral to a related upstream RDAP endpoint, that endpoint is also queried, as it likely has more information. Its response is overlaid on the original information to improve detail and accuracy. What seems to be happening in this case is that the primary RDAP endpoint query is working, but the upstream RDAP endpoint is rate limiting your IP address.

I'll see whether it's more sensible to fail the query entirely or to let upstream RDAP queries fail quietly, perhaps emitting a warning.

In the interim, you should be able to pass follow_related=False to disable the secondary query to the related upstream RDAP endpoint, which should prevent this error from occurring for you (or you can wait for the query quota on your IP to reset).
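As a stopgap, the workaround above could be wrapped so the upstream lookup is only skipped when it actually fails. This is a sketch under assumptions: `query_with_fallback` is a hypothetical helper, and it assumes the parser failure surfaces as the `AttributeError` shown in the traceback.

```python
def query_with_fallback(query, name, **kwargs):
    """Call `query`, retrying without the related upstream lookup on failure.

    `query` is any callable that accepts a `follow_related` keyword, such as
    whoisit.domain as described in this thread. Hypothetical helper, not part
    of whoisit.
    """
    try:
        return query(name, **kwargs)
    except AttributeError:
        # Assumption: the failure reported in this issue raises AttributeError.
        # Retry with the related upstream RDAP query disabled.
        kwargs["follow_related"] = False
        return query(name, **kwargs)


# Intended usage (needs the whoisit package and network access):
#   import whoisit
#   whoisit.bootstrap()
#   result = query_with_fallback(whoisit.domain, 'achimmuenster.info')
```

Note that the retry repeats the primary query, so it consumes additional quota on the primary endpoint. Passing follow_related=False on the first call avoids that.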