Closed jwahsnakupaku closed 1 week ago
Yeah, this is because in your example the address is just stuffed into a label string, which is pretty much impossible to parse reliably, rather than being split out properly in the vcard. Different localities have different address formats. For example, these are the adr element labels for 1.1.1.1 (all of these are returned from whoisit.ip('1.1.1.1', raw=True)):
'6 Cordelia St South Brisbane QLD 4101'
'6 Cordelia St'
'PO Box 3646\nSouth Brisbane, QLD 4101\nAustralia'
Basically, trying to parse these into a sensible street, locality, postcode, etc. is near impossible, so the library doesn't bother attempting it.
With IP lookups, if you need the address, your best bet is probably to just use raw=True and look for the adr labels yourself manually, unfortunately.
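For anyone landing here later, a minimal sketch of what "look for the adr labels yourself" could look like. It assumes the raw response follows the usual RDAP/jCard layout (an "entities" list, each entity carrying a "vcardArray" of [name, params, type, value] properties). The function name collect_adr_labels and the sample dict are made up for illustration and are not part of whoisit; the labels in the sample are the ones quoted above.

```python
def collect_adr_labels(rdap):
    """Return every adr label found in a raw RDAP response dict."""
    labels = []
    for entity in rdap.get("entities", []):
        vcard = entity.get("vcardArray", [])
        # jCard layout: ["vcard", [[name, params, type, value], ...]]
        properties = vcard[1] if len(vcard) > 1 else []
        for prop in properties:
            if len(prop) >= 2 and prop[0] == "adr" and isinstance(prop[1], dict):
                label = prop[1].get("label")
                if label:
                    labels.append(label)
        # Entities can be nested inside other entities, so recurse.
        labels.extend(collect_adr_labels(entity))
    return labels


# Hypothetical sample shaped like an RDAP response (not real lookup output).
sample = {
    "entities": [
        {
            "vcardArray": ["vcard", [
                ["version", {}, "text", "4.0"],
                ["adr",
                 {"label": "PO Box 3646\nSouth Brisbane, QLD 4101\nAustralia"},
                 "text", ["", "", "", "", "", "", ""]],
            ]],
            "entities": [
                {"vcardArray": ["vcard", [
                    ["adr", {"label": "6 Cordelia St"},
                     "text", ["", "", "", "", "", "", ""]],
                ]]},
            ],
        },
    ],
}

for label in collect_adr_labels(sample):
    print(repr(label))
```

In practice the dict passed in would be whatever whoisit.ip(..., raw=True) returns; the sketch only collects the label strings and makes no attempt to split them into street, locality, or postcode.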
Would it be difficult to include the raw, unparsed label alongside the structured data?
No, but that would go against what the parsed output was meant to do in the first place and what raw=True is for. If you need the raw data, use that.
Makes sense to me. It's unfortunate that the RIR implementation is so lacking here, I have to imagine they could at least try to map some of these fields properly into the vcard.
Yeah, it would be nice if the data was all parsed and segregated correctly with the correct labels. My irritation with the weird formats used by different RDAP endpoints is why the parser module in whoisit is as expansive as it is already.
Are you OK if I close this or would you like to raise anything else?
Nah, close it off. I'll just grab the raw data, parse it manually, and shove that address string in somewhere.
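import whoisit
whoisit.bootstrap()  # load RDAP bootstrap data before querying
ip = '8.8.8.8'
raw = whoisit.ip(ip, raw=True)  # unparsed RDAP response
praw = whoisit.parse(whoisit._bootstrap, 'ip', ip, raw)  # parsed view of the same response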
Cheers for your help.
No problem. If RDAP endpoints ever do start returning addresses in a sane format, it'll get added to the whoisit parser.
Hi,
IP Address contacts don't appear to be getting parsed. Looks like they aren't in the expected fields? Not sure if this is common to all IP records or just the 10 or so I've looked at.
e.g. for 8.8.8.8: https://rdap.arin.net/registry/entity/ABUSE5250-ARIN
Might be able to check if the vcard it makes from the expected field is empty and then try to parse the entry_data field like so.
Could be a bit dodgy, as the \n-separated data might vary?
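A rough sketch of that fallback idea, assuming each address arrives as a jCard adr property of the form [name, params, type, value]. The helper name address_lines is hypothetical and is not something whoisit provides. It uses the structured components when any are filled in and otherwise splits the label on newlines.

```python
def address_lines(adr_property):
    """Return address parts from one jCard adr property.

    Prefers the structured value; falls back to splitting the
    free-text label on newlines when the structured value is empty.
    """
    _name, params, _type, value = adr_property[:4]
    components = value if isinstance(value, list) else [value]
    # A component may itself be a list of strings; flatten one level.
    flat = []
    for component in components:
        if isinstance(component, list):
            flat.extend(component)
        else:
            flat.append(component)
    structured = [c.strip() for c in flat if isinstance(c, str) and c.strip()]
    if structured:
        return structured
    label = params.get("label", "") if isinstance(params, dict) else ""
    return [line.strip() for line in label.split("\n") if line.strip()]


empty = ["", "", "", "", "", "", ""]
multi_line = ["adr", {"label": "PO Box 3646\nSouth Brisbane, QLD 4101\nAustralia"},
              "text", empty]
single_line = ["adr", {"label": "6 Cordelia St South Brisbane QLD 4101"},
               "text", empty]

print(address_lines(multi_line))
print(address_lines(single_line))
```

The two sample labels show why this is shaky: one splits into three lines and the other stays as a single string, so the pieces cannot be mapped to street, locality, or postcode with any confidence.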