Closed sbv-csis closed 1 year ago
Thanks for spotting this. dnslib does actually parse the data correctly into bytes (as the repr shows); the problem arises when the data is printed. The data is converted to unicode (including the \x00 characters), but these are not visible when the string is printed. The RFC is a bit vague. In theory we should only accept ASCII characters in the text representation of TXT records and escape everything else, but I think people would now expect to be able to use UTF-8, so we have to be careful about encoding non-printable characters. I have made some changes in the latest version (0.9.20) which should fix this and appear to work with your example, though I suspect the behaviour will still differ from dig in some cases. If this matters, it is usually safer to work with the raw bytes of the TXT record. Let me know if this fixes your problem.
% python -m dnslib.client --server 8.8.8.8:53 smartjailmail.com TXT
...
smartjailmail.com. 3600 IN TXT "google-site-verification=7Avm2jKuluvrgko_FgTUqYqlYpvYu6hMf\005\000\000\000\000\000\000\000DQ"
...
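For anyone who wants dig-style output regardless of library version, here is a minimal sketch of escaping raw TXT bytes by hand. It is plain Python and does not use dnslib; the function name escape_txt is hypothetical, and the choice to escape every byte outside printable ASCII (plus the quote and backslash) is an assumption about what dig does, not a description of dnslib's behaviour in 0.9.20.

```python
def escape_txt(data: bytes) -> str:
    """Render raw TXT bytes in an ASCII-only presentation form.

    Assumption: printable ASCII is kept as-is, '"' and '\\' get a
    backslash prefix, and every other byte becomes \\DDD (3 decimal digits).
    """
    parts = []
    for b in data:
        if b in (0x22, 0x5C):          # double quote or backslash
            parts.append("\\" + chr(b))
        elif 0x20 <= b <= 0x7E:        # printable ASCII
            parts.append(chr(b))
        else:                          # control bytes and anything >= 0x7F
            parts.append("\\%03d" % b)
    return "".join(parts)


if __name__ == "__main__":
    # Made-up sample bytes with embedded control characters.
    print(escape_txt(b"hMf\x05\x00\x00DQ"))
```

Because the function works on bytes rather than on a decoded string, the NUL bytes stay visible in the output instead of being printed as invisible characters.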
I'm not sure if it's a problem or not, but I've recently been comparing some dig output to dnslib output for TXT records, and I'm not sure what to think about TXT records with bytes in them that translate to unicode chars. For example:
And via dnslib.client:
and via repr on the RD.data property:
As I read the code, dnslib took the bytes and tried to parse them as utf-8, discarding any non-utf8 chars, and dig escapes anything outside of ASCII, perhaps?
When I read the RFC, \DDD is indeed allowed:
but I'm not quite sure what to expect of it with regards to dnslib. As I read the RFC, the encoding of TXT records is not prescribed :shrug: I would worry that a TXT record with some kind of esoteric encoding would break the dnslib way of turning the TXT records into text again.
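To make the \DDD point concrete: in the RFC 1035 master-file syntax, \DDD stands for the octet with that decimal value, so an ASCII-only presentation form can be turned back into the original bytes whatever encoding the record's author had in mind. The sketch below is plain Python and independent of dnslib; unescape_txt is a hypothetical helper name, and it assumes the input contains only ASCII characters.

```python
def unescape_txt(text: str) -> bytes:
    """Turn an escaped TXT presentation string back into raw bytes.

    Assumption: input is ASCII-only; \\DDD is a decimal octet value and
    a backslash before any other character means that character itself.
    """
    raw = text.encode("ascii")
    out = bytearray()
    i = 0
    while i < len(raw):
        if raw[i] != 0x5C:                      # ordinary character
            out.append(raw[i])
            i += 1
            continue
        digits = raw[i + 1:i + 4]
        if len(digits) == 3 and digits.isdigit():
            value = int(digits)
            if value > 255:
                raise ValueError("escape out of range: \\%s" % digits.decode())
            out.append(value)
            i += 4
        elif i + 1 < len(raw):                  # \X for a single character
            out.append(raw[i + 1])
            i += 2
        else:
            raise ValueError("dangling backslash at end of input")
    return bytes(out)


if __name__ == "__main__":
    # Prints the repr of the recovered bytes.
    print(unescape_txt("hMf\\005\\000DQ"))
```

Since the result is bytes, the caller decides whether and how to decode it, which sidesteps the worry about esoteric encodings breaking a text round-trip.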