AdguardTeam / cname-trackers

This repository contains a list of popular CNAME trackers
https://adguard.com/
MIT License
386 stars 37 forks

combined_disguised_microsites_rpz.txt broken for unbound #95

Closed slavkoja closed 2 months ago

slavkoja commented 7 months ago

Hi,

For the last few days I have been seeing this in the unbound (1.17.1) log (wrapped):

error: raw.githubusercontent.com/AdguardTeam/cname-trackers/master/data/combined_disguised_microsites_rpz.txt \
   parse failure RR[256]: Domainname length overflow in \
   'api.agent.ip.datadoghq.oohmediaomprooohmedialient.agent.datadoghq.usage-oohmediaomprooohmediaessprofiles.agent.datadoghq.datadoghq-enoohmediarypted.datadoghq.oohmediaomprooohmediaessprofiles.datadoghq.oohmediaomprooohmediaesspage.api.agent.datadoghq.com CNAME .'

As a result, this RPZ zone has not been updated since the last change.

A quick investigation shows that this line should be OK, as the domain name is 253 chars long, but it seems that the unbound parser is somewhat broken and counts the whole line length (which is 261 chars) for RPZ. I am afraid that it will not be fixed in unbound soon (if at all), so could you please look into it and limit the RPZ formats so that the max line length fits within 256?

Note: more lines may be affected; unbound reports only the first one...
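
Since unbound stops at the first failing record, a small script can list every candidate line in the file. The following is a minimal sketch in Python, not part of the repository: the function name `find_long_rpz_lines` is hypothetical, the record shape `<name> CNAME .` is taken from the log above, and the 256 threshold is the limit suggested in this comment rather than a value confirmed from unbound itself.

```python
def find_long_rpz_lines(lines, max_len=256):
    """Return (line_number, name_length, line_length) for each over-long record.

    max_len is an assumption taken from the limit suggested in this thread.
    """
    hits = []
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        # Skip blank lines, comments and directives such as $TTL.
        if not line or line.startswith((";", "$")):
            continue
        # The owner name is the first whitespace-separated field.
        name = line.split()[0]
        if len(line) > max_len:
            hits.append((number, len(name), len(line)))
    return hits
```

It can be run over a downloaded copy of the list, for example with `find_long_rpz_lines(open("combined_disguised_microsites_rpz.txt"))`, to see how many records exceed the threshold.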

slavkoja commented 7 months ago

BTW, it seems that api.agent.datadoghq.com has a wildcard record:

dig +noall +answer cname xyz.api.agent.datadoghq.com
xyz.api.agent.datadoghq.com. 599 IN CNAME   metrics.agent.datadoghq.com.

regards

hagezi commented 7 months ago

Regex for extracting valid domains for RPZ: (?=^.{4,244}$)(^(?:[a-zA-Z0-9](?:(?:[a-zA-Z0-9\-]){0,61}[a-zA-Z0-9])?\.)+([a-zA-Z]{2,}|xn--[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])$)
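
To illustrate how this pattern could be applied, here is a minimal sketch in Python. The pattern is copied verbatim from the line above; the helper name `filter_rpz_domains` is hypothetical, and nothing here claims to be how the repository or any other list actually builds its RPZ output. The lookahead caps the whole name at 244 chars and each label at 63 chars; the comment does not state why 244 was chosen.

```python
import re

# Pattern copied verbatim from the comment above, split only for readability.
VALID_RPZ_DOMAIN = re.compile(
    r"(?=^.{4,244}$)"
    r"(^(?:[a-zA-Z0-9](?:(?:[a-zA-Z0-9\-]){0,61}[a-zA-Z0-9])?\.)+"
    r"([a-zA-Z]{2,}|xn--[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])$)"
)


def filter_rpz_domains(domains):
    """Keep only the names that the pattern accepts (hypothetical helper)."""
    kept = []
    for domain in domains:
        candidate = domain.strip()  # drop surrounding whitespace and newlines
        if VALID_RPZ_DOMAIN.match(candidate):
            kept.append(candidate)
    return kept
```

Names that pass can then be written out as `<name> CNAME .` records; names that fail, such as the over-long datadoghq entry from the log, are simply left out.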

slavkoja commented 6 months ago

I am not sure I understand you. The RPZ is updated directly by unbound:

rpz:
    url:            https://raw.githubusercontent.com/AdguardTeam/cname-trackers/master/data/combined_disguised_microsites_rpz.txt
    zonefile:       /var/lib/unbound/rpz/microsite.adguard.lan.zone
    ...

AFAIK, the zone and its file are updated only after unbound has parsed the zone fetched from the url, and that parsing fails...

jellizaveta commented 2 months ago

After the fix, long rules are now excluded from the RPZ file build, for example:

api.agent.datadoghq.252fwww2ompro252fwww2lient.agent.datadoghq.usage-252fwww2ompro252fwww2essmgo.agent.datadoghq.datadoghq-confluence.datadoghq.drive-252fwww2ompro252fwww2essmgo.datadoghq.bacjaase-oohmediaomprooohmediaessclaeepls.api.agent.datadoghq.com CNAME .

Task closed.