dhowe / AdNauseamV1

*** This is not the current AdNauseam repository: please find the current repo here:
https://github.com/dhowe/AdNauseam
GNU General Public License v3.0

Lack of regex for Google Ad text Ad in some sites #312

Closed CyrusSUEN closed 7 years ago

CyrusSUEN commented 9 years ago

On the following sites, the Google ad slot sometimes shows a text ad instead of an image ad. When a text ad is shown, ADN cannot identify it.

http://www.foxsports.com/ http://www.pcworld.com/ http://www.dailymotion.com/tw

e.g. on foxsports:
title selector: .rh10c
text selector: .rh11c
site selector: .rh1110c

This attached file is the saved webpage of foxsports.

CyrusSUEN commented 9 years ago

Luckily, the text ads on the above sites are similar. The following text ad was found on http://www.dailymotion.com/tw:

<table>
    <tbody>
        <tr>
            <td class="rh1c">
                <div class="rh1">
                    <table>
                        <tbody>
                            <tr>
                                <td class="rh10c">
                                    <div class="rh-box-title rh-title rh10 collapsed-box"><a data-original-click-url="http://www.googleadservices.com/pagead/aclk?sa=L&amp;ai=CSAjfgjBWVanQN8628AWNioKIAt-U950G1-TsmtsBvY2FkgEQASD5qpUXYNkCoAGfwartA8gBAakC00ICmQ5RgT7gAgCoAwHIA5sEqgSZAU_QCb3itPsgqOJDqcRcrlwQZW1ZSmzhQieFn9pA8_ESoh3g4x3fFi63NDYfv3ytMkd09zZHDnjANih35tbZKBdZAuz2iD-NNiTSdV7oHcWGMnXXM6WB7dnS-8m9oe_wwd0TTfN-H41nddwPJdB-uvnGBeR9qkZMqb8VVo1kxRHdn27qxA7VNRR5XWJFuDkGr9L5z3JVw2EMVuAEAYgGAYAHyb7VEtgHAQ&amp;num=1&amp;cid=5GjGQ6f8SOc_LFkRGDzZHsKJ&amp;sig=AOD64_1kCXQtu5TALEsje3IYJBnMffvR2A&amp;client=ca-pub-7019376976432612&amp;adurl=http://formwelkin.com/e-marketing" class="rhtitle rhdefaultcolored" href="http://www.googleadservices.com/pagead/aclk?sa=L&amp;ai=CSAjfgjBWVanQN8628AWNioKIAt-U950G1-TsmtsBvY2FkgEQASD5qpUXYNkCoAGfwartA8gBAakC00ICmQ5RgT7gAgCoAwHIA5sEqgSZAU_QCb3itPsgqOJDqcRcrlwQZW1ZSmzhQieFn9pA8_ESoh3g4x3fFi63NDYfv3ytMkd09zZHDnjANih35tbZKBdZAuz2iD-NNiTSdV7oHcWGMnXXM6WB7dnS-8m9oe_wwd0TTfN-H41nddwPJdB-uvnGBeR9qkZMqb8VVo1kxRHdn27qxA7VNRR5XWJFuDkGr9L5z3JVw2EMVuAEAYgGAYAHyb7VEtgHAQ&amp;num=1&amp;cid=5GjGQ6f8SOc_LFkRGDzZHsKJ&amp;sig=AOD64_1kCXQtu5TALEsje3IYJBnMffvR2A&amp;client=ca-pub-7019376976432612&amp;adurl=http://formwelkin.com/e-marketing" target="_top" title="formwelkin.com/e-marketing"><span>E-commerce 專業建站課程</span></a>
                                    </div>
                                </td>
                            </tr>
                            <tr>
                                <td class="rh11c">
                                    <div class="rh-box-multiframe rh11 rh-multiframe frame0" data-num-frames="2" data-multiframe-type="normal">
                                        <div class="rh110">
                                            <table>
                                                <tbody>
                                                    <tr>
                                                        <td class="rh1100c">
                                                            <div class="rh1100">
                                                                <table>
                                                                    <tbody>
                                                                        <tr>
                                                                            <td class="rh11000c">
                                                                                <div class="rh-body rh-box-body rh11000"><span class="rhbody rhdefaultcolored">Wordpress, PayPal建立網上行銷平台 CEF認可資助課程,按我了解</span>
                                                                                </div>
                                                                            </td>
                                                                        </tr>
                                                                    </tbody>
                                                                </table>
                                                            </div>
                                                        </td>
                                                    </tr>
                                                </tbody>
                                            </table>
                                        </div>
                                        <div class="rh111">
                                            <table>
                                                <tbody>
                                                    <tr>
                                                        <td class="rh1110c">
                                                            <div class="rh-box-url rh-url rh1110">
                                                                <div class="rhurlctr" dir="ltr"><a data-original-click-url="http://www.googleadservices.com/pagead/aclk?sa=L&amp;ai=CSAjfgjBWVanQN8628AWNioKIAt-U950G1-TsmtsBvY2FkgEQASD5qpUXYNkCoAGfwartA8gBAakC00ICmQ5RgT7gAgCoAwHIA5sEqgSZAU_QCb3itPsgqOJDqcRcrlwQZW1ZSmzhQieFn9pA8_ESoh3g4x3fFi63NDYfv3ytMkd09zZHDnjANih35tbZKBdZAuz2iD-NNiTSdV7oHcWGMnXXM6WB7dnS-8m9oe_wwd0TTfN-H41nddwPJdB-uvnGBeR9qkZMqb8VVo1kxRHdn27qxA7VNRR5XWJFuDkGr9L5z3JVw2EMVuAEAYgGAYAHyb7VEtgHAQ&amp;num=1&amp;cid=5GjGQ6f8SOc_LFkRGDzZHsKJ&amp;sig=AOD64_1kCXQtu5TALEsje3IYJBnMffvR2A&amp;client=ca-pub-7019376976432612&amp;adurl=http://formwelkin.com/e-marketing" class="rhfavicon" title="formwelkin.com/e-marketing" href="http://www.googleadservices.com/pagead/aclk?sa=L&amp;ai=CSAjfgjBWVanQN8628AWNioKIAt-U950G1-TsmtsBvY2FkgEQASD5qpUXYNkCoAGfwartA8gBAakC00ICmQ5RgT7gAgCoAwHIA5sEqgSZAU_QCb3itPsgqOJDqcRcrlwQZW1ZSmzhQieFn9pA8_ESoh3g4x3fFi63NDYfv3ytMkd09zZHDnjANih35tbZKBdZAuz2iD-NNiTSdV7oHcWGMnXXM6WB7dnS-8m9oe_wwd0TTfN-H41nddwPJdB-uvnGBeR9qkZMqb8VVo1kxRHdn27qxA7VNRR5XWJFuDkGr9L5z3JVw2EMVuAEAYgGAYAHyb7VEtgHAQ&amp;num=1&amp;cid=5GjGQ6f8SOc_LFkRGDzZHsKJ&amp;sig=AOD64_1kCXQtu5TALEsje3IYJBnMffvR2A&amp;client=ca-pub-7019376976432612&amp;adurl=http://formwelkin.com/e-marketing" target="_top"><img alt="" src="http://t1.gstatic.com/favicon?q=tbn:ANd9GcRqEfUhSgMMLOwPH6stvXf0OjNbvabCtvmUWnxqF2Viwg3MJsugrWUOjEGUkNrKdzS1bNhbFPm-yNIpxuE" height="16"></a><a 
data-original-click-url="http://www.googleadservices.com/pagead/aclk?sa=L&amp;ai=CSAjfgjBWVanQN8628AWNioKIAt-U950G1-TsmtsBvY2FkgEQASD5qpUXYNkCoAGfwartA8gBAakC00ICmQ5RgT7gAgCoAwHIA5sEqgSZAU_QCb3itPsgqOJDqcRcrlwQZW1ZSmzhQieFn9pA8_ESoh3g4x3fFi63NDYfv3ytMkd09zZHDnjANih35tbZKBdZAuz2iD-NNiTSdV7oHcWGMnXXM6WB7dnS-8m9oe_wwd0TTfN-H41nddwPJdB-uvnGBeR9qkZMqb8VVo1kxRHdn27qxA7VNRR5XWJFuDkGr9L5z3JVw2EMVuAEAYgGAYAHyb7VEtgHAQ&amp;num=1&amp;cid=5GjGQ6f8SOc_LFkRGDzZHsKJ&amp;sig=AOD64_1kCXQtu5TALEsje3IYJBnMffvR2A&amp;client=ca-pub-7019376976432612&amp;adurl=http://formwelkin.com/e-marketing" class="rhurl rhdefaultcolored" title="formwelkin.com/e-marketing" href="http://www.googleadservices.com/pagead/aclk?sa=L&amp;ai=CSAjfgjBWVanQN8628AWNioKIAt-U950G1-TsmtsBvY2FkgEQASD5qpUXYNkCoAGfwartA8gBAakC00ICmQ5RgT7gAgCoAwHIA5sEqgSZAU_QCb3itPsgqOJDqcRcrlwQZW1ZSmzhQieFn9pA8_ESoh3g4x3fFi63NDYfv3ytMkd09zZHDnjANih35tbZKBdZAuz2iD-NNiTSdV7oHcWGMnXXM6WB7dnS-8m9oe_wwd0TTfN-H41nddwPJdB-uvnGBeR9qkZMqb8VVo1kxRHdn27qxA7VNRR5XWJFuDkGr9L5z3JVw2EMVuAEAYgGAYAHyb7VEtgHAQ&amp;num=1&amp;cid=5GjGQ6f8SOc_LFkRGDzZHsKJ&amp;sig=AOD64_1kCXQtu5TALEsje3IYJBnMffvR2A&amp;client=ca-pub-7019376976432612&amp;adurl=http://formwelkin.com/e-marketing" target="_top"><span>formwelkin.com/e-marketing</span></a>
                                                                </div>
                                                            </div>
                                                        </td>
                                                    </tr>
                                                </tbody>
                                            </table>
                                        </div>
                                    </div>
                                </td>
                            </tr>
                            <tr>
                                <td class="rh12c">
                                    <div class="rh-box-breadcrumbs rh12 bcactive0">
                                        <div class="target target0 bcfirst" data-bc-index="0">
                                            <div class="unit" data-bc-index="0"></div>
                                        </div>
                                        <div class="target target1 bclast" data-bc-index="1">
                                            <div class="unit" data-bc-index="1"></div>
                                        </div>
                                    </div>
                                </td>
                            </tr>
                        </tbody>
                    </table>
                </div>
            </td>
            <td class="rh2c">
                <div class="rh-box-empty rh2"></div>
            </td>
            <td class="rh3c">
                <div class="rh-box-button rh-nessie-button-flat rh3">
                    <div class="rhbutton-container"><a data-original-click-url="http://www.googleadservices.com/pagead/aclk?sa=L&amp;ai=CSAjfgjBWVanQN8628AWNioKIAt-U950G1-TsmtsBvY2FkgEQASD5qpUXYNkCoAGfwartA8gBAakC00ICmQ5RgT7gAgCoAwHIA5sEqgSZAU_QCb3itPsgqOJDqcRcrlwQZW1ZSmzhQieFn9pA8_ESoh3g4x3fFi63NDYfv3ytMkd09zZHDnjANih35tbZKBdZAuz2iD-NNiTSdV7oHcWGMnXXM6WB7dnS-8m9oe_wwd0TTfN-H41nddwPJdB-uvnGBeR9qkZMqb8VVo1kxRHdn27qxA7VNRR5XWJFuDkGr9L5z3JVw2EMVuAEAYgGAYAHyb7VEtgHAQ&amp;num=1&amp;cid=5GjGQ6f8SOc_LFkRGDzZHsKJ&amp;sig=AOD64_1kCXQtu5TALEsje3IYJBnMffvR2A&amp;client=ca-pub-7019376976432612&amp;adurl=http://formwelkin.com/e-marketing" class="rhbutton" href="http://www.googleadservices.com/pagead/aclk?sa=L&amp;ai=CSAjfgjBWVanQN8628AWNioKIAt-U950G1-TsmtsBvY2FkgEQASD5qpUXYNkCoAGfwartA8gBAakC00ICmQ5RgT7gAgCoAwHIA5sEqgSZAU_QCb3itPsgqOJDqcRcrlwQZW1ZSmzhQieFn9pA8_ESoh3g4x3fFi63NDYfv3ytMkd09zZHDnjANih35tbZKBdZAuz2iD-NNiTSdV7oHcWGMnXXM6WB7dnS-8m9oe_wwd0TTfN-H41nddwPJdB-uvnGBeR9qkZMqb8VVo1kxRHdn27qxA7VNRR5XWJFuDkGr9L5z3JVw2EMVuAEAYgGAYAHyb7VEtgHAQ&amp;num=1&amp;cid=5GjGQ6f8SOc_LFkRGDzZHsKJ&amp;sig=AOD64_1kCXQtu5TALEsje3IYJBnMffvR2A&amp;client=ca-pub-7019376976432612&amp;adurl=http://formwelkin.com/e-marketing" target="_top" title="formwelkin.com/e-marketing"><div class="icon-container"><img alt="" class="icon" src="http://pagead2.googlesyndication.com/pagead/images/nessie_icon_thin_arrow_big_white.png"></div></a>
                    </div>
                </div>
            </td>
            <td class="rh4c">
                <div class="rh-box-empty rh4"></div>
            </td>
        </tr>
    </tbody>
</table>

It has the same selectors as the text ads on http://www.foxsports.com/
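As a rough sketch of how these selectors could be used to pull the title, text, and site out of such a text ad: the function name `parseGoogleTextAd` and the shape of the returned object are hypothetical and not taken from ADN's code, and the inner class names (`rhtitle`, `rhbody`, `rhurl`) are assumed to be as stable as the cell classes in the markup above.

```javascript
// Hypothetical helper, not part of ADN: extracts the fields of a Google
// text ad using the class names seen in the markup above.
function parseGoogleTextAd(root) {
  // Title cell: the link text is the ad title, its href is the click URL.
  const titleLink = root.querySelector('.rh10c a.rhtitle');
  // Body cell: the descriptive text of the ad.
  const body = root.querySelector('.rh11c .rhbody');
  // URL cell: the advertiser's visible site.
  const siteLink = root.querySelector('.rh1110c a.rhurl');

  // Without a title link, do not treat the element as a text ad.
  if (!titleLink) return null;

  // Trimmed text of an element, or an empty string if it is missing.
  const clean = (el) => (el ? el.textContent.trim() : '');

  return {
    title: clean(titleLink),
    text: clean(body),
    site: clean(siteLink),
    targetUrl: titleLink.getAttribute('href'),
  };
}
```

In a page this could be called as `parseGoogleTextAd(document)` inside the ad's frame, or with the outer `<table>` element as `root`. It returns `null` when no title link is found, so image ads would simply be skipped.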

But I think we should fall back to https://github.com/dhowe/AdNauseam/issues/27 when ADN fails to import certain ads into the menu.

dhowe commented 9 years ago

Why do you suggest this? Are you talking about "importing" or parsing text-ads? Why not just create an elemhide.js rule that covers all cases like the above?
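To illustrate what a single rule covering all of these cases might look like, here is a minimal sketch. The rule shape (`name`, `selector`, `fields`) and the `applyRule` function are hypothetical, and the actual format used in elemhide.js may differ. The container selector `td.rh1c` is taken from the dailymotion markup above and is only assumed to be present on the other sites.

```javascript
// Hypothetical rule shape; the real elemhide.js format may differ.
const googleTextAdRule = {
  name: 'google-text-ad',
  // Container cell wrapping one text ad (assumed to be shared across sites).
  selector: 'td.rh1c',
  // Sub-selectors, resolved relative to each container.
  fields: {
    title: '.rh10c',
    text: '.rh11c .rhbody',
    site: '.rh1110c',
  },
};

// Applies one rule to a document-like object and returns the parsed ads.
function applyRule(doc, rule) {
  const ads = [];
  for (const container of doc.querySelectorAll(rule.selector)) {
    const ad = { rule: rule.name };
    for (const [field, sel] of Object.entries(rule.fields)) {
      const el = container.querySelector(sel);
      ad[field] = el ? el.textContent.trim() : '';
    }
    // Keep only containers that actually yielded a title.
    if (ad.title) ads.push(ad);
  }
  return ads;
}
```

Because `.rh1110c` is nested inside `.rh11c` in the markup, the text field here uses the narrower `.rh11c .rhbody` so that the site name is not included in the ad text.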

CyrusSUEN commented 9 years ago

I've added the rule to elemhide.js, but I'm not sure it will work. As in https://github.com/dhowe/AdNauseam/issues/200, we have rules but ADN still fails to load the text ad.

Another difficulty is that this is hard to test, because the site does not serve me a text ad every time.