ghostery / adblocker

Efficient embeddable adblocker library
https://www.ghostery.com
Mozilla Public License 2.0
809 stars 101 forks source link

Network filter matches only first entry if multiple domains specified #4450

Closed seia-soto closed 3 hours ago

seia-soto commented 3 hours ago

While debugging html filtering functionality, I found that interesting behavior. The same filter with different domains specified didn't work. It had a relationship with domain ordering, so I ran a simple test and found that adblocker engine was not finding hostnames except for the very first entry.

This has to be the first priority issue to resolve since it's widely affecting functionalities including network filtering, html filtering depending on network filtering.

Try to run the following testing file:

import { expect } from 'chai';
import 'mocha';
import {
  detectFilterType,
  FiltersEngine,
  FilterType,
  NetworkFilter,
  Request,
} from '../src/index.js';

it('toString() matches', () => {
  const filters = [
    // The carrot sign looks invalid? ||a.com^,b.com,c.com
    // When parsed, it has `filter` prop of `,b.com,c.com` and `hostname` prop of `a.com`
    `||a.com,b.com,c.com`,
    // The cosmetics should be correct since we index hostname strings
    // `a.com,b.com,c.com##+js(test)`,
  ];
  for (const filter of filters) {
    // Note that `parseFilter` (non-plural) has `debug` flag set (IDK why?)
    // Let's just try to assume filter type
    const type = detectFilterType(filter);
    if (type === FilterType.NETWORK) {
      expect(NetworkFilter.parse(filter)?.toString()).to.be.eql(filter);
    }
  }
});

it('matches network filter with multiple domains', () => {
  function createMainFrameRequest(domain: string) {
    return Request.fromRawDetails({
      url: `https://${domain}/`,
      hostname: `https://${domain}`,
      domain,
      type: 'main_frame',
    });
  }

  const engine = FiltersEngine.parse(String.raw`||a.com,b.com,c.com`);
  // Try to make sample requests here
  // Only `a.com` matched, both `b.com` and `c.com` failed
  for (const request of ['a.com', 'b.com', 'c.com'].map((domain) =>
    createMainFrameRequest(domain),
  )) {
    expect(engine.match(request)).to.have.property('match').to.be.eql(true);
  }
});
seia-soto commented 3 hours ago

Out of spec.