Consider the following URL pattern: https?://(www\.)?toto\.com/media/[^"]+
Pages sometimes reference resources through absolute-path or relative links instead of full URLs. For example, a page on toto.com may contain only /media/test.jpg or ../../media/test.jpg.
In such cases, HG ++ does not find anything.
The search pattern requires a scheme and a host, so it never matches these links.
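The failure can be reproduced with a minimal Python sketch. The HTML fragment is a made-up example, and Python's `re` module stands in for whatever matching HG ++ actually performs.

```python
import re

# URL pattern from the example above.
URL_PATTERN = re.compile(r'https?://(www\.)?toto\.com/media/[^"]+')

# Hypothetical fragment of a page served by toto.com.
html = (
    '<img src="/media/test.jpg"> '
    '<img src="../../media/test.jpg"> '
    '<img src="http://www.toto.com/media/full.jpg">'
)

# Only the full URL is found; both relative links are missed.
matches = [m.group(0) for m in URL_PATTERN.finditer(html)]
print(matches)
```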
There are three possible solutions.
1. Introduce a new property in the dictionary to specify that the search pattern is restricted to a given host. Here, the restriction would be to the toto.com host, and the search pattern would become (.*/)?media/[^"]+.
2. Implement a smart guess that deduces relative patterns from a global one. Depending on the search pattern, this might be complicated.
3. Rework the dictionary: add a domain property and rename the URL pattern to path pattern.
With the third option, it becomes much simpler to handle all these cases.
In a given page, we build a URL pattern from both the domain and the path pattern.
If the current URL matches the domain, we can also search directly for the path pattern.
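The third option could look roughly like the Python sketch below. The entry layout, the field names `domain` and `path`, and the `find_links` helper are all hypothetical. The sketch also assumes that links sit in double-quoted attributes and that relative links start with `/`, with a run of `../`, or with neither.

```python
import re
from urllib.parse import urljoin, urlparse

# Hypothetical dictionary entry: field names are assumptions.
ENTRY = {
    "domain": r'(www\.)?toto\.com',
    "path": r'media/[^"]+',
}

def find_links(page_url, html, entry):
    """Return absolute URLs of the links in html that match the entry."""
    # Full URLs: pattern built from the domain and the path pattern.
    absolute = re.compile(r'https?://' + entry["domain"] + '/' + entry["path"])
    found = [m.group(0) for m in absolute.finditer(html)]

    host = urlparse(page_url).hostname or ""
    if re.fullmatch(entry["domain"], host):
        # The page is on the target domain: also search for the path
        # pattern alone, right after an opening double quote.
        relative = re.compile(r'(?<=")(?:/|(?:\.\./)+)?' + entry["path"])
        for m in relative.finditer(html):
            # Resolve the relative link against the page URL.
            found.append(urljoin(page_url, m.group(0)))
    return found
```

Called with a page URL on toto.com, the helper returns the full URL plus the resolved relative links. Called with a page on another host, it returns only the full URL, since a relative link there would point to that other host.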
The code changes are not the hardest part.
The main cost is that we will have to rewrite the dictionary.