CrawlScript / WebCollector

WebCollector is an open source web crawler framework based on Java. It provides simple interfaces for crawling the Web; you can set up a multi-threaded web crawler in less than 5 minutes.
https://github.com/CrawlScript/WebCollector
GNU General Public License v3.0
3.07k stars 1.45k forks

A question about the regex #57

Closed wuxiongliu1 closed 6 years ago

wuxiongliu1 commented 7 years ago
/* do not fetch jpg|png|gif */
this.addRegex("-.*\\.(jpg|png|gif).*");
/* do not fetch URLs that contain # */
this.addRegex("-.*#.*");

Why do the regexes here mean "not"? Why aren't URLs that match these rules the ones that get fetched?

Lv9S commented 7 years ago

A leading `-` means "not".
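
To make the reply above concrete, here is a minimal sketch of how a prefix-based URL filter could behave. It is not WebCollector's actual code: the class `PrefixRuleSketch`, its methods `addRule` and `satisfy`, and the example.com URLs are all hypothetical, and the assumption that an unprefixed rule counts as an inclusion rule is mine. The idea is that the first character is stripped off as a flag before the rest is compiled as a regex, so `-.*#.*` is the pattern `.*#.*` registered as an exclusion.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for a crawler's URL filter; NOT WebCollector's real class.
class PrefixRuleSketch {
    private final List<Pattern> positive = new ArrayList<>();
    private final List<Pattern> negative = new ArrayList<>();

    // The first character is a flag, not part of the pattern:
    // '-' registers an exclusion rule, '+' an inclusion rule.
    // Assumption: a rule with no prefix is treated as an inclusion rule.
    PrefixRuleSketch addRule(String rule) {
        if (rule.isEmpty()) {
            return this;
        }
        char prefix = rule.charAt(0);
        if (prefix == '-') {
            negative.add(Pattern.compile(rule.substring(1)));
        } else if (prefix == '+') {
            positive.add(Pattern.compile(rule.substring(1)));
        } else {
            positive.add(Pattern.compile(rule));
        }
        return this;
    }

    // A URL passes only if no exclusion rule matches the whole URL
    // and at least one inclusion rule does.
    boolean satisfy(String url) {
        for (Pattern p : negative) {
            if (p.matcher(url).matches()) {
                return false;
            }
        }
        for (Pattern p : positive) {
            if (p.matcher(url).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        PrefixRuleSketch rule = new PrefixRuleSketch()
                .addRule("https://example\\.com/.*")   // inclusion rule (hypothetical site)
                .addRule("-.*\\.(jpg|png|gif).*")      // exclude image URLs
                .addRule("-.*#.*");                    // exclude URLs containing '#'

        System.out.println(rule.satisfy("https://example.com/news/1.html")); // true
        System.out.println(rule.satisfy("https://example.com/logo.png"));    // false
        System.out.println(rule.satisfy("https://example.com/page#top"));    // false
    }
}
```

In this sketch the exclusion rules are checked first, so a URL that matches both an inclusion rule and an exclusion rule is still rejected. That would explain why the two `-` rules in the question filter URLs out rather than select them.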