Closed (minthemiddle closed this 1 year ago)
The `-f` argument is somewhat similar to Google's advanced search operators. The difference is that it doesn't take a value of its own: the search query is the value. Also, the filter is inclusive only, and it doesn't accept regular expressions. For example, if the search query is "query" and the filter is "url", only links that contain "query" in the URL will be collected. That is equivalent to Google's advanced search operator "allinurl: query". If you think this feature can be improved, you're very welcome to contribute, or I may do it myself when I have some free time.
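To make the behaviour described above concrete, here is a rough sketch of an inclusive, substring-only filter. It is not the script's actual code: `keep_result`, the `results` list, and the case-insensitive comparison are all assumptions made for illustration.

```python
# Illustrative sketch only; names and case handling are assumptions,
# not taken from the script.

def keep_result(result, query, filter_field):
    """Keep a result only if the query text occurs in the chosen field.

    The query is treated as a plain substring, never as a regular
    expression, and there is no way to express "exclude".
    """
    return query.lower() in result[filter_field].lower()


# Hypothetical results with the four filterable fields.
results = [
    {"url": "https://example.com/query-tips", "title": "Tips",
     "text": "Some text", "host": "example.com"},
    {"url": "https://example.org/other", "title": "About query",
     "text": "Some text", "host": "example.org"},
]

# Filter "url" with the search query "query": only the first link is kept,
# because the second one mentions "query" in the title, not in the URL.
kept = [r for r in results if keep_result(r, "query", "url")]

# A regex-looking query is matched literally, so it finds nothing here.
regex_like = [r for r in results if keep_result(r, "exam.*", "url")]
```

With a filter like this, `-f host` can only narrow results to hosts that contain the query text; it cannot drop specific hosts.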
How can I filter to exclude two hosts (wikipedia.org and facebook.com)?
According to the docs, filtering is done via the `-f` argument. `'-f', filter results [url, title, text, host]` is what I find in the script. As `-o json` will output to JSON and is described as `'-o', help='output file [html, csv, json]'`, I expected something along the lines of `-f host REGEX`, but that does not work.
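Since the filter is inclusive and does not take a regular expression, one possible workaround is to drop the unwanted hosts after the links have been collected. The sketch below assumes you already have the links as a list of URL strings (for example, read from the exported file, whose exact layout is not shown in this thread). `EXCLUDED_HOSTS`, `is_excluded`, and `drop_excluded` are hypothetical helper names, not part of the script.

```python
# Post-processing sketch; helper names are hypothetical and not part of the script.
from urllib.parse import urlparse

EXCLUDED_HOSTS = {"wikipedia.org", "facebook.com"}


def is_excluded(url, excluded=EXCLUDED_HOSTS):
    """Return True if the URL's host is an excluded domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == h or host.endswith("." + h) for h in excluded)


def drop_excluded(urls, excluded=EXCLUDED_HOSTS):
    """Return the URLs whose host is not in the excluded set, preserving order."""
    return [u for u in urls if not is_excluded(u, excluded)]


# Hypothetical input: a list of collected links.
links = [
    "https://en.wikipedia.org/wiki/Example",
    "https://www.facebook.com/some-page",
    "https://example.com/",
    "https://notwikipedia.org/",
]
filtered = drop_excluded(links)
```

Matching on the parsed hostname rather than on the raw URL string avoids dropping links that merely mention an excluded domain somewhere in the path or query string.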