Closed GalenReich closed 7 months ago
Great news: I just tested it, and it already works with the current implementation:
python3 main.py text_search \"John Doe\" Pharmaceuticals -o "results_test.csv" --min_wait 3.0 --max_wait 4.0 -r 3 -b "chrome"
This results in a search for "John Doe" Pharmaceuticals on the search page; it's just a matter of escaping the quotes.
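To illustrate why the escaping matters, here is a minimal sketch that uses Python's shlex module to mimic POSIX shell word splitting on the command above. It assumes the tool joins its keyword arguments with spaces before sending them to the search page; that joining step is an assumption for illustration, not something confirmed from the code.

```python
import shlex

# The command as typed in the shell (raw string keeps the backslashes).
command = r'python3 main.py text_search \"John Doe\" Pharmaceuticals'

# POSIX-style splitting: \" becomes a literal quote character in the argument.
argv = shlex.split(command)

# Everything after the subcommand is treated as keywords here (assumption).
keywords = argv[3:]

# Assumed joining step: keywords separated by single spaces.
search_text = " ".join(keywords)

print(keywords)
print(search_text)
```

Without the backslashes, the shell would strip the quotes and pass John Doe as a single argument with no quote characters in it.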
That being said, should we consider removing the exact_search flag? We don't need it if we properly document this way of performing an exact search.
Hey :) I am working on this at the moment. I suggest removing the --exact_search flag in favor of the inline syntax only (e.g. \"John Doe\"), because the two can interact in unexpected ways and produce less precise searches than the user intended. To reproduce, compare the search parameters on the page between:
# With inline exact search keywords and exact_search = False
python3 main.py text_search \"John Doe\" Pharmaceuticals -o "results_test.csv" --min_wait 3.0 --max_wait 4.0 -r 2 -b "chrome" -h False --exact_search False
# With inline exact search keywords and exact_search = True
python3 main.py text_search \"John Doe\" Pharmaceuticals -o "results_test.csv" --min_wait 3.0 --max_wait 4.0 -r 2 -b "chrome" -h False --exact_search
In the first case, the search text is as expected: "John Doe" Pharmaceuticals
However, when inline exact search keywords and the exact_search flag are both used, an extra quote is added in the process, so the rendered search text is ""John Doe" Pharmaceuticals (not sure why the closing quote is not added there), which returns broader results (checked on a few test searches).
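For anyone checking this kind of case in a test, a quick sanity check is to count the double quotes in the rendered search text. The helper below is hypothetical (it is not part of the repository) and only flags an odd number of quote characters, which is what the malformed text above has.

```python
def has_balanced_quotes(search_text: str) -> bool:
    """Return True when the text contains an even number of double quotes.

    Hypothetical helper for illustration; an even count does not prove the
    quotes are placed correctly, it only rules out a stray one.
    """
    return search_text.count('"') % 2 == 0


expected = '"John Doe" Pharmaceuticals'    # inline syntax only
malformed = '""John Doe" Pharmaceuticals'  # inline syntax plus exact_search flag

print(has_balanced_quotes(expected))
print(has_balanced_quotes(malformed))
```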
I've updated the README and proposed removing the exact_search flag in this PR: https://github.com/bellingcat/EDGAR/pull/16
Currently, keyword search is either exact (keyword order must match for all keywords) or inexact (keywords may match in any order).
It would be good if searches could use a mix of exact and inexact keyword matches, e.g. "John Doe" Pharmaceuticals
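As a rough sketch of what "mixed" means here, the snippet below splits a query into quoted phrases (exact matches) and bare keywords (inexact matches). The function name split_query and the regular expression are illustrative assumptions; this is not how the tool or the search page actually parses queries.

```python
import re

# A quoted phrase ("...") or, failing that, a run of non-whitespace characters.
TOKEN = re.compile(r'"([^"]+)"|(\S+)')


def split_query(query: str) -> tuple[list[str], list[str]]:
    """Split a query into (exact phrases, loose keywords). Hypothetical helper."""
    phrases: list[str] = []
    keywords: list[str] = []
    for phrase, word in TOKEN.findall(query):
        if phrase:
            phrases.append(phrase)   # came from inside double quotes
        else:
            keywords.append(word)    # unquoted keyword
    return phrases, keywords


phrases, keywords = split_query('"John Doe" Pharmaceuticals')
print(phrases)
print(keywords)
```

Here "John Doe" would be matched as an exact phrase while Pharmaceuticals stays a loose keyword.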