bellingcat / EDGAR

Tool for the retrieval of corporate and financial data from the SEC
https://colab.research.google.com/github/bellingcat/EDGAR/blob/main/notebook/Bellingcat_EDGAR_Tool.ipynb
GNU General Public License v3.0
95 stars 12 forks

Location based search #22 #28

JackCollins91 closed 3 weeks ago

JackCollins91 commented 1 month ago

Closes Issue #22

edgar-tool now supports two new parameters: inc_in for 'incorporated in' and peo_in for 'principal executive offices in'.

Both parameters take a string like "NY" or "NY, OH". Any location code in edgar_tool.constants.TEXT_SEARCH_LOCATIONS_MAPPING is supported.

The location names in TEXT_SEARCH_LOCATIONS_MAPPING.values() are also accepted. Whitespace and upper/lower casing are ignored, so "mexico,ny" and "MexicO, Ny" will both convert to "NY,O5".

Malformed input results in the edgar_tool.constants.TEXT_SEARCH_LOCATIONS_MAPPING dictionary being printed to the console so the user can see which codes they can use.
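For readers who want to see the normalization described above in one place, here is a minimal sketch. It is not the code from this PR. `LOCATIONS` is a hypothetical four-entry stand-in for TEXT_SEARCH_LOCATIONS_MAPPING, the "M4" code for North Korea is an assumption that should be checked against the real mapping, and `parse_locations` is a made-up name. Sorting the output is also an assumption, chosen only so that "mexico,ny" yields "NY,O5" as stated above. Because some names contain a comma themselves, the sketch tries the longest run of comma-separated parts first.

```python
# Hypothetical subset of edgar_tool.constants.TEXT_SEARCH_LOCATIONS_MAPPING.
LOCATIONS = {
    "NY": "New York",
    "OH": "Ohio",
    "O5": "Mexico",
    "M4": "Korea, Democratic People's Republic of",  # code is an assumption
}


def _key(text):
    """Lowercase and drop all whitespace so comparisons ignore both."""
    return "".join(str(text).split()).lower()


# Lookup from a normalized code or normalized name to the canonical code.
_LOOKUP = {_key(code): code for code in LOCATIONS}
_LOOKUP.update({_key(name): code for code, name in LOCATIONS.items()})


def parse_locations(raw):
    """Convert e.g. "MexicO, Ny" to "NY,O5"; raise ValueError if unrecognized."""
    parts = str(raw).split(",")
    codes = set()
    i = 0
    while i < len(parts):
        # Try the longest run of parts first, so that names which contain
        # a comma themselves are matched as a whole.
        for j in range(len(parts), i, -1):
            candidate = _key(",".join(parts[i:j]))
            if candidate in _LOOKUP:
                codes.add(_LOOKUP[candidate])
                i = j
                break
        else:
            raise ValueError(f"Unrecognized location: {parts[i].strip()!r}")
    # Sorted output is an assumption made to match the example in this PR.
    return ",".join(sorted(codes))
```

A caller could reproduce the malformed-input behaviour by catching the `ValueError` and printing the mapping, for example `except ValueError: print(LOCATIONS)`.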

I tested the following commands, covering correct, incorrect, and unrelated use cases. The results of that run are in the attached file.

$ poetry run edgar-tool text_search Tsunami Hazards --start_date "2019-06-01" --end_date "2024-01-01" --output "results.csv" --peo_in "NY, OH"
$ poetry run edgar-tool text_search Tsunami Hazards --start_date "2019-06-01" --end_date "2024-01-01" --output "results.csv" --peo_in "NY"
$ poetry run edgar-tool text_search Tsunami Hazards --start_date "2019-06-01" --end_date "2024-01-01" --output "results.csv" --inc_in "NY, OH, O5"
$ poetry run edgar-tool text_search Tsunami Hazards --start_date "2019-06-01" --end_date "2024-01-01" --output "results.csv" --inc_in "NY"
$ poetry run edgar-tool text_search Tsunami Hazards --start_date "2019-06-01" --end_date "2024-01-01" --output "results.csv" --peo_in "NY, Korea, Democratic People's Republic of, OH, Mexico"
$ poetry run edgar-tool text_search Tsunami Hazards --start_date "2019-06-01" --end_date "2024-01-01" --output "results.csv" --inc_in "Korea, Democratic People's Republic of"
$ poetry run edgar-tool text_search Tsunami Hazards --output "results.csv" --inc_in "Mexico"
$ poetry run edgar-tool text_search Tsunami Hazards --start_date "2019-06-01" --end_date "2024-01-01" --output "results.csv" --peo_in "mexico"
$ poetry run edgar-tool text_search Tsunami Hazards --start_date "2019-06-01" --end_date "2024-01-01" --output "results.csv" --inc_in "kOrEa, demOcratic people's republic of, OH"
$ poetry run edgar-tool text_search Tsunami Hazards --output "results.csv" --peo_in "hello world"
$ poetry run edgar-tool text_search Tsunami Hazards --output "results.csv" --inc_in 123
$ poetry run edgar-tool text_search Tsunami Hazards --start_date "2019-06-01" --end_date "2024-01-01" --output "results.csv" --inc_in "NY" --peo_in "NY, OH"
$ poetry run edgar-tool text_search Tsunami Hazards --start_date "2019-06-01" --end_date "2024-01-01" --output "results.csv"

test cli session.txt

JackCollins91 commented 3 weeks ago

Hi @GalenReich, the PR is updated per your request. I also updated the initial comment to reflect the changes. The attached .txt file shows the results of the latest testing, including some new test cases, which are detailed there.