bellingcat / EDGAR

Tool for the retrieval of corporate and financial data from the SEC
https://colab.research.google.com/github/bellingcat/EDGAR/blob/main/notebook/Bellingcat_EDGAR_Tool.ipynb
GNU General Public License v3.0

Searching by category is broken. #23

Closed by GalenReich 1 month ago

GalenReich commented 3 months ago

Currently, the category argument is passed to the API as a URL parameter derived from the TEXT_SEARCH_FILING_CATEGORIES_MAPPING:

&category=form-cat3

which doesn't appear to affect the search results.

Instead, the TEXT_SEARCH_CATEGORY_FORM_GROUPINGS should be used to set the forms URL parameter:

&forms=10-12B,10-12G,18-12B,20FR12B, ...

Additionally, perhaps users should be able to set the forms directly, by providing either a category from the mapping or a list of forms of interest.
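
To make the proposal concrete, here is a minimal sketch of how a category or an explicit list of forms could be resolved into the value of the forms parameter. The names `EXAMPLE_FORM_GROUPINGS`, `resolve_forms`, and `forms_param`, as well as the `"registration_statements"` key, are hypothetical and only stand in for TEXT_SEARCH_CATEGORY_FORM_GROUPINGS; the form list is limited to the four forms quoted above and is not the full grouping.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical stand-in for TEXT_SEARCH_CATEGORY_FORM_GROUPINGS.
# The key name is an assumption, and the list holds only the forms
# quoted in this issue, not the complete grouping.
EXAMPLE_FORM_GROUPINGS: Dict[str, List[str]] = {
    "registration_statements": ["10-12B", "10-12G", "18-12B", "20FR12B"],
}


def resolve_forms(
    category: Optional[str] = None,
    forms: Optional[Iterable[str]] = None,
    groupings: Dict[str, List[str]] = EXAMPLE_FORM_GROUPINGS,
) -> List[str]:
    """Return the list of forms to search, from a category or an explicit list."""
    if category is not None and forms is not None:
        raise ValueError("Provide either a category or a list of forms, not both")
    if forms is not None:
        # The user named the forms of interest directly.
        return list(forms)
    if category is not None:
        if category not in groupings:
            raise ValueError(f"Unknown category: {category!r}")
        return list(groupings[category])
    # Neither given: no form filter.
    return []


def forms_param(forms: Iterable[str]) -> str:
    """Join forms into the comma-separated value used by &forms=..."""
    return ",".join(forms)


if __name__ == "__main__":
    print(forms_param(resolve_forms(category="registration_statements")))
    print(forms_param(resolve_forms(forms=["10-12B", "10-12G"])))
```

Rejecting the case where both arguments are given is one possible design choice; merging the two lists would be another.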

NauelSerraino commented 2 months ago

Hi, can I try to solve this? Thanks :)

GalenReich commented 2 months ago

Hi @NauelSerraino, that would be great, thank you! 🙌 I'll assign this issue to you. Hopefully there isn't any hidden complexity, but feel free to share any issues you have, and do tag me in your PR when it's ready. I can also unassign you if you need to stop working on it for any reason.

NauelSerraino commented 1 month ago

Hi @GalenReich, I'm working on the issue. Since the TEXT_SEARCH_FILING_CATEGORIES_MAPPING does not affect the search results, is it OK to simply substitute the TEXT_SEARCH_CATEGORY_FORM_GROUPINGS inside the filing_type?

Or is it more reasonable to create a new argument within the text search?

I've currently adopted the latter, and the text_search works fine with both &category=... and &forms=... evaluated.

GalenReich commented 1 month ago

I couldn't see any effects of the category parameter, and it sounds like you haven't either, which gives me confidence that it isn't doing anything, so I'd be happy with its removal!
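
For illustration, a rough sketch of what the query string could look like once category is dropped and only forms is sent. The function name `build_search_query` and the `q` keyword parameter are assumptions made for this example, not taken from the tool's code; only the forms parameter and its comma-separated format come from this issue.

```python
from typing import Iterable
from urllib.parse import urlencode


def build_search_query(keywords: str, forms: Iterable[str] = ()) -> str:
    """Build a query string that filters by forms and sends no category.

    `q` is an assumed name for the keyword parameter; `forms` follows the
    comma-separated format shown in this issue.
    """
    params = {"q": keywords}
    forms = list(forms)
    if forms:
        params["forms"] = ",".join(forms)
    # safe="," keeps the commas between forms unescaped in the URL.
    return urlencode(params, safe=",")


if __name__ == "__main__":
    print(build_search_query("climate risk", ["10-12B", "10-12G"]))
```

When no forms are passed, the forms parameter is omitted entirely, so the search is unfiltered rather than filtered by an empty value.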