dgunning / edgartools

Navigate SEC Edgar data in Python
MIT License
516 stars 101 forks

convert repeat keys into lists when parsing headers #104

Closed benobytes closed 2 months ago

benobytes commented 2 months ago

This PR improves the parsing of filing headers that contain duplicate keys.

Example:

<SEC-DOCUMENT>0001213900-24-076658.txt : 20240906
<SEC-HEADER>0001213900-24-076658.hdr.sgml : 20240906
<ACCEPTANCE-DATETIME>20240906194548
ACCESSION NUMBER:       0001213900-24-076658
CONFORMED SUBMISSION TYPE:  SC 13G
PUBLIC DOCUMENT COUNT:      2
FILED AS OF DATE:       20240906
DATE AS OF CHANGE:      20240906
GROUP MEMBERS:      DANIEL B. ASHER
GROUP MEMBERS:      MITCHELL P. KOPIN       <-- Appears twice

Previously, the header would be parsed as follows:

Screenshot 2024-09-07 at 9 10 43 PM

This PR adds support for the following:

Screenshot 2024-09-07 at 9 11 38 PM
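
For readers without access to the screenshots, the idea in the PR title (repeated keys become lists) can be sketched roughly as below. This is an illustrative sketch only, not the code in this PR or edgartools' actual parser: the function name `parse_header_fields` is hypothetical, and it assumes each header field sits on its own `KEY: value` line, as in the example above.

```python
def parse_header_fields(text):
    """Hypothetical helper: collect 'KEY: value' lines into a dict,
    converting a key that appears more than once into a list of values."""
    fields = {}
    for line in text.splitlines():
        # Skip SGML tag lines (e.g. <SEC-HEADER>...) and lines with no separator
        if line.startswith("<") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key not in fields:
            # First occurrence: store the plain string
            fields[key] = value
        elif isinstance(fields[key], list):
            # Third or later occurrence: extend the existing list
            fields[key].append(value)
        else:
            # Second occurrence: promote the string to a list
            fields[key] = [fields[key], value]
    return fields


sample = """\
ACCESSION NUMBER:       0001213900-24-076658
CONFORMED SUBMISSION TYPE:  SC 13G
GROUP MEMBERS:      DANIEL B. ASHER
GROUP MEMBERS:      MITCHELL P. KOPIN
"""

fields = parse_header_fields(sample)
print(fields["GROUP MEMBERS"])     # ['DANIEL B. ASHER', 'MITCHELL P. KOPIN']
print(fields["ACCESSION NUMBER"])  # 0001213900-24-076658
```

Keys that appear once stay plain strings in this sketch, so callers that only read single-valued fields are unaffected; only repeated keys such as `GROUP MEMBERS` change type.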

dgunning commented 2 months ago

Accepting the fix for handling duplicate keys in headers.