Closed by msharifbd 1 month ago
If you are processing hundreds of filings, some of them will eventually fail, so you need to handle that in a try/except block. Do that and print the filing that failed.
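A minimal sketch of that pattern, with stand-in data so it runs without network access. The `extract_item_1c` function and the sample records are hypothetical placeholders for the real per-filing work (for example `filing.obj()` followed by the Item 1C lookup); only the try/except structure is the point here.

```python
# Hypothetical stand-in for the real per-filing step; it raises for
# records flagged as broken so the error path can be exercised.
def extract_item_1c(filing):
    if filing["broken"]:
        raise ValueError("could not parse document")
    return f"Item 1C text for {filing['url']}"


def collect(filings):
    """Process every filing, reporting failures instead of stopping."""
    results, failures = [], []
    for filing in filings:
        try:
            text = extract_item_1c(filing)
        except Exception as exc:
            # Print the filing that failed, remember it, and move on
            print(f"FAILED {filing['url']}: {exc!r}")
            failures.append(filing["url"])
            continue
        results.append({"url": filing["url"], "text": text})
    return results, failures


# Illustrative records only; these URLs are not real filings
sample = [
    {"url": "https://example.com/a.htm", "broken": False},
    {"url": "https://example.com/b.htm", "broken": True},
    {"url": "https://example.com/c.htm", "broken": False},
]
results, failures = collect(sample)
```

Keeping the failed URLs in a separate list makes it possible to inspect or retry them later without rerunning the whole batch.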
Thanks for the prompt reply. I think this is the URL that is breaking: https://www.sec.gov/Archives/edgar/data/1811999/000109690624001512/fmhs-20231231.htm
Do you know how to take care of this issue? Should I add something to my code so that this filing is skipped? If so, what should I add?
Add this to your code:
if 'Item 1C' not in TenK.items:
    continue
Hi, I included your code as follows:
# Create a list to store the Item 1C text
item1c_texts = []
# Iterate over each filing
for filing in filings2:
    url = filing.document.url
    cik = filing.cik
    # No trailing commas here: a trailing comma would wrap the value in a tuple
    filing_date = filing.header.filing_date
    reporting_date = filing.header.period_of_report
    comn = filing.company
    # Extract the text for Item 1C
    TenK = filing.obj()
    if 'Item 1C' not in TenK.items:
        continue
    item1c_text = TenK['Item 1C']
    item1c_texts.append({
        'CIK': cik,
        'Filing Date': str(filing_date),
        'Item 1c Text': item1c_text,
        'url': url,
        'reporting_date': str(reporting_date),
        'comn': comn
    })
# Create a DataFrame from the Item 1C text data
item1c_df = pd.DataFrame(item1c_texts)
But it still shows the same errors.
Can you give me some code that skips the CIKs or URLs on which it breaks?
Hi, it finally worked. I got help from Stack Overflow. Thank you very much for your help and for your nice module. Here is the code that worked for me:
# pip install edgartools
import pandas as pd
from edgar import *
# Tell the SEC who you are
set_identity("My Name myemail@outlook.com")
filings2 = get_filings(form='10-K', amendments=False,
                       filing_date="2024-03-01:2024-03-31")
filings2_df = filings2.to_pandas()
# Create a list to store the Item 1C text
item1c_texts = []
for n, filing in enumerate(filings2):
    url = filing.document.url
    cik = filing.cik
    # No trailing commas here: a trailing comma would wrap the value in a tuple
    filing_date = filing.header.filing_date
    reporting_date = filing.header.period_of_report
    comn = filing.company
    # Extract the text for Item 1C
    TenK = filing.obj()
    # Fall back to None when Item 1C cannot be extracted
    try:
        item1c_text = TenK['Item 1C']
    except Exception:
        item1c_text = None
    # Append the data to the list
    item1c_texts.append({
        'CIK': cik,
        'Filing Date': str(filing_date),
        'Item 1c Text': item1c_text,
        'url': url,
        'reporting_date': str(reporting_date),
        'comn': comn
    })
# Create a DataFrame from the Item 1C text data
item1c_df = pd.DataFrame(item1c_texts)
OK great. Thanks
Hi, I am trying to parse Item 1C from 10-K filings. I use a for loop to do so. However, I get an error message. Here is my code -
The error it shows -
Please note that if I use filing_date="2024-07-12:", it works fine, but I need to collect Item 1C for filings since December 15, 2023. Do you have any idea why this is happening? Thanks.
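One way to narrow down where a long date range breaks is to split it into month-sized windows and run each window separately. The sketch below only builds the range strings, in the same "start:end" form used for filing_date elsewhere in this thread. The `monthly_windows` helper is a hypothetical name, not part of edgartools, and the end date is chosen only for illustration.

```python
from datetime import date, timedelta


def monthly_windows(start, end):
    """Yield 'YYYY-MM-DD:YYYY-MM-DD' strings covering start..end, one per month."""
    current = start
    while current <= end:
        # First day of the month after `current`
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        # A window ends on the last day of the month, or at `end` if earlier
        window_end = min(next_month - timedelta(days=1), end)
        yield f"{current.isoformat()}:{window_end.isoformat()}"
        current = next_month


# Example range: December 15, 2023 through March 31, 2024
windows = list(monthly_windows(date(2023, 12, 15), date(2024, 3, 31)))
for window in windows:
    print(window)
```

Each string could then be passed as the filing_date argument in its own run, assuming the range format behaves as it does in the code above, so that a failure only affects one window and points to a much smaller set of filings.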