dgunning / edgartools

Navigate SEC Edgar data in Python
MIT License

When using for loop, it shows "cannot unpack non-iterable NoneType object" to parse Item 1C #70

Closed. msharifbd closed this issue 1 month ago

msharifbd commented 1 month ago

Hi, I am trying to parse Item 1C from 10-K filings using a for loop, but I get an error message. Here is my code:

import pandas as pd
from edgar import *

# Tell the SEC who you are
set_identity("Your Name youremail@outlook.com")
filings2 = get_filings(form='10-K', amendments=False, filing_date="2024-07-10:") 

# Create a list to store the Item 1c text
item1c_texts = []

# Iterate over each filing
for filing in filings2:
    url = filing.document.url
    cik = filing.cik
    # No trailing commas here: a trailing comma would wrap each value in a 1-tuple
    filing_date = filing.header.filing_date
    reporting_date = filing.header.period_of_report
    comn = filing.company

    # Extract the text for Item 1c
    TenK = filing.obj()
    item1c_text = TenK['Item 1C']

    item1c_texts.append({
        'CIK': cik,
        'Filing Date': str(filing_date),
        'Item 1c Text': item1c_text,
        'url': url,
        'reporting_date': str(reporting_date),
        'comn': comn
    })

# Create a DataFrame from the Item 1c text data
item1c_df = pd.DataFrame(item1c_texts)

It shows this error:

TypeError: cannot unpack non-iterable NoneType object
Cell In[707], line 9
      7 # Extract the text for Item 1c
      8 TenK = filing.obj()
----> 9 item1c_text = TenK['Item 1C']
     10 item1c_texts.append({
     11     'CIK': cik,
     12     'Filing Date': str(filing_date),
   (...)
     16     'comn': comn
     17 })

Please note that it works fine if I use filing_date="2024-07-12:", but I need to collect Item 1C for filings since December 15, 2023. Do you have any idea why this is happening? Thanks.

dgunning commented 1 month ago

If you are processing hundreds of filings, you will eventually hit failures and need to handle them in a try/except block. Do that and print the filing that failed.
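
A minimal sketch of that advice, not taken from edgartools itself: the helper name `collect_with_failures` and the stand-in `fake_extract` are hypothetical, and plain strings stand in for filings so the example runs without network access. In the loop from the original post, the `extract` callable would presumably be something like `lambda filing: filing.obj()['Item 1C']`.

```python
def collect_with_failures(filings, extract):
    """Apply `extract` to each filing, keeping successes and failures apart."""
    results, failures = [], []
    for index, filing in enumerate(filings):
        try:
            results.append((filing, extract(filing)))
        except Exception as exc:  # broad on purpose: parsing can fail in many ways
            # Print enough detail to identify the filing that failed
            print(f"Filing {index} failed: {filing!r} ({type(exc).__name__}: {exc})")
            failures.append((filing, exc))
    return results, failures


def fake_extract(filing):
    """Hypothetical stand-in for the real extraction step."""
    if filing == "bad-filing":
        raise TypeError("cannot unpack non-iterable NoneType object")
    return f"text of {filing}"


results, failures = collect_with_failures(["a", "bad-filing", "b"], fake_extract)
```

Keeping the failures in a list, rather than only printing them, makes it possible to inspect or retry the problem filings after the loop finishes.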

msharifbd commented 1 month ago

Thanks for the prompt reply. I think this is the URL that is breaking: https://www.sec.gov/Archives/edgar/data/1811999/000109690624001512/fmhs-20231231.htm

Do you know how to handle this issue? Should I add something to my code so that this filing is skipped? If so, what should I add?

dgunning commented 1 month ago

Add this to your code:

if 'Item 1C' not in TenK.items:
    continue

msharifbd commented 1 month ago

Hi, I included your code as follows:

# Create a list to store the Item 1c text
item1c_texts = []

# Iterate over each filing
for filing in filings2:
    url = filing.document.url
    cik = filing.cik
    # No trailing commas here: a trailing comma would wrap each value in a 1-tuple
    filing_date = filing.header.filing_date
    reporting_date = filing.header.period_of_report
    comn = filing.company

    # Extract the text for Item 1c
    TenK = filing.obj()
    if 'Item 1C' not in TenK.items:
        continue
    item1c_text = TenK['Item 1C']

    item1c_texts.append({
        'CIK': cik,
        'Filing Date': str(filing_date),
        'Item 1c Text': item1c_text,
        'url': url,
        'reporting_date': str(reporting_date),
        'comn': comn
    })

# Create a DataFrame from the Item 1c text data
item1c_df = pd.DataFrame(item1c_texts)

But it still shows the same error.

msharifbd commented 1 month ago

Can you share some code that lets me skip the CIKs or URLs where it breaks?
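
One way to skip known problem filings, sketched under assumptions: the names `KNOWN_BAD_URLS` and `skip_known_bad` are hypothetical, and dictionaries stand in for filing objects so the example is self-contained. With the loop from this thread, `get_url` would presumably be `lambda filing: filing.document.url`.

```python
# URL reported earlier in this thread as the one that breaks
KNOWN_BAD_URLS = {
    "https://www.sec.gov/Archives/edgar/data/1811999/000109690624001512/fmhs-20231231.htm",
}


def skip_known_bad(filings, get_url, bad_urls=KNOWN_BAD_URLS):
    """Yield only the filings whose URL is not in the skip list."""
    for filing in filings:
        if get_url(filing) in bad_urls:
            continue  # known to fail, so do not try to parse it
        yield filing


# Stand-in filings (plain dicts) for demonstration only
sample_filings = [
    {"cik": 1, "url": "https://example.com/ok-1.htm"},
    {"cik": 1811999, "url": next(iter(KNOWN_BAD_URLS))},
    {"cik": 2, "url": "https://example.com/ok-2.htm"},
]

kept = list(skip_known_bad(sample_filings, get_url=lambda f: f["url"]))
```

A skip list only covers filings already known to fail, so it works best alongside a try/except around the extraction step rather than in place of one.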

msharifbd commented 1 month ago

Hi, it finally worked. I got help from Stack Overflow. Thank you very much for your help and for your nice module. Here is the code that worked for me:

# pip install edgartools
import pandas as pd
from edgar import *

# Tell the SEC who you are
set_identity("My Name myemail@outlook.com")
filings2 = get_filings(form='10-K', amendments=False,
                       filing_date="2024-03-01:2024-03-31")

filings2_df = filings2.to_pandas()
# Create a list to store the Item 1c text
item1c_texts = []

for n, filing in enumerate(filings2):
    url = filing.document.url
    cik = filing.cik
    # No trailing commas here: a trailing comma would wrap each value in a 1-tuple
    filing_date = filing.header.filing_date
    reporting_date = filing.header.period_of_report
    comn = filing.company

    # Extract the text for Item 1c
    TenK = filing.obj()

    # Fall back to None when Item 1C is missing or cannot be parsed
    try:
        item1c_text = TenK['Item 1C']
    except Exception:
        item1c_text = None

    # Append the data to the list
    item1c_texts.append({
        'CIK': cik,
        'Filing Date': str(filing_date),
        'Item 1c Text': item1c_text,
        'url': url,
        'reporting_date': str(reporting_date),
        'comn': comn
    })

# Create a DataFrame from the Item 1c text data
item1c_df = pd.DataFrame(item1c_texts)

dgunning commented 1 month ago

OK great. Thanks