kanedata / ixbrl-parse

A Python library for getting useful data out of ixbrl files.
https://ixbrl-parse.readthedocs.io/
MIT License

NotImplementedError: Format "numwordsen" not implemented (namespace "ixt-sec") #17

Closed (ajmarks closed 3 years ago)

ajmarks commented 3 years ago

Windows, Anaconda Python 3.8.5.

Code to reproduce:

import io
import requests
from ixbrlparse import IXBRL

url = 'https://www.sec.gov/Archives/edgar/data/72333/000007233320000195/jwn-20200801.htm'
r = requests.get(url)
xbrl = IXBRL(io.StringIO(r.text))

Stack trace:

Traceback (most recent call last):
  File "<input>", line 7, in <module>
  File "C:\Users\AMarks\Anaconda3\envs\datascience\lib\site-packages\ixbrlparse\core.py", line 16, in __init__
    self._get_numeric()
  File "C:\Users\AMarks\Anaconda3\envs\datascience\lib\site-packages\ixbrlparse\core.py", line 97, in _get_numeric
    ixbrlNumeric(element)
  File "C:\Users\AMarks\Anaconda3\envs\datascience\lib\site-packages\ixbrlparse\components\numeric.py", line 34, in __init__
    self.format = get_format(format_['format_'])(**format_)
  File "C:\Users\AMarks\Anaconda3\envs\datascience\lib\site-packages\ixbrlparse\components\transform.py", line 91, in get_format
    raise NotImplementedError(
NotImplementedError: Format "numwordsen" not implemented (namespace "ixt-sec")

drkane commented 3 years ago

Hi - thanks for the issue. This format wasn't implemented because I hadn't come across it before. I've now used a Python package to convert the number words to numbers, so if you upgrade to 1.1.3 it should work.

ajmarks commented 3 years ago

Thanks so much! But now I'm getting a new error on the same snippet:

{'text': 'no', 'context': <IXBRLContext i5a7c723bf81446bf9a00f2cc32d2a4e0_I20200801 [2020-08-01] (with segments)>, 'unit': 'iso4217:USD', 'unitRef': 'usd', 'contextRef': 'i5a7c723bf81446bf9a00f2cc32d2a4e0_I20200801', 'decimals': '-6', 'format': 'ixt-sec:numwordsen', 'name': 'us-gaap:LineOfCredit', 'scale': '6', 'id': 'id3VybDovL2RvY3MudjEvZG9jOjk3NmExNTY5YWVkNTRlMjY4ZTJmNWIzYmEzZmEzMjdhL3NlYzo5NzZhMTU2OWFlZDU0ZTI2OGUyZjViM2JhM2ZhMzI3YV80OS9mcmFnOmYyOWIyN2RlOTYwMTQxM2M4MjE4YTM4N2JlMTkxY2ZiL3RleHRyZWdpb246ZjI5YjI3ZGU5NjAxNDEzYzgyMThhMzg3YmUxOTFjZmJfMjgxNA_17e92a8b-43b5-49a4-b529-f8f3d6299ec9'}
Traceback (most recent call last):
  File "<input>", line 7, in <module>
  File "C:\Users\AMarks\Anaconda3\envs\datascience\lib\site-packages\ixbrlparse\core.py", line 16, in __init__
    self._get_numeric()
  File "C:\Users\AMarks\Anaconda3\envs\datascience\lib\site-packages\ixbrlparse\core.py", line 97, in _get_numeric
    ixbrlNumeric(element)
  File "C:\Users\AMarks\Anaconda3\envs\datascience\lib\site-packages\ixbrlparse\components\numeric.py", line 37, in __init__
    self.value = self.format.parse_value(self.text)
  File "C:\Users\AMarks\Anaconda3\envs\datascience\lib\site-packages\ixbrlparse\components\transform.py", line 68, in parse_value
    return w2n.word_to_num(value)
  File "C:\Users\AMarks\Anaconda3\envs\datascience\lib\site-packages\word2number\w2n.py", line 154, in word_to_num
    raise ValueError("No valid number words found! Please enter a valid number word (eg. two million twenty three thousand and forty nine)")
ValueError: No valid number words found! Please enter a valid number word (eg. two million twenty three thousand and forty nine)

The tag in question appears to be:

<ix:nonFraction unitRef="usd" contextRef="..." decimals="-6" format="ixt-sec:numwordsen" name="us-gaap:LineOfCredit" scale="6" id="id3VybD...">no</ix:nonFraction>

Based on the SEC's XBRL manual p. 5-34, it looks like "no" and "None" are officially blessed special cases for 0.
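
The special-casing could be sketched roughly like this. It is only an illustration under stated assumptions, not the library's actual code: `parse_numwordsen`, `ZERO_WORDS`, and `toy_word_to_num` are hypothetical names, and the toy converter merely stands in for a real words-to-number function such as the `w2n.word_to_num` call in the traceback above.

```python
from typing import Callable

# Texts treated as zero, per the special cases mentioned above.
ZERO_WORDS = {"no", "none"}


def toy_word_to_num(text: str) -> int:
    """Hypothetical stand-in for a real converter; it knows only a few words."""
    known = {"zero": 0, "one": 1, "two": 2, "three": 3}
    try:
        return known[text]
    except KeyError:
        # Mirror the failure mode seen in the traceback: unknown text is an error.
        raise ValueError(f"No valid number words found in {text!r}") from None


def parse_numwordsen(
    text: str, word_to_num: Callable[[str], int] = toy_word_to_num
) -> int:
    """Parse a numwordsen-style value, mapping "no" and "none" to 0."""
    normalized = text.strip().lower()  # makes "None", "NO", " no " equivalent
    if normalized in ZERO_WORDS:
        return 0
    # Anything else is delegated to the (pluggable) words-to-number converter.
    return word_to_num(normalized)


print(parse_numwordsen("no"))
print(parse_numwordsen("None"))
```

Checking the zero words before calling the converter means the converter never sees "no" at all, so the ValueError above is avoided without changing how ordinary number words are handled.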

drkane commented 3 years ago

Thanks for the PR @ajmarks - looks great. I've added some test cases too to check it works, and made sure that the text is lowercased before parsing. Let me know if there are any more edge cases it doesn't handle.