HumanSignal / label-studio

Label Studio is a multi-type data labeling and annotation tool with standardized output format
https://labelstud.io
Apache License 2.0

Support for XBRL and iXBRL #284

Open slavakurilyak opened 4 years ago

slavakurilyak commented 4 years ago

Is your feature request related to a problem? Please describe.
Label Studio does not support XBRL (eXtensible Business Reporting Language) or iXBRL (Inline XBRL) input formats.

Describe the solution you'd like
As a data labeler, I want to import XBRL and iXBRL files as tasks, so I can better analyze the U.S. Securities and Exchange Commission (SEC) data.

Describe alternatives you've considered
I assume templating for XBRL/iXBRL Document NER is similar to HTML Document NER.

Additional context
For Abbott Laboratories, here is the iXBRL document and equivalent HTML document.

makseq commented 4 years ago

@slavakurilyak Do you know of any converters from XBRL to HTML? Supporting XBRL directly is not trivial, so would it be OK to get HTML element positions and offsets as NER labeling results instead of XBRL positions?

slavakurilyak commented 4 years ago

@makseq I opened an issue with Pandoc here. I assume that treating an XBRL document as an HTML page is the easiest way to support this file format.
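
For what it's worth, here is a rough sketch of the "treat it as HTML" idea for the inline (iXBRL) case, since an iXBRL file is already an XHTML page with extra `ix:` elements wrapped around the tagged facts. It is not an existing Label Studio or Pandoc feature. The helper names `ixbrl_to_html` and `to_task` are made up for illustration, the sample markup is invented, and the code assumes the Inline XBRL namespace is bound to the prefix `ix` and that the labeling config reads the page from a `$text` variable. Regex-based tag stripping is a shortcut; a real converter would probably use a proper XML parser.

```python
import json
import re

# Assumption: the Inline XBRL namespace uses the conventional "ix" prefix.
# <ix:header> holds hidden metadata that a browser would not display.
_IX_HEADER = re.compile(r"<ix:header\b.*?</ix:header>", re.DOTALL | re.IGNORECASE)
# Any remaining opening, closing, or self-closing ix: tag.
_IX_TAG = re.compile(r"</?ix:[A-Za-z]+\b[^>]*>", re.IGNORECASE)


def ixbrl_to_html(ixbrl: str) -> str:
    """Drop the hidden header, then unwrap ix: tags but keep their text."""
    without_header = _IX_HEADER.sub("", ixbrl)
    return _IX_TAG.sub("", without_header)


def to_task(html: str) -> dict:
    """Wrap the HTML in a task dict; the "text" key is an assumed config variable."""
    return {"data": {"text": html}}


# Invented sample, not taken from a real filing.
sample = (
    '<html><body><ix:header><ix:hidden>meta</ix:hidden></ix:header>'
    '<p>Revenue: <ix:nonFraction name="us-gaap:Revenues" unitRef="USD">'
    '1,000</ix:nonFraction></p></body></html>'
)

html = ixbrl_to_html(sample)
print(html)
print(json.dumps([to_task(html)]))
```

The resulting JSON could then be imported as tasks for an HTML NER template, so labeling results would carry HTML positions and offsets rather than XBRL ones, as discussed above. Mapping those results back to the original XBRL facts is not covered here.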