catalyst-cooperative / mozilla-sec-eia

Exploratory development for SEC to EIA linkage

MIT License

Log metrics for generic exhibit 21/10k basic info extraction #33

Open zschira opened 1 month ago

zschira commented 1 month ago

Overview

#34 outlines computing and logging metrics for Exhibit 21 extraction on the labelled validation set. We also want to track how table extraction performs on generic filings, for which we don't have labels. This will be a minimal set of metrics that checks some basic success criteria.

Success Criteria

The following metrics are logged to MLflow:

Metrics

[ ] % of filings captured in the SEC database
[ ] % of docs with Ex. 21 that we can create a non-blank Ex. 21 PDF for
[ ] % of filers extracted by the "basic info extractor"
[ ] % of docs with Ex. 21 that are captured in the output ownership table
[ ] Null value percentage in the columns of the SEC basic information
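
A rough sketch of how these metrics might be computed before logging, assuming the pipeline can supply simple counts and the basic-info rows as dictionaries. The metric names, the keys of `counts`, and the column names are all hypothetical placeholders, not names from the repo:

```python
from typing import Dict, List, Mapping, Optional, Sequence


def pct(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage, or 0.0 if the denominator is 0."""
    return 100.0 * numerator / denominator if denominator else 0.0


def null_percentages(
    rows: Sequence[Mapping[str, Optional[object]]], columns: Sequence[str]
) -> Dict[str, float]:
    """Percentage of missing (None or absent) values per column."""
    return {
        f"null_pct_{col}": pct(sum(row.get(col) is None for row in rows), len(rows))
        for col in columns
    }


def compute_metrics(
    counts: Mapping[str, int],
    basic_info_rows: Sequence[Mapping[str, Optional[object]]],
    columns: Sequence[str],
) -> Dict[str, float]:
    """Build one flat dict of metrics covering the checklist above."""
    metrics = {
        "pct_filings_in_db": pct(counts["filings_in_db"], counts["filings_total"]),
        "pct_ex21_nonblank_pdf": pct(counts["ex21_nonblank_pdf"], counts["docs_with_ex21"]),
        "pct_filers_basic_info": pct(counts["filers_extracted"], counts["filers_total"]),
        "pct_ex21_in_ownership_table": pct(counts["ex21_in_ownership"], counts["docs_with_ex21"]),
    }
    metrics.update(null_percentages(basic_info_rows, columns))
    return metrics


# Made-up numbers, purely for illustration.
example_counts = {
    "filings_total": 100,
    "filings_in_db": 90,
    "docs_with_ex21": 50,
    "ex21_nonblank_pdf": 40,
    "ex21_in_ownership": 45,
    "filers_total": 20,
    "filers_extracted": 19,
}
example_rows: List[Dict[str, Optional[str]]] = [
    {"company_name": "A", "state_of_incorporation": None},
    {"company_name": None, "state_of_incorporation": None},
]
example_metrics = compute_metrics(
    example_counts, example_rows, ["company_name", "state_of_incorporation"]
)
```

Because the result is a flat `dict` of floats, it could be passed straight to `mlflow.log_metrics(example_metrics)` inside an active run.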

katie-lamb commented 3 days ago

@zschira does #34 make this issue obsolete?