As discussed with @PeterKraus and @unkcpz in today's office hours.
This just covers the "registered file types/extractors -> web API" pipeline. Many things are still missing:
[ ] Mechanism in CI/server for taking an extractor definition and trying to run it on example data that matches the input format
[ ] Rendering a nice UI
[ ] Storage in a more persistent database (might not even be needed for a long time -- currently just using an in-memory fake MongoDB that can be deployed serverless and will do very slow free-text search over our small number of entries)
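
As a rough illustration of the CI/server item above (running an extractor definition against example data), here is a minimal sketch of what such a check could look like. The definition layout, the `command` field, the `{input}` placeholder, and the function names are all hypothetical and not the schema used in this PR; the "extractor" is just a Python one-liner that uppercases a text file.

```python
import subprocess
import sys
from pathlib import Path

# Hypothetical, simplified extractor definition: the real schema may differ.
# "{input}" is a made-up placeholder for the path to the example file.
EXAMPLE_EXTRACTOR = {
    "id": "uppercase-text",
    "supported_filetypes": ["txt"],
    "command": [
        sys.executable,
        "-c",
        "import sys; print(open(sys.argv[1]).read().upper())",
        "{input}",
    ],
}


def run_extractor(extractor: dict, example_file: Path, timeout: float = 60.0):
    """Run the extractor's command on one example file and capture its output."""
    cmd = [part.replace("{input}", str(example_file)) for part in extractor["command"]]
    return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)


def extractor_works(extractor: dict, example_file: Path) -> bool:
    """Pass if the command exits cleanly and prints something."""
    result = run_extractor(extractor, example_file)
    return result.returncode == 0 and bool(result.stdout.strip())
```

A CI job could loop over all registered extractors and their example files and fail the build whenever `extractor_works` returns `False`. A real check would probably also need to install the extractor in an isolated environment first, which this sketch leaves out.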