Bootstrap starting dataset for scraper

We need seed data before the scraper and diff-er can run any part of our pipeline.

This story takes BankTrack URLs, filenames, BankTrack's document data, and our internal metadata structure (#9), and outputs a populated starting {data_structure} object/instance that other people can use.

SPIKE FOR THIS:

[ ] Look at PDFs housed on BankTrack's internal server
[ ] For each PDF, try to programmatically find the webpage on the issuing bank's website that links to it
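
A rough sketch of the second spike item, assuming we already have the HTML of a candidate page from the bank's site and that the PDF can be matched by file name alone (both are assumptions; the bank may rename or re-host files). The names `LinkCollector`, `pdf_filename`, and `find_pdf_links` are hypothetical, not existing project code, and fetching or crawling the pages is left out on purpose.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse, unquote
import posixpath


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def pdf_filename(url):
    """Return the lower-cased, URL-decoded file name at the end of a URL path."""
    return posixpath.basename(unquote(urlparse(url).path)).lower()


def find_pdf_links(page_url, html, target_filename):
    """Return absolute URLs linked from the page whose file name matches.

    Assumption: matching on file name is good enough for a first pass.
    """
    collector = LinkCollector()
    collector.feed(html)
    target = target_filename.lower()
    matches = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        if pdf_filename(absolute) == target:
            matches.append(absolute)
    return matches
```

If a page returns a non-empty list, that page URL is a candidate "source page" for the PDF and could be stored alongside the BankTrack record once the structure in #9 is settled.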

Depends on #9 to determine whether this is JSON, a flat file (Parquet?), CSV, a graph database, or a qbit stored archive.

CorrelAidxNL / BankTrack