Open andrewsutjahjo opened 2 years ago
We need seed data before the scraper + diff-er can run any part of our pipeline.
This story takes BankTrack URLs, filenames, BankTrack's document data, and our internal metadata structure (#9)
and outputs a populated starting {data_structure} object/instance that other people can use.
SPIKE FOR THIS:
Depends on #9 to determine whether the structure is JSON, a flat file (Parquet?), CSV, a graph database, or a qbit stored archive.
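
To make the intended input and output more concrete, here is a minimal sketch of the seeding step. Everything in it is an assumption for illustration: the `SeedRecord` fields, the `build_seed` and `dump_seed` names, and the choice of JSON as the output are placeholders until #9 settles the real metadata structure and storage format.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SeedRecord:
    """Hypothetical seed entry; the real fields depend on #9."""
    url: str
    filename: str
    document_data: dict = field(default_factory=dict)


def build_seed(urls, filenames, document_data):
    """Pair each URL with its filename and any document data known for that URL.

    `document_data` is assumed to be a dict keyed by URL.
    """
    if len(urls) != len(filenames):
        raise ValueError("urls and filenames must have the same length")
    return [
        SeedRecord(url=u, filename=f, document_data=document_data.get(u, {}))
        for u, f in zip(urls, filenames)
    ]


def dump_seed(records):
    """Serialize the seed to JSON (a placeholder format, pending #9)."""
    return json.dumps([asdict(r) for r in records], indent=2, sort_keys=True)
```

Keeping the record type separate from the serializer means only `dump_seed` would need to change if #9 lands on Parquet, CSV, or a graph database instead.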