I could start by aggregating:
The first tools to develop would cover how to refer to data sources, and how to maintain the code that extracts information from them. A sketch of what that could look like follows below.
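As a minimal sketch of what "referring to a data source" could mean, here is a hypothetical descriptor plus a registry keyed by source id; none of these names are an existing Solid Data Modules API, they are just an assumption about the shape this could take:

```typescript
// Hypothetical descriptor for one data source; not an existing
// Solid Data Modules interface, just an illustration.
interface DataSource {
  id: string;              // stable identifier, e.g. 'github-issues'
  kind: 'export' | 'api' | 'scrape';
  howToObtain: string;     // human notes on how the download is done
  lastFetched?: Date;      // when this source was last harvested
}

// A registry maps source ids to descriptors, so extraction code can
// refer to sources by id instead of hard-coded paths.
const sources = new Map<string, DataSource>();
sources.set('github-issues', {
  id: 'github-issues',
  kind: 'api',
  howToObtain: 'GitHub REST API, GET /repos/{owner}/{repo}/issues',
});
```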
So, for instance, I should make notes of how I download data exports, and then have tools that:
I don't need to dogfood GUIs, because I won't be needing them myself anyway; other people can do that. But I should dogfood Solid Data Modules!
I could start with bank statements, or with GitHub issues, for instance.
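For the GitHub issues case, one harvesting step could be a single call to GitHub's public REST API (the endpoint and `Accept` header are real; the owner/repo arguments in the usage line are placeholders):

```typescript
// Fetch the issues of one repository via GitHub's public REST API.
// Unauthenticated requests are rate-limited, so this is only for
// small-scale dogfooding.
async function fetchIssues(owner: string, repo: string): Promise<unknown[]> {
  const res = await fetch(
    `https://api.github.com/repos/${owner}/${repo}/issues?state=all`,
    { headers: { Accept: 'application/vnd.github+json' } },
  );
  if (!res.ok) {
    throw new Error(`GitHub API returned ${res.status}`);
  }
  return res.json();
}

// Usage (placeholder repo): const issues = await fetchIssues('owner', 'repo');
```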
Should I make a copy of all downloaded information? Yes, probably a good idea. See the five steps above: download, streamify, translate, cross-identify, forward.
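A sketch of those five steps as composable functions; the record shape and the step signatures are my assumptions, not a fixed design:

```typescript
// Hypothetical shape of the five-step pipeline: each step is a small,
// separately testable function, so the chain stays easy to rearrange.
interface DataRecord { sourceId: string; data: unknown }

async function download(sourceId: string): Promise<string> {
  // fetch or read the raw export for this source (stubbed here)
  return '';
}
function streamify(sourceId: string, raw: string): DataRecord[] {
  // split one export file into individual records
  return raw.split('\n').filter(Boolean).map((line) => ({ sourceId, data: line }));
}
function translate(records: DataRecord[]): DataRecord[] {
  // map source-specific fields onto a common vocabulary (identity stub)
  return records;
}
function crossIdentify(records: DataRecord[]): DataRecord[] {
  // link records that describe the same entity across sources (identity stub)
  return records;
}
async function forward(records: DataRecord[]): Promise<void> {
  // write the linked records into long-term storage (stubbed here)
}

async function harvest(sourceId: string): Promise<void> {
  const raw = await download(sourceId);
  await forward(crossIdentify(translate(streamify(sourceId, raw))));
}
```

Keeping a local copy of the raw download means the later steps can be re-run and improved without hitting the source again.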
It would be nice to see if I can stabilise my full personal data set: spend a full week only collecting, listing, and scraping data sources, both the payloads and the metadata about how I obtained them, into one root index.
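The root index could start as one JSON file listing every payload together with the metadata about how it was obtained; the field names here are made up for illustration:

```typescript
// Hypothetical layout of the root index: one entry per downloaded
// payload, recording both where the bytes live and how they were obtained.
interface IndexEntry {
  path: string;        // location of the stored payload, e.g. './payloads/2024-01-bank.csv'
  sourceId: string;    // which data source it came from
  obtainedAt: string;  // ISO timestamp of the download
  howObtained: string; // free-form notes: manual export, API call, scrape...
}

interface RootIndex {
  entries: IndexEntry[];
}
```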
I should distinguish between data I already have but whose source has dried up, and recurring data sources from which I should keep harvesting on a regular basis; I should make a list of the latter.
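That distinction could be a single field on each source's entry, so the list of sources to keep harvesting falls out of a filter (again, the names are hypothetical):

```typescript
// Hypothetical: tag each source as a one-off archive or a recurring feed.
interface HarvestSource {
  id: string;
  status: 'dried-up' | 'recurring';
  harvestIntervalDays?: number; // only meaningful for recurring sources
}

// The list of sources to keep harvesting is then just a filter.
function recurringSources(all: HarvestSource[]): HarvestSource[] {
  return all.filter((s) => s.status === 'recurring');
}
```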
I've started prototyping this several times, both in JS and in PHP:
Some of the most interesting snippets from that would be:
I am now in the luxurious position where I can start an unfunded software project for the coming years and make it as big as I want it to be. I think the best place to start is dogfooding with the grooming of my own data downloads: manual at first, then automated step by step, producing small reusable tools along the way.