It's not entirely clear whether this would be valuable or is needed, but it's an idea I've had for a while and wanted to get it written down so that I could stop thinking about it...
What is wrong
Right now, all of the data that is part of Portal tends to be sourced from other places. It would be really convenient to have a piece of infrastructure that gave us the equivalent of a Portal archive node: something able to house all of the data that belongs in the Portal network, in the format that we use and with the access patterns that we need.
How can this be fixed?
My proposal is that we build a web application that acts as a data warehouse. Data would be accessed by either content-key or content-id, probably via an HTTP/REST API.
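To make the two access patterns concrete, here is a rough sketch of how request paths might be mapped to a lookup by content-key or content-id. The URL layout (`/content/key/<hex>` and `/content/id/<hex>`) and the function name `parse_lookup_path` are made up for illustration; nothing about the actual API shape has been decided.

```python
from typing import Optional, Tuple


def parse_lookup_path(path: str) -> Optional[Tuple[str, bytes]]:
    """Parse a hypothetical lookup path into (kind, raw_bytes).

    Accepts '/content/key/<hex>' or '/content/id/<hex>', with an optional
    '0x' prefix on the hex value. Returns None for anything else.
    """
    parts = path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "content" or parts[1] not in ("key", "id"):
        return None
    hex_value = parts[2]
    if hex_value.startswith("0x"):
        hex_value = hex_value[2:]
    try:
        return parts[1], bytes.fromhex(hex_value)
    except ValueError:
        # Not valid hex (odd length or non-hex characters).
        return None
```

A real handler would take the returned kind and bytes and run the matching database query, returning the stored payload or a 404.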
History network is going to be the simplest starting point. The application should probably have a relational model of all of the blocks by both number and hash (similar to how Glados holds this data), and that should be tied relationally to the associated content-key and content-id entries, with the actual data payloads being stored as binary blobs.
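As a minimal sketch of what that relational model could look like, here is a version using SQLite from the Python standard library. The table and column names are hypothetical, and this is not how Glados actually lays out its schema. It also assumes the content-id is the SHA-256 hash of the content-key, which is my understanding of how the History network derives it; adjust if that is wrong for a given network.

```python
import hashlib
import sqlite3

# Hypothetical schema: one row per block, one row per piece of content.
SCHEMA = """
CREATE TABLE block (
    number INTEGER PRIMARY KEY,
    hash   BLOB NOT NULL UNIQUE
);
CREATE TABLE content (
    content_id  BLOB PRIMARY KEY,
    content_key BLOB NOT NULL UNIQUE,
    block_hash  BLOB NOT NULL REFERENCES block(hash),
    payload     BLOB NOT NULL
);
"""


def open_warehouse(path: str = ":memory:") -> sqlite3.Connection:
    """Open a database and create the tables (expects a fresh database)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def content_id_for(content_key: bytes) -> bytes:
    # Assumption: content-id = sha256(content-key).
    return hashlib.sha256(content_key).digest()


def store(conn, number: int, block_hash: bytes, content_key: bytes, payload: bytes) -> None:
    """Record the block (if new) and the content tied to it."""
    conn.execute(
        "INSERT OR IGNORE INTO block (number, hash) VALUES (?, ?)",
        (number, block_hash),
    )
    conn.execute(
        "INSERT OR REPLACE INTO content (content_id, content_key, block_hash, payload) "
        "VALUES (?, ?, ?, ?)",
        (content_id_for(content_key), content_key, block_hash, payload),
    )
    conn.commit()


def get_by_key(conn, content_key: bytes):
    row = conn.execute(
        "SELECT payload FROM content WHERE content_key = ?", (content_key,)
    ).fetchone()
    return row[0] if row else None


def get_by_id(conn, content_id: bytes):
    row = conn.execute(
        "SELECT payload FROM content WHERE content_id = ?", (content_id,)
    ).fetchone()
    return row[0] if row else None
```

Several content entries (header, body, receipts) can point at the same block row, so the block table stays small while the blobs live in the content table.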
The application should have some simple tooling for populating the database from a reliable data source (like era1 files).