This PR introduces a new daemon, mindexd, responsible for collecting external data (from Powergates and the Filecoin chain) and building a miner's index. The UI for the miner's index is not defined yet, so this PR can be understood as the backend for that subsystem.
This PR stands on the shoulders of this Powergate PR. Reviewers don't need to understand those changes, since I tried to be verbose enough in the comments for this PR to be self-contained.
I describe big ideas in different sections, but I give more details in PR comments!
Since this is a new daemon, all this work is independent of the other daemons (hubd, userd, etc.). This daemon is not meant to have a publicly accessible API, so once the UI is defined, we will need to expose the required APIs through a public-facing daemon that calls the APIs of this one.
The miner's index is a mixture of:
On-chain data, such as relative power in the network and sector sizes.
Off-chain data, such as general and verified ask prices, minimum/maximum piece sizes, etc.
Storage and retrieval records from Powergate targets. These records have information about successful and failed deals/retrievals, with some extra information such as StartDataTransfer, EndDataTransfer, StartSealingTime, EndSealingTime, etc.
The main idea is to gather all this data and build a useful miner index with it.
Importing records from Powergates is done in two layers.
The first layer is a collector component, which asks registered Powergate instances to provide all storage/retrieval records created or modified since time X. This delta-style importing makes polling efficient, with a small bandwidth cost.
All imported records are merged into a single collection, giving a complete view of all deals and retrievals made by multiple Powergates. This is the source layer from which more meaningful metrics for the index are constructed. Also, since we always keep these raw records, we can keep experimenting with new metrics.
In summary, the collector maintains an up-to-date collection of all the deals and retrieval information from multiple Powergates. To quickly see what these records store, see this model.
On top of the previous one, the second layer is the indexer component, which will build more meaningful metrics. For example:
Get last successful transfer throughput for a miner: Filter the miner's records to the successful ones, take the most recent, and calculate Record.DataSize / (Record.DataTransferEnd - Record.DataTransferStart).
Get last deal failure for a miner: Get the most recent record for the miner with Failed = true, and take that record's date.
etc
Overall, this system allows importing external Powergate information and processing it to generate Textile-related metrics.
It is worth noting that every record imported from a Powergate instance is tagged with a Region field. Metrics can be created per region, since it wouldn't be fair to mix metrics from a Powergate in the USA and a Powergate in China when some miners are in China. Most probably, a Chinese miner will have better metrics for Chinese Powergates.
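One way to picture per-region metrics is to key every aggregate by region and miner, so that numbers from different regions are never mixed. The sketch below is an assumption about how that could look; only the `Region` and `Failed` fields come from the description, and `MinerID`, `Key`, `Tally`, and `TallyByRegion` are hypothetical names.

```go
package main

import "fmt"

// Record keeps only the fields this example needs.
type Record struct {
	Region  string
	MinerID string // assumed field name
	Failed  bool
}

// Key identifies one miner as seen from one region.
type Key struct {
	Region  string
	MinerID string
}

// Tally counts deals for one (region, miner) pair.
type Tally struct {
	Total  int
	Failed int
}

// TallyByRegion groups records so each region gets its own per-miner counts.
func TallyByRegion(recs []Record) map[Key]Tally {
	out := map[Key]Tally{}
	for _, r := range recs {
		k := Key{Region: r.Region, MinerID: r.MinerID}
		t := out[k] // zero Tally when the key is new
		t.Total++
		if r.Failed {
			t.Failed++
		}
		out[k] = t
	}
	return out
}

func main() {
	recs := []Record{
		{Region: "us", MinerID: "f01000", Failed: true},
		{Region: "us", MinerID: "f01000"},
		{Region: "cn", MinerID: "f01000"},
	}
	for k, t := range TallyByRegion(recs) {
		fmt.Printf("%s/%s: %d failed of %d\n", k.Region, k.MinerID, t.Failed, t.Total)
	}
}
```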
Regarding on-chain metrics, such as a miner's:
Ask price
Verified ask price
Sector size
Min and max piece size
etc
The indexer leverages the Powergate stack from the Hub and the newly created Indices APIs to fetch this information from Powergate's indices.
The full model (with on-chain and Textile data) can be found here. This is the result that the indexer keeps up to date.
With a specified (tunable) frequency, the indexer will recalculate values for the index to consider new data from on-chain Powergate indices or newly imported records from external Powergates.
It also takes a daily (could be tunable) snapshot of the index data and stores it in a separate history collection. Keeping the history out of the current index collection avoids bloating it, which could otherwise affect queries and indexes. Having a daily snapshot of the index values could allow building history/plotting/rate metrics. For example, we could consider plotting a miner's storage price over time. Without this history, we would only have the latest known price (the current index state).
Maybe it's a stretch, and we would never use this information. If that's the case, this feature could be turned off without problems.
This PR also includes an API that provides a basic calculator, useful for making deals with miners.
Easily explained by the gRPC definition:
So we're trying to translate real-user domain information into Filecoin values. These calculated values can be presented to the user nicely or leveraged to auto-create CLI commands for Lotus with the correct values (usually a hairy thing to do manually).
Ideally, I'd like to include potential real-time gas fee costs, but I couldn't fit that work into this PR. That would be a nice addition.