EPIC: Indexer API - @taquito/idx

jevonearth commented 4 years ago

Overview

Taquito will provide an indexer-agnostic library for convenient access to indexed Tezos data. Forming a vertically integrated stack is a Taquito anti-goal.

Problem Statement

Presently, Taquito does not provide any client libraries for indexed data. Use cases include historical operations, balance changes, smart contract indexing, etc.
There are at least three dominant Tezos Indexers in the Tezos space, each with its own API specification. Users need to choose one over the others, and that choice leads to tight coupling between the user's product and their chosen Indexer. Taquito's indexer-agnostic library can give Taquito users the ability to switch between different indexers seamlessly and with minimal development effort.

Why are Indexers needed?

A Blockchain is a linked list of blocks that contain operations. This data structure is optimized for its purpose as a blockchain, a distributed ledger of operations with strong correctness guarantees.

A linked list of blocks containing operations does not lend itself to querying & reporting of data in the blockchain.

A Blockchain indexer reads data from the blockchain, starting at block zero and progressing to the latest "head" block. At each block, the Indexer extracts the block's "operations," transforms them into a new data structure, and writes that data to a data store (typically an RDBMS or a column-oriented DBMS). Blockchain Indexers typically provide a web API from which users can query transaction data from the blockchain efficiently.
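
To make the extract, transform, and store loop concrete, here is a minimal TypeScript sketch. The `Block` and `Operation` shapes, the placeholder addresses, and the in-memory `Map` standing in for the data store are simplifications invented for illustration; they are not the schema of any real Tezos indexer.

```typescript
// Hypothetical, heavily simplified shapes; real Tezos blocks carry far more data.
interface Operation {
  hash: string;
  kind: string;
  source: string;
  destination?: string;
}

interface Block {
  level: number;
  operations: Operation[];
}

// Stand-in for the data store: operations keyed by every address they involve.
type AddressIndex = Map<string, Operation[]>;

function indexChain(blocks: Block[]): AddressIndex {
  const index: AddressIndex = new Map();

  const add = (address: string, op: Operation): void => {
    const existing = index.get(address);
    if (existing) {
      existing.push(op);
    } else {
      index.set(address, [op]);
    }
  };

  // Walk the chain from block zero towards the head block.
  const ordered = blocks.slice().sort((a, b) => a.level - b.level);
  for (const block of ordered) {
    for (const op of block.operations) {
      add(op.source, op);
      // Avoid storing a self-transfer twice under the same address.
      if (op.destination !== undefined && op.destination !== op.source) {
        add(op.destination, op);
      }
    }
  }
  return index;
}

// Placeholder chain with made-up hashes and addresses.
const exampleChain: Block[] = [
  { level: 0, operations: [] },
  { level: 1, operations: [{ hash: "op1", kind: "transaction", source: "tz1Alice", destination: "tz1Bob" }] },
  { level: 2, operations: [{ hash: "op2", kind: "transaction", source: "tz1Bob", destination: "tz1Alice" }] },
];

const exampleIndex = indexChain(exampleChain);
```

Once the index is built, "all operations involving tz1Alice" becomes a single lookup (`exampleIndex.get("tz1Alice")`) instead of a scan over every block, which is the point of running an indexer.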

The Tezos indexers each offer their own APIs, and each Indexer offers both common features and features unique to that Indexer.

Why does the Tezos Node not provide basic indexing features?

Putting indexing features in the Tezos node would bloat the Tezos Node and is not an attractive proposition.

Prominent Tezos Indexers

Links to the API docs of existing Tezos indexers:
- https://tzstats.com/docs/api/index.html
- https://api.tzkt.io/
- https://github.com/Cryptonomic/Conseil/wiki/REST-API
- https://tezgraph.com/ / https://gitlab.com/nomadic-labs/tezos-indexer

"TZScan" was the first Tezos Indexer and had considerable market share as well as many dependent projects. The authors of the TZScan Indexer decided to leave the ecosystem. In doing so, they shut down their Indexer, leaving dependent projects scrambling to find new sources of indexed data.

Nomadic's "tezos-indexer" is an ETL daemon that indexes the Tezos blockchain into a PostgreSQL database. There's an effort named TezGraph to build a GraphQL API on top of this database.

Proposed Solution

Offer a new Taquito package named @taquito/idx that covers the most common requirements for indexed data. The idx package is an interface with individual providers for each indexer API, similar to the Taquito Wallet API (Beacon, TezBridge) or the Signer API (InMemory, Remote, TezBridge).

A user of Taquito can use a single query API and choose the backend indexer API that best suits their needs and that they are most comfortable with.
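
As a rough illustration of the "one query API, swappable backend" idea, the sketch below shows one possible shape. Every name in it (`IndexerProvider`, `Indexer`, `setProvider`, `FakeProviderA`, `FakeProviderB`) is hypothetical, since this issue has not yet defined the interface, and the two providers return canned data instead of calling a real indexer.

```typescript
// All names are hypothetical; the @taquito/idx interface is not yet defined.
interface IndexedOperation {
  hash: string;
  kind: string;
  source: string;
}

// The contract every indexer-specific provider would have to satisfy.
interface IndexerProvider {
  getOperation(hash: string): Promise<IndexedOperation>;
}

// The single query API that application code talks to.
class Indexer {
  constructor(private provider: IndexerProvider) {}

  // Swap the backend without touching any calling code.
  setProvider(provider: IndexerProvider): void {
    this.provider = provider;
  }

  getOperation(hash: string): Promise<IndexedOperation> {
    return this.provider.getOperation(hash);
  }
}

// Stand-in providers. A real one would call its indexer's HTTP API and
// map the response into IndexedOperation.
class FakeProviderA implements IndexerProvider {
  getOperation(hash: string): Promise<IndexedOperation> {
    return Promise.resolve({ hash, kind: "transaction", source: "tz1FromProviderA" });
  }
}

class FakeProviderB implements IndexerProvider {
  getOperation(hash: string): Promise<IndexedOperation> {
    return Promise.resolve({ hash, kind: "transaction", source: "tz1FromProviderB" });
  }
}

const exampleIdx = new Indexer(new FakeProviderA());
exampleIdx.getOperation("opExampleHash").then((op) => console.log(op.kind));
```

Application code depends only on `Indexer` and `IndexedOperation`, so moving to another backend is a one-line change at the point where the provider is constructed.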

IDX Diagrams - getOperation

Exposed Data

NTD: Incomplete, needs further definition and validation against each Indexer

Query for operation using operation hash
Query for operations by address
- Filter results by kind
- Filter results by sent/received/other
Query Contract bigmap for keys
Query Contract bigmap for values
- Filter by key[s]
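
The "operations by address" filters could be modelled roughly as below. This is only a sketch: the type names are invented, and the reading of sent/received/other (sent when the address is the source, received when it is the destination, other for anything else) is an assumption that still needs validating against each Indexer.

```typescript
// Hypothetical shapes for illustration only.
interface AddressOperation {
  hash: string;
  kind: string;
  source: string;
  destination?: string;
}

type Direction = "sent" | "received" | "other";

interface OperationFilter {
  kinds?: string[];
  directions?: Direction[];
}

// Assumed semantics: a self-transfer counts as "sent" because the
// source check runs first.
function directionOf(op: AddressOperation, address: string): Direction {
  if (op.source === address) return "sent";
  if (op.destination === address) return "received";
  return "other";
}

// An omitted filter field means "do not filter on this dimension".
function filterOperations(
  ops: AddressOperation[],
  address: string,
  filter: OperationFilter = {}
): AddressOperation[] {
  return ops.filter((op) => {
    const kindOk = !filter.kinds || filter.kinds.indexOf(op.kind) !== -1;
    const directionOk =
      !filter.directions || filter.directions.indexOf(directionOf(op, address)) !== -1;
    return kindOk && directionOk;
  });
}

// Placeholder data with made-up hashes and addresses.
const sampleOps: AddressOperation[] = [
  { hash: "o1", kind: "transaction", source: "tz1Alice", destination: "tz1Bob" },
  { hash: "o2", kind: "transaction", source: "tz1Bob", destination: "tz1Alice" },
  { hash: "o3", kind: "delegation", source: "tz1Alice" },
];

const receivedTransactions = filterOperations(sampleOps, "tz1Alice", {
  kinds: ["transaction"],
  directions: ["received"],
});
```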

Implementation Considerations

Should Taquito house each indexer provider within the Taquito git repository?

No. Indexer APIs can change rapidly, and that change is outside the control of the Taquito maintainers. We plan to host indexer plugins that we participate in maintaining as separate git repositories. This will decouple the release cycles of Taquito from the release cycles of Indexer API vendors.

How to handle unsupported data

If an upstream indexer does not support a specific data set, an E_NOT_IMPLEMENTED error will be returned to the user. Big Maps is an area where this scenario may arise, as not all indexers support big maps yet.
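
One way the E_NOT_IMPLEMENTED behaviour might look in a provider is sketched below. Only the error code comes from this proposal; `NotImplementedError`, `isNotImplemented`, `BigMapProvider`, and `NoBigMapProvider` are hypothetical names chosen for illustration.

```typescript
// Only the code string "E_NOT_IMPLEMENTED" comes from the proposal;
// every other name here is hypothetical.
class NotImplementedError extends Error {
  readonly code = "E_NOT_IMPLEMENTED";

  constructor(provider: string, feature: string) {
    super(`${provider} does not support ${feature}`);
    this.name = "NotImplementedError";
  }
}

// Check the code rather than using instanceof, so the guard also works
// when the error crosses package boundaries.
function isNotImplemented(err: unknown): err is NotImplementedError {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "E_NOT_IMPLEMENTED"
  );
}

interface BigMapProvider {
  getBigMapKeys(contract: string): Promise<string[]>;
}

// A provider for an imaginary indexer that has no big map support yet.
class NoBigMapProvider implements BigMapProvider {
  getBigMapKeys(_contract: string): Promise<string[]> {
    return Promise.reject(new NotImplementedError("NoBigMapProvider", "big map keys"));
  }
}

// Callers can detect the gap and fall back instead of crashing.
new NoBigMapProvider().getBigMapKeys("KT1ExampleContract").catch((err: unknown) => {
  if (isNotImplemented(err)) {
    console.log(err.message);
  }
});
```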

Interface Design

ToDo:

Define types for operations that cover as much of the data as all the indexers commonly support. (Commonly in the sense that the data properties are common to all indexers and not unique to one)
- See TezGraph's operations: https://tezgraph.com/architecture/2_information-model
Make a mapping table of data naming from each Indexer to our proposed naming
Identify any problematic type transformations
Investigate number encoding from each Indexer
Create a test harness and target specific operations/blocks for retrieval. Retrieve the same operation from all indexers, compare the results locally, and look for differences
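
The mapping, number-encoding, and comparison items above could be prototyped along these lines. The raw payload shapes (`RawA`, `RawB`) and their field names are invented for the sketch and do not reflect any real indexer's response; the choice of decimal strings for amounts is likewise an assumption.

```typescript
// Common shape. Amounts are kept as decimal strings so values above
// 2^53 - 1 are not silently rounded.
interface CommonTransaction {
  hash: string;
  sender: string;
  amount: string;
}

// Hypothetical raw payloads; field names are invented for illustration.
interface RawA {
  op_hash: string;
  sender: string;
  volume: number | string;
}

interface RawB {
  hash: string;
  source: { address: string };
  amount: string;
}

const MAX_SAFE_INT = Math.pow(2, 53) - 1;

// Accept either encoding and return a canonical decimal string.
function normaliseAmount(value: number | string): string {
  if (typeof value === "number") {
    if (value < 0 || value > MAX_SAFE_INT || Math.floor(value) !== value) {
      throw new Error(`Unsafe numeric amount: ${value}`);
    }
    return String(value);
  }
  if (!/^\d+$/.test(value)) {
    throw new Error(`Not a non-negative integer: ${value}`);
  }
  // Strip leading zeros but keep a lone "0".
  return value.replace(/^0+(?=\d)/, "");
}

// One row of the "mapping table" per indexer.
function fromA(raw: RawA): CommonTransaction {
  return { hash: raw.op_hash, sender: raw.sender, amount: normaliseAmount(raw.volume) };
}

function fromB(raw: RawB): CommonTransaction {
  return { hash: raw.hash, sender: raw.source.address, amount: normaliseAmount(raw.amount) };
}

// Names of the fields on which two normalised results disagree.
function diffFields(a: CommonTransaction, b: CommonTransaction): string[] {
  const keys: (keyof CommonTransaction)[] = ["hash", "sender", "amount"];
  return keys.filter((key) => a[key] !== b[key]);
}

const differences = diffFields(
  fromA({ op_hash: "opX", sender: "tz1Alice", volume: 1500000 }),
  fromB({ hash: "opX", source: { address: "tz1Alice" }, amount: "1500000" })
);
```

An empty `differences` array means the two sources agree on the common fields; a harness would run this over a fixed set of known operations.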

Provider Implementation Considerations

A provider should be instantiated and injected into the @taquito/idx instance. At instantiation, the provider instance may require custom configuration, an API key or a correlation ID for distributed tracing considerations.
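
A provider's configuration might be captured as in the following sketch. The `ProviderConfig` fields, the `ExampleProvider` class, the example URL, and the header names (`Authorization`, `X-Correlation-ID`) are all assumptions made for illustration; each real indexer will dictate its own authentication and tracing conventions.

```typescript
// Hypothetical configuration shape; real providers may need other options.
interface ProviderConfig {
  baseUrl: string;
  apiKey?: string;
  correlationId?: string;
}

class ExampleProvider {
  constructor(private readonly config: ProviderConfig) {}

  // Header names are assumptions, not taken from any indexer's documentation.
  requestHeaders(): Record<string, string> {
    const headers: Record<string, string> = { Accept: "application/json" };
    if (this.config.apiKey) {
      headers["Authorization"] = `Bearer ${this.config.apiKey}`;
    }
    if (this.config.correlationId) {
      headers["X-Correlation-ID"] = this.config.correlationId;
    }
    return headers;
  }

  // Join base URL and path with exactly one slash between them.
  urlFor(path: string): string {
    return this.config.baseUrl.replace(/\/+$/, "") + "/" + path.replace(/^\/+/, "");
  }
}

const exampleProvider = new ExampleProvider({
  baseUrl: "https://indexer.example.invalid/v1/",
  apiKey: "placeholder-key",
  correlationId: "req-1234",
});
```

Keeping this configuration inside the provider means the @taquito/idx instance itself stays free of vendor-specific settings.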

Testing

Ongoing testing

Taquito plans to set up a continuous testing harness that runs the indexer queries against all supported Indexer providers. This level of testing aims to discover & detect regressions in the client libraries and the upstream Indexer APIs early.

NTD: Discuss the possibility of collaboration and alerting with indexer vendors. Discuss the labour commitment in maintaining this idea.

Indexer Standardization / Implementation opportunities

By implementing a common, albeit narrow, API for multiple vendors, we will be challenged to verify that each Indexer provides consistent data. We also face risk in data transcoding, where the shape or representation of data from an indexer needs to be coerced into the common data type returned from the query API.

This challenge offers a research and testing opportunity to check that the various indexers represent data faithfully to what the blockchain itself records.

Rejected Ideas

Here we summarise alternative approaches that we have abandoned or rejected.

Choose one Indexer and ship a client for that Indexer

Coupling to one Indexer would be technically more straightforward and expose all data from that source. However, we believe it is a worthy goal not to be too tightly coupled to upstream projects. Taquito works with the Tezos blockchain from the "bottom-up," and when it can't, as in this case, it must rely on indexers. So we are choosing to offer a narrower set of data while maintaining freedom of choice for Taquito users.

oleiba commented 4 years ago

Great! Just to be a bit more specific, ideally we would want all the relevant fields as we get from an operation when we send a transaction, i.e.:

hash
fee
value
data
gasLimit, gasPrice
source
destination
counter
blockheight, blockhash.

We would want the history to include both unconfirmed and confirmed transactions (distinguished by the unconfirmed txs having default block values). Lastly, this object can be reused also for streaming events for new transactions (https://github.com/ecadlabs/taquito/issues/186).

Innkst commented 3 years ago

Link to feature: https://ecadlabs.productboard.com/feature-board/planning/features/6651934

Innkst commented 3 years ago

Spike on getting an Operation: #705
Spike on getting a Bigmap: #706

zamrokk commented 2 years ago

will it be possible to get all contracts from a given address ?

ex : https://hangzhou2net.tzkt.io/tz1VApBuWHuaTfDHtKzU3NBtWFYsxJvvWhYk/contracts

jevonearth commented 2 years ago

Hi @zamrokk,

From the link, I understand you want to get a list of contract originations that came from a particular account. If I'm mistaken, please do let me know. We will add this to our feature list, and see what we can do for this.

Thank you!

zamrokk commented 2 years ago

Also, they have another API that scans for contracts that have similar code. It is useful when you deploy several contracts from the same template and want to list them all.

BearCooder commented 1 year ago

Hi @jevonearth can you tell what the status of this is? Thanks!

zamrokk commented 1 year ago

I mainly use the TzKT lib
