liketurbo closed this 2 years ago
I'm ready to implement this project myself in 6 weeks after the project is approved.
I think that for 90% of projects it's OK to go with the HTTP API approach; besides, you can run your own copy of toncenter if the public one is not enough for you.
The second option's major drawback is that you need to trust the Liteserver you're connecting to.
That's not true: if you use tonlib, it checks the Merkle proofs of responses from liteservers. However, some other ways of interacting with liteservers, such as ton-lite-client for Node.js, do not check proofs yet, and it would be better to put effort into fixing that.
The second option's major drawback is that you need to trust the Liteserver you're connecting to.
@Naltox By that I meant the case when the Liteserver goes down and your app depends on it.
...besides, you can run your own copy of toncenter if the public one is not enough for you
@Naltox Yes, you can, but that approach has some hardware requirements.
I think that for 90% of projects it's OK to go with the HTTP API approach...
I don't know how you got 90%, but for my recent project it was not enough and I always hit the rate limit.
I'd rather pay for AWS S3 and have a production-ready app than handle rate limits (and the issues that come with them) and explain to my client why the app has certain limitations.
I think the project definition is very unclear and assumes that we know what elements the NEAR Lake Indexer is indexing. We don't, and I don't think NEAR is close enough to TON for the comparison to be useful.
What I don’t understand is what sort of data is being indexed. Is it just taking existing data that TON full nodes are already indexing (like block headers) and making this available?
Or is it indexing new connections, for example which Jettons/NFTs are held by which account? This is a much more difficult task, since this data is not indexed "natively" by a full node and requires parsing all blocks during sync to create new indexes from scratch.
The former is easy; I think we have enough solutions in the ecosystem for this (toncenter api / tonhub v4 / raw adnl). The latter is harder, but then the proposal needs to specify exactly which new connections it plans to index. Would you be able to query all messages sent to a smart contract? Will you be able to query all messages sent by a specific address to a smart contract? Will you be able to filter message arguments? Will you be able to query all jettons a specific user holds? Will you be able to query all the holders of a specific jetton?
Each one of these queries is a totally different index. Will users be able to define their own queries, or are these indexes hard-coded? Some of these will be very expensive to hold; who will pay for that? Where will the index run? Do end users need to index the whole chain from scratch by themselves, or is there a service provider that runs the index for the benefit of everybody and allows people to use it?
Also, for the latter there are many solutions that already exist in the ecosystem, not necessarily all open source. TonApi has tons of indexes, Disintar has their own indexer, now open source with Kafka, TonHub has their own closed-source indexer, and getgems, I assume, are also indexing tons of extra stuff about NFTs.
@talkol
That's a lot of questions 😂
I think if you follow the links I provided you will find the answers to most of them, if not all. Even though it's a different ecosystem, the conceptual idea is the same.
And please let me know how you think we can improve the project definition. I'm open to suggestions.
I think sending readers to external links that each have a few dozen pages of documentation inside is a bit of a hassle. Can you give a TL;DR overview of how it works within the contents of this footstep (one pager)? It would help readers like myself.
@talkol Please check out my overview draft. If it looks good to a first reader such as yourself, I'll add it to the footstep definition.
What I'm trying to do is create a library that connects to the blockchain (a Liteserver) and stores the data in AWS S3, and a second library that connects to AWS S3 and fetches the data.
With TON Lake Indexer, we store raw data from the blockchain in AWS S3: blocks, transactions, messages, etc.
With TON Lake Framework, we fetch that raw data from AWS S3. And because you have access to the stream object, you can fetch the whole blockchain, parse it, and build some kind of explorer, or parse only the latest blocks and build some kind of subscriber on top of them.
What differentiates it from other similar projects like toncenter api / tonhub v4 / raw adnl is reliability, availability, and flexibility, because the API will consist of just one function, startStream, which returns a stream object, and you can fetch/parse/store the raw data however you want.
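To make the one-function idea concrete, here is a hypothetical sketch of that API. Only the name `startStream` comes from this proposal; the config fields, the `RawBlock` shape, and the stubbed behavior are assumptions for illustration, not a published package.

```typescript
// Hypothetical sketch of the proposed single-function API. The real
// implementation would list and read S3 objects; this stub just
// demonstrates the contract by feeding the handler two fake blocks.

interface RawBlock {
  seqno: number;
  transactions: { hash: string }[];
}

interface StreamConfig {
  s3BucketName: string; // bucket populated by the (proposed) TON Lake Indexer
  startSeqno: number;   // block seqno to start reading from
}

type BlockHandler = (block: RawBlock) => void | Promise<void>;

async function startStream(config: StreamConfig, handler: BlockHandler): Promise<void> {
  const fakeBlocks: RawBlock[] = [
    { seqno: config.startSeqno, transactions: [{ hash: "a1" }] },
    { seqno: config.startSeqno + 1, transactions: [{ hash: "b2" }, { hash: "c3" }] },
  ];
  for (const block of fakeBlocks) {
    await handler(block); // one callback per block, in chain order
  }
}

// Usage: a tiny "subscriber" that counts transactions as blocks arrive.
let txCount = 0;
await startStream({ s3BucketName: "ton-lake", startSeqno: 100 }, (block) => {
  txCount += block.transactions.length;
});
console.log(`saw ${txCount} transactions`); // saw 3 transactions
```

Everything a consumer does (explorer, subscriber, custom index) then lives inside the handler, which is what makes the surface so small.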
And with the Requester Pays option, the cost is fixed for the provider, and end users pay only for the data they use.
It also makes it possible for the TON Foundation to host TON Lake Indexer, because the cost will not increase with the number of users: the host only has to pay for data writes.
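For reference, this is what reading from a Requester Pays bucket looks like with the AWS SDK for JavaScript v3. `RequestPayer: "requester"` is the real S3 request parameter; the bucket name and the key layout are assumptions made up for this sketch.

```typescript
// `RequestPayer: "requester"` bills the request and the data transfer to
// the caller's AWS account instead of the bucket owner's.

// Hypothetical key layout: zero-padding the seqno keeps S3 listings in block order.
function blockKey(seqno: number): string {
  return `${seqno.toString().padStart(12, "0")}/block.json`;
}

async function fetchBlock(bucket: string, seqno: number): Promise<string> {
  // Loaded lazily so the pure helper above works without the SDK installed.
  // @ts-ignore: optional dependency, resolved at runtime only
  const { S3Client, GetObjectCommand } = await import("@aws-sdk/client-s3");
  const s3 = new S3Client({});
  const res = await s3.send(
    new GetObjectCommand({
      Bucket: bucket,             // hypothetical bucket hosted by the provider
      Key: blockKey(seqno),
      RequestPayer: "requester",  // the download is billed to the requester
    })
  );
  return await res.Body.transformToString();
}
```

This is why the provider's bill stays flat: reads by end users never touch the provider's account, only writes do.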
Well, I guess this footstep is not as relevant as I thought it would be 🤷♂️ Closing it for now
Summary
A library that would be essential for TON projects that need a production-ready solution for indexing the TON blockchain.
Context
When it comes to working with data from the blockchain, and you need some way to store, parse, or subscribe to blocks, transactions, and messages, your choices are pretty limited.
Here are the ways (at least the ones that I found):
The first option is not really an option for most developers, because you need to run a full node, which you have to maintain yourself, and the infrastructure costs are not cheap.
The second option's major drawback is that you need to trust the Liteserver you're connecting to. And the service provider needs to scale their infrastructure to support all the users.
And the same goes for the third option: you need to trust the service you're using, and the service provider needs to scale their infrastructure to support all the users.
That kind of problem was faced by the Pagoda team in the NEAR ecosystem.
They came up with a solution that involves AWS S3 and consists of a pair of libraries: NEAR Lake Indexer and NEAR Lake Framework.
NEAR Lake Indexer is a library that allows you to connect to the blockchain and store the data in AWS S3.
NEAR Lake Framework is a companion library to NEAR Lake Indexer that allows you to connect to AWS S3 and parse the data.
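For a concrete feel of the NEAR pair, here is roughly what consuming NEAR Lake data looks like with the `near-lake-framework` npm package. The config fields follow that package's documented shape; the start height is an arbitrary example value, and actually running it requires the package installed plus AWS credentials, since the public bucket is Requester Pays.

```typescript
// Sketch based on the near-lake-framework (JS) docs: a single entry point,
// startStream(config, handler), reading from the public Requester Pays
// bucket maintained by Pagoda.

const lakeConfig = {
  s3BucketName: "near-lake-data-mainnet", // public bucket maintained by Pagoda
  s3RegionName: "eu-central-1",
  startBlockHeight: 66264389,             // arbitrary example height
};

async function main(): Promise<void> {
  // Loaded lazily; needs `npm install near-lake-framework` and AWS credentials.
  // @ts-ignore: optional dependency, resolved at runtime only
  const { startStream } = await import("near-lake-framework");
  await startStream(lakeConfig, async (streamerMessage: any) => {
    // One callback per block; the shards carry the chunks/receipts to index.
    console.log(
      `Block #${streamerMessage.block.header.height}: ${streamerMessage.shards.length} shard(s)`
    );
  });
}
```

The proposed TON Lake Framework would mirror this shape: one entry point, a handler, and all indexing logic living in user code.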
What got me interested in this solution is that availability is handled by AWS S3, and with the Requester Pays option the cost is fixed for the provider, while end users pay for the data they use themselves.
Goals
Deliverables
Definition of Done
- [ ] TON Lake Indexer, which meets the goal requirements and is open source
- [ ] TON Lake Framework (Rust version), which meets the goal requirements, is open source, and has a crate available on crates.io (the Rust package registry)
- [ ] TON Lake Framework (JS/TS version), which meets the goal requirements, is open source, and has a package available on npm
Reward