Closed kantai closed 6 years ago
Very excited about this!! A couple of notes and thoughts:
This seems pretty self-explanatory, basically just the blockstack.js repo. That could be used for the browser, cli, and any custom implementations devs want. The only question I have is which other piece does this connect to? Just to a blockstackd node, right?
Very excited that we will focus on this as the key part of the stack and dogfooding the heck out of it. This is going to be a huge win for devs who want to do their own on-boarding and other advanced ops.
blockstackd
Also pretty simple. I'm assuming we would be keeping the JSON over RPC interface and just documenting it and creating clients for it.
blockstack-api and blockstack-resolver
A couple of thoughts here:
blockstackd as that will be the bottleneck in performance.
There is a version of this API that is stateless as well, where every time a call comes in it makes appropriate calls to blockstackd and then resolves the profile. This would be much lighter weight and not require any storage, but responses would be much slower. I could see use cases for each of these (stateless install for an individual doing dev or supporting his own view of the network, stateful install for applications like blockstack-explorer).
I have the resolver portion (need to change the name back, that work is in the indexer folder) implemented in a highly parallel, performant manner in go. I understand the desire to keep everything in one language, but considering we already have a 70% complete implementation in another it might be worth at least another set of eyes on the work I've put in: https://github.com/blockstack/go-blockstack/
gaia/hub
Excited for more work on this component!! Love these ideas here.
Doing this as comments is annoying. Why not just have a Github wiki for this sort of thing? It gets us the edit history while giving us lower friction than sending PRs. Replying here makes us have to go find the file we want to comment on, switch back and forth between browser tabs, and copy/paste the section we want to comment on in order to provide context.
That said:
gaia/hub
Con of this design: providing local backups for users
I think there's a straightforward solution: make Gaia hubs composable. A Gaia hub can route reads and writes to other Gaia hubs. Then, the user can run a Gaia hub locally and have it replicate to both local disk and to an upstream Gaia hub. The upstream Gaia hub can, in turn, have its own replication policy (example: "send all pictures to Google Drive but keep documents in Dropbox").
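To make the composition idea concrete, here is a minimal sketch. Everything in it is hypothetical: the `Hub` interface, `MemoryHub`, `ReplicatingHub`, and `PolicyHub` are illustration names, not the Gaia hub API, and the in-memory dicts stand in for local disk and cloud storage drivers.

```python
from typing import Dict, List, Optional, Protocol, Tuple


class Hub(Protocol):
    """Anything that can store and fetch bytes by path (hypothetical interface)."""

    def write(self, path: str, data: bytes) -> None: ...

    def read(self, path: str) -> Optional[bytes]: ...


class MemoryHub:
    """Leaf hub; the dict stands in for local disk or a cloud storage driver."""

    def __init__(self) -> None:
        self.store: Dict[str, bytes] = {}

    def write(self, path: str, data: bytes) -> None:
        self.store[path] = data

    def read(self, path: str) -> Optional[bytes]:
        return self.store.get(path)


class ReplicatingHub:
    """Sends every write to all downstream hubs; reads return the first copy found."""

    def __init__(self, downstream: List[Hub]) -> None:
        self.downstream = downstream

    def write(self, path: str, data: bytes) -> None:
        for hub in self.downstream:
            hub.write(path, data)

    def read(self, path: str) -> Optional[bytes]:
        for hub in self.downstream:
            data = hub.read(path)
            if data is not None:
                return data
        return None


class PolicyHub:
    """Routes each path to one hub, chosen by the first matching file suffix."""

    def __init__(self, rules: List[Tuple[str, Hub]], default: Hub) -> None:
        self.rules = rules
        self.default = default

    def _pick(self, path: str) -> Hub:
        for suffix, hub in self.rules:
            if path.endswith(suffix):
                return hub
        return self.default

    def write(self, path: str, data: bytes) -> None:
        self._pick(path).write(path, data)

    def read(self, path: str) -> Optional[bytes]:
        return self._pick(path).read(path)


# The user's local hub replicates to local disk and to an upstream hub,
# which applies its own policy: pictures to one backend, the rest to another.
local_disk, google_drive, dropbox = MemoryHub(), MemoryHub(), MemoryHub()
upstream = PolicyHub(rules=[(".jpg", google_drive)], default=dropbox)
user_hub = ReplicatingHub([local_disk, upstream])

user_hub.write("photos/cat.jpg", b"jpeg-bytes")
user_hub.write("docs/notes.txt", b"hello")
```

Because every hub exposes the same two operations, a hub cannot tell whether its downstream is a storage driver or another hub, which is what lets replication policies nest.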
Replying to @jackzampolin --
The only question I have is which other piece does this connect to? Just to a blockstackd node right?
blockstack.js will communicate with gaia for storage operations, and probably the consumer api endpoints (resolver and the api), rather than blockstackd directly -- but I could be convinced otherwise on that last point. The rationale for a separate consumer-API is that updating that API can be done relatively frequently, and support a large breadth of versioning, whereas blockstackd should really only change when the underlying protocol changes (or for hotfixes).
(blockstack api and resolver) These should live in the same repo methinks.
I'm not sure -- I think people want custom resolvers, but not necessarily custom other stuff. Of course, this is not that strong of a concern, so they could definitely live in the same repo if necessary.
Does this resolver need to run against the entire network every time it runs, or would it resolve the entire network once on startup and then check for new nameops every ~10 minutes? Would we want to set TTL on profiles so they are getting checked every ~15-30 min? Configurable
I think for setting TTLs we should use the information provided by the standards in place -- our zonefile format sets a TTL for the zonefile entry. And the profiles are themselves fetched over HTTP, which has a cache header. We should use those when determining when to store data locally on the resolver.
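As a rough sketch of what "use the TTLs we already have" could look like: the zone file entry is cached for its `$TTL`, and the profile is cached for the `max-age` in its HTTP `Cache-Control` header. The function names and the sample zone file are assumptions for illustration, and the header parsing is deliberately simplified (it ignores `no-store`, `no-cache`, and `Expires`).

```python
import re
from typing import Optional


def zonefile_ttl(zonefile_text: str) -> Optional[int]:
    """Return the $TTL directive of a zone file in seconds, if present."""
    match = re.search(r"^\$TTL\s+(\d+)\s*$", zonefile_text, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def http_max_age(cache_control: str) -> Optional[int]:
    """Return max-age from a Cache-Control header value in seconds, if present."""
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)", cache_control, flags=re.IGNORECASE)
    return int(match.group(1)) if match else None


# Hypothetical inputs: a zone file pointing at a profile, and the header
# returned when that profile was fetched over HTTP.
ZONEFILE = '$ORIGIN alice.id\n$TTL 3600\n_http._tcp URI 10 1 "https://example.com/alice.json"\n'
zonefile_lifetime = zonefile_ttl(ZONEFILE)
profile_lifetime = http_max_age("public, max-age=300")
```

With this split, the resolver never picks a refresh interval of its own; a name owner who wants faster propagation lowers the TTL in the zone file or the cache header on the profile host.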
There is a version of this API that is stateless as well, where every time a call comes in it makes appropriate calls to blockstackd and then resolves the profile.
I strongly prefer starting from a stateless system and seeing how far the performance can be improved by using existing caching tooling. The problem with building up a local index is that it is equivalent to caching, but we'd then be in the business of trying to make sure that we're updating the local state correctly and in a timely manner. If we use existing caching tools (like Varnish, nginx, CDNs), then we don't have to worry about ensuring the correctness of state transitions (like name transfers, zonefile updates, name expirations, new registrations), which always end up being more complex than we imagine.
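To illustrate the difference, here is a small in-process sketch of a stateless lookup sitting behind a plain TTL cache. In practice the cache would be an external tool as described above; `TTLCache` and `resolve_stateless` are hypothetical names, and the clock is injectable only so the expiry can be shown without waiting.

```python
import time
from typing import Callable, Dict, List, Tuple


class TTLCache:
    """Caches results of a stateless fetch function for a fixed number of seconds."""

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.fetch = fetch
        self.ttl = ttl_seconds
        self.clock = clock
        self.entries: Dict[str, Tuple[float, str]] = {}  # name -> (expiry, value)

    def get(self, name: str) -> str:
        now = self.clock()
        hit = self.entries.get(name)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh, no backend call
        value = self.fetch(name)  # expired or missing: ask the stateless backend
        self.entries[name] = (now + self.ttl, value)
        return value


calls: List[str] = []


def resolve_stateless(name: str) -> str:
    """Stand-in for a resolver that queries blockstackd on every call."""
    calls.append(name)
    return "profile-for-" + name


fake_now = [0.0]
cache = TTLCache(resolve_stateless, ttl_seconds=60, clock=lambda: fake_now[0])
first = cache.get("alice.id")
second = cache.get("alice.id")  # served from cache
fake_now[0] = 61.0
third = cache.get("alice.id")  # entry expired, backend is queried again
```

The cache knows nothing about transfers, expirations, or registrations. Stale entries simply age out, so no state-transition logic has to be duplicated outside blockstackd.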
Would we be using the API here: https://core.blockstack.org/ or changing that up? Might want to think about writing an OpenAPI spec for this.
It's probably a good starting point. Trim it down and then call it /v1/ forever. Also, we have a Blueprint API Spec (https://github.com/apiaryio/api-blueprint/) for it already -- we can convert that to OpenAPI if you want (though that would require updating the code that generates https://blockstack.github.io/blockstack-core/ and core.blockstack.org).
I have the resolver portion (need to change the name back, that work is in the indexer folder) implemented in a highly parallel, performant manner in go. I understand the desire to keep everything in one language, but considering we already have a 70% complete implementation in another it might be worth at least another set of eyes on the work I've put in: https://github.com/blockstack/go-blockstack/
Yeah -- I'm fine with implementing the API or the resolver in Go, and what you started is a good starting point. I think the important decisions here are (1) whether or not to separate the resolver from the API and (2) whether or not it should have local state that it's managing. I would argue that managing local state should be avoided unless absolutely necessary, because it will ultimately involve re-implementing state-transition logic embedded within blockstackd, and that is a bad path to go down.
@kantai This is really great. Thanks for writing this up!
blockstack.js is missing from the list of components.
I'm not really sure what roles 3. blockstack-api or 4. blockstack-resolver play that can't be done client side.
We resolve zone file hashes to profile files in the browser already. Namespace specific behavior seems like a per app problem to solve and out of scope for us.
Versioning of gaiahub and blockstackd interfaces should be their own responsibility -> having said that, both services are super simple so we shouldn't have to change them often.
I think it's important that we remember that identity and authentication are particular applications of blockstackd & gaia storage. An identity search service should live on a higher layer and generate its index based on information it reads from blockstackd and gaia hubs. There will be other search services for other things.
I took a stab at (poorly) drawing how I envision these components interacting:
Let me know if I'm terribly off-base compared to everyone else's understanding.
I do not understand the difference between the resolver API and the Blockstack API. My understanding of the target architecture is:
blockstack.js: Reference client
blockstackd: The Blockstack blockchain reference implementation (moral equivalent to bitcoind)
Blockstack API: The resolver and registrar
Gaia Hub: Read/Write proxy to storage services
After some out-of-band discussion with @kantai: the reason we need a resolver component is that resolving subdomains requires parsing the parent domain's zone file and making a number of requests to generate the state of the subdomains. This would be impractical to do client side because of the resource intensity.
The way I look at blockstackd from an app/user/developer perspective is that blockstackd's main role is to give me a zone file when I give it a name. I propose we move the subdomain indexing functionality into blockstackd so that it can maintain the index of subdomains and return zone files for subdomains without having to use a separate component.
Other than that, my understanding is generally the same as @jcnelson's, except that I think we can remove blockstack-api & resolver components.
The way I look at blockstackd from an app/user/developer perspective is that blockstackd's main role is to give me a zone file when I give it a name. I propose we move the subdomain indexing functionality into blockstackd so that it can maintain the index of subdomains and return zone files for subdomains without having to use a separate component.
+1000
Is it possible to combine Blockstackd, Blockstack API and the resolver? Would there be any reason for anyone to only install blockstackd and not want the resolver and API? And if any of the functionality can be moved to blockstack.js we should.
The way I look at blockstackd from an app/user/developer perspective is that blockstackd's main role is to give me a zone file when I give it a name.
What about the historic operations for a name? Names owned by an address?
I propose we move the subdomain indexing functionality into blockstackd so that it can maintain the index of subdomains and return zone files for subdomains without having to use a separate component.
Yes, that's fine. If someone wants custom subdomain resolution, they can always implement that separately.
Is it possible to combine Blockstackd, Blockstack API and the resolver? Would there be any reason for anyone to only install blockstackd and not want the resolver and API?
I think this depends on how much of the functionality of the resolver and api moves to the client. If we can move almost all of it to the client, then blockstackd is really simple. The idea behind separating them is that if the api has support for complex queries (number of names in a namespace, number of names on a given blockchain, name history), the demand to change those queries will be high, and blockstackd should ideally be small and updated very infrequently.
I guess the way I'm thinking about blockstackd vs blockstack-api is that the API builds indexes of data that exists in blockstackd.
What about the historic operations for a name? Names owned by an address?
Yes. Names, addresses, zone files, operations. These are the domain of blockstackd. The point I was trying to make is that I think blockstackd's scope should end at zone files. Anything higher level (profiles, etc.) can be somewhere else.
The idea behind separating them, is that if the api has support for complex queries (number of names in a namespace, number of names on a given blockchain, name history), the demand to change those queries will be high, and blockstackd should ideally be small and updated very infrequently.
Ahh okay. Now I see the thinking behind separating this out.
@kantai is this where you were thinking blockstack api would sit? between blockstackd and the outside world?
If that's the case, I like how a simple api would make blockstackd and our consensus code a lot more approachable by outside developers.
@kantai is this where you were thinking blockstack api would sit? between blockstackd and the outside world?
This is exactly how I imagine this.
Awesome. Sounds like we're nearing consensus.
Maybe we can have Blockstack API be an installable service of blockstackd like insightAPI is to bitcore-node
Please no. That can be the way it works in practice, but building them as decoupled services will help us debug issues and make for a more robust and deployable product. We can make easy deployment scripts to spin everything up together, or users can scale each component separately (connect each API to a number of backend blockstackd instances for better performance). We don't need to have an opinion on their deployment strategy.
Maybe we can have Blockstack API be an installable service of blockstackd like insightAPI is to bitcore-node
I like logically separating blockstackd and blockstack api and having a very simple api on blockstackd that blockstack api talks to. I'm wondering if we shouldn't always distribute them together. Is there any instance where someone would want to run blockstackd without blockstack api?
That can be the way it works in practice, but building them as decoupled services will help us debug issues and make for a more robust and deployable product. We can make easy deployment scripts to spin everything up together, or users can scale each component separately (connect each API to a number of backend blockstackd instances for better performance). We don't need to have an opinion on their deployment strategy.
Agreed -- they should be developed as decoupled services. Deployment scripts can take care of running them coupled.
I like logically separating blockstackd and blockstack api and having a very simple api on blockstackd that blockstack api talks to. I'm wondering if we shouldn't always distribute them together. Is there any instance where someone would want to run blockstackd without blockstack api?
I don't think there are instances where someone would want to run blockstackd without blockstack-api, but I'm with @jackzampolin that they should be able to be run separately, which avoids us making people's deployment/scaling decisions for them. We should probably target our default deployment scripts/documentation at running them together though (this decision can also be revisited down the line).
I'm with @jackzampolin that they should be able to be run separately, which avoids us making people's deployment/scaling decisions for them.
Makes sense to me.
It sounds to me like the Blockstack API is just a service for querying profiles. If you don't care about profiles, then you don't need the Blockstack API service.
That said, blockstackd is useful by itself. I use it all the time to check to see if transactions go through, and to query name history. It's actually pretty lightweight--you could run it on your home router.
I think of blockstackd as something akin to a DNS/CA server, and the Blockstack API service as something akin to a Web server. Naming and PKI are wholly separate concerns from app data hosting and profile indexing.
This is sounding a lot like a consensus!!!
We can make easy deployment scripts to spin everything up together, or users can scale each component separately (connect each API to a number of backend blockstackd instances for better performance). We don't need to have an opinion on their deployment strategy.
Do we foresee the need to be able to scale a blockstackd and API to a number of instances? What if by default you can easily install the API as a service from blockstackd and give the option for standalone API instances to connect to blockstackd?
That said, blockstackd is useful by itself. I use it all the time to check to see if transactions go through, and to query name history. It's actually pretty lightweight--you could run it on your home router.
@jcnelson -- the idea behind interposing blockstack-api between clients and blockstackd is that blockstackd should be a minimal codebase that receives minimal amounts of updates. When clients interact directly with an API, that creates pressure to make updates and changes to that API, and we want to minimize the amount of updates to blockstackd -- ideally that's just consensus-breaking changes and hotfixes for bugs (not additional API features).
I like it. blockstack-api is the libc to the kernel that is blockstackd.
It sounds to me like the Blockstack API is just a service for querying profiles. If you don't care about profiles, then you don't need the Blockstack API service.
I don't think blockstack api should have anything to do with profiles. profiles and identity are an application on top of the naming system. (see the pink line in my ugly drawing below)
Profile look up can be done in blockstack.js.
I don't think we should encourage clients in general to interact directly with blockstackd. The only clients of blockstackd should be one or more blockstack-api instances.
Then I do not understand what the blockstack-api provides besides application compatibility with different blockstackd versions?
Blockstack API should resolve profiles. It's creating search indexes.
Let me ask this another way. If we can keep the blockstackd API stable and agreed-upon, do we need a separate blockstack-api service? If so, what does that service provide that is outside the scope of both blockstack.js and blockstackd?
@jackzampolin @larrysalibra I'm getting conflicting signals. Profile resolution happens in blockstack.js, right?
If all blockstack-api does is provide a search index over profiles, can we explicitly narrow this component's scope? Maybe by calling it something more specific, like blockstack-search or blockstack-explorer?
The way I look at it is that you should be able to do all of it in the browser if you want, but a large number of applications will want a server side component to speed up some of the operations that would take a bunch of network hops to complete. To do a profile resolution by just talking to a core node you need to make the following calls:
- RPC get_name_blockchain_record
- RPC get_zonefiles
- fetch profile from zonefile
On a bad network that can add significant latency that would make apps almost unusable under anything but ideal network conditions. We are going to want some sort of server side component to make that data easier to get to, fewer hops and a faster interface.
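The three round trips listed above can be sketched as follows. Only the two RPC method names come from this thread. The argument lists, the `value_hash` field, the response shapes, and the fake transports are assumptions made so the sketch runs offline; they are not the actual blockstackd interface.

```python
import re
from typing import Any, Callable, Dict, List

Rpc = Callable[[str, List[Any]], Dict[str, Any]]
HttpGet = Callable[[str], Dict[str, Any]]


def profile_url_from_zonefile(zonefile_text: str) -> str:
    """Pull the first quoted http(s) URL out of a zone file (simplified parsing)."""
    match = re.search(r'"(https?://[^"]+)"', zonefile_text)
    if match is None:
        raise ValueError("no profile URL found in zone file")
    return match.group(1)


def resolve_profile(name: str, rpc: Rpc, http_get: HttpGet) -> Dict[str, Any]:
    # Hop 1: ask the core node for the name's blockchain record.
    record = rpc("get_name_blockchain_record", [name])
    zonefile_hash = record["value_hash"]  # assumed field name
    # Hop 2: fetch the zone file matching that hash.
    zonefiles = rpc("get_zonefiles", [[zonefile_hash]])
    zonefile_text = zonefiles[zonefile_hash]  # assumed response shape
    # Hop 3: fetch the profile from the URL the zone file points at.
    return http_get(profile_url_from_zonefile(zonefile_text))


# Fake transports so the sketch runs offline and records each network hop.
hops: List[str] = []
ZONEFILE = '$ORIGIN alice.id\n$TTL 3600\n_http._tcp URI 10 1 "https://example.com/alice.json"\n'


def fake_rpc(method: str, params: List[Any]) -> Dict[str, Any]:
    hops.append(method)
    if method == "get_name_blockchain_record":
        return {"value_hash": "abc123"}
    return {"abc123": ZONEFILE}


def fake_http_get(url: str) -> Dict[str, Any]:
    hops.append("GET " + url)
    return {"name": "Alice"}


profile = resolve_profile("alice.id", fake_rpc, fake_http_get)
```

Each hop depends on the result of the previous one, so the calls cannot be issued in parallel. That is why a server-side component that performs them close to the core node, and returns one response, cuts client latency on a slow link.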
If all blockstack-api does is provide a search index over profiles, can we explicitly narrow this component's scope? Maybe by calling it something more specific, like blockstack-search or blockstack-explorer?
Okay -- to recap, I think there are still two issues here on blockstack-api --
1. blockstackd to interpose on all client requests?
2. blockstack-api or on a client?
For (1), the reason to interpose on all client requests is exactly what @jcnelson mentioned before -- support for new shiny API features without updating the important kernel (the *nix analogy is libc:kernel).
For (2), we still need to make those decisions.
Re (2), I believe search is a separate problem from indexing. I don't need a search index to do lookups, just like how I don't need Google to run DNS queries.
@jackzampolin the first two calls (get_name_blockchain_record and get_zonefiles) can be routed to a local blockstackd. I do this on my laptop, for example. This is both faster and more secure than trusting a 3rd party instance to do this on my behalf, since the profile data is authenticated with the blockchain and zone file data.
Recapping a consensus forming meeting off GitHub. We formed a consensus around a lightweight shim API as diagrammed in Larry's last component drawing. And I believe we are going with implementing the resolver logic in the client (blockstack.js).
This discussion has concluded and we are making the changes described here. Closing.
Based on conversations online and offline with various team members at Blockstack and developers looking to build on Blockstack (and users of the Blockstack software), I spent some time trying to sketch out a near-term architecture of the various Blockstack components and their interactions, with an eye towards trying to (1) improve the stability of the software and APIs Blockstack provides and (2) reduce the complexity of dependencies between components. In my mind, each of these components would be a different "project" of Blockstack -- each component has a different version number, github repo, test cases, and build process (to the extent that unifying technologies can be used in multiple components, we should do that, but we should still be able to conceptualize the components as independent actors).
I added a markdown file to this branch:
https://github.com/blockstack/blockstack/blob/future-architecture/future-architecture.md
Anyways, I'm going to tag the various stakeholders on the team for feedback and discussion @jcnelson @jackzampolin @larrysalibra @muneeb-ali @shea256 @yknl -- I think we should discuss here and then try to merge changes and comments into the markdown file.