@jrambla commented on Tue Feb 07 2017
@david4096 commented on Tue Feb 07 2017
We've chosen to implement a distributed protocol in the genomics API that allows peers to make ad-hoc networks.
One might consider implementing a simple peer discovery protocol instead of focusing on the beacon-network. This would allow a beacon-of-beacons to construct its index by crawling the distributed network of beacons, instead of waiting for folks to submit their beacon to a service.
Registering a beacon to the network could be as simple as sending a packet to a known beacon, like on 1kgenomes. This would avoid the need for a beacon of beacons. When a peer receives an announce packet, it can choose to add that peer and/or announce its existence to the network.
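The announce step described above could be sketched roughly as follows. This is only an illustration under assumptions: the `Peer` class, `receive_announce`, the `accept` policy hook, and the in-memory `network` dict are hypothetical names and are not part of any existing Beacon or GA4GH API. A real implementation would send the announce over HTTP rather than calling a method directly.

```python
class Peer:
    """Hypothetical in-memory peer; not an existing Beacon/GA4GH class."""

    def __init__(self, url, accept=lambda peer_url: True):
        self.url = url
        self.known = set()      # URLs of peers this peer has chosen to add
        self.accept = accept    # local policy: may this peer be added?

    def receive_announce(self, peer_url):
        """Handle an announce and return the peers to forward it to."""
        if peer_url == self.url or peer_url in self.known:
            return []           # already known: stop here so gossip terminates
        if not self.accept(peer_url):
            return []           # the receiver is free to ignore an announce
        forward_to = sorted(self.known)
        self.known.add(peer_url)
        return forward_to


def announce(network, new_url, entry_url):
    """Deliver an announce for new_url to entry_url and follow the forwards."""
    queue = [entry_url]
    while queue:
        target = queue.pop(0)
        queue.extend(network[target].receive_announce(new_url))
```

Because each peer forwards an announce only the first time it sees it, the message reaches every peer connected to the entry point and then dies out, with no central registry involved.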
Although I would caution against it, one could replicate the current network architecture using the same mechanisms. You could implement your peer protocol such that you add a central "broadcast" peer that lists the peers for some beacon "sub-network".
In principle, we want the network topology to first find a healthy decentralized model. Then institutions and platforms can build on top with other software. I expect a beacon metadata indexer to be a valuable tool, where beacons are crawled and their data are indexed in some human-queryable format. That is the value of the beacon-network software in my mind, but the actual beacons themselves would be queried directly. This is the spirit of the decentralized beacon.
What is the current mechanism for adding oneself to the beacon network? What lessons have been learned in the submission process?
@mfiume commented on Tue Feb 07 2017
@david4096 This is exactly the kind of model that has emerged from our experience in building the Beacon Network. We are very seriously considering possible models for allowing organic growth of a decentralized registry, and appreciate these points. We will be meeting by teleconference to brainstorm about this (date/time TBD). Could you please join?
@jrambla commented on Tue Feb 07 2017
As mentioned during the Beacon call, different models were discussed in the Security + Beacon meeting at Hinxton, taking into account security aspects like trying to get info about a sample by spreading the query among all beacons hosting such datasets... At first sight I don't see why we can't describe different models and allow each network to pick and implement whichever matches their specific use case.
@kozbo commented on Tue Feb 07 2017
The peer network mentioned by @david4096 above is described in the document linked below. There are some links at the bottom to papers on building this sort of decentralized federation. This type of network is called a "Gossip-based protocol". Our peer service provides the protocol level building block for creating this sort of network. https://docs.google.com/document/d/1hc-l7P0S0G8j19n0dKV9e0AeF8zgE3I88fNveSC178w/edit#heading=h.1j4vc4ls6v7v (You will probably need to request access to the doc)
@mfiume commented on Tue Feb 07 2017
Thanks @kozbo. We're anticipating needing to implement systems for both completely open networks and networks that require permission to join. The analogy is .com vs. .gov domain names, the latter of which cannot be reserved by just anyone. Is there an additional level on top of this protocol where these can be enforced? Were other models (such as hub-and-spoke registries like DNS) considered?
@mfiume commented on Tue Feb 07 2017
I encourage anyone interested in this discussion in the context of Beacon to join us on an upcoming call that is being scheduled.
Doodle: http://doodle.com/poll/ux5rtatz5f69ki24psh95n8v/admin
@juhtornr commented on Fri Mar 24 2017
Unfortunately there has been a lot of discussion about this topic which is not in this thread. I encourage everyone to discuss Beacon registry requirements in this ticket.
My understanding is that the current consensus in ELIXIR is that the requirements are very clear: we need to be able to curate the registry, and we need this soon. For this reason we have agreed to implement a new /peers endpoint which just serves a static file containing Beacons in the following format:
{ 'url': 'http://beacon1/', 'url': 'http://beacon2/', ... }
Later on, this implementation can be extended to support a P2P network, so that each Beacon crawls the Beacons in the /peers endpoint and updates the endpoint.
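As a rough illustration of the crawling step, something like the following could work. It is a sketch under assumptions: `crawl` and `fetch_peers` are hypothetical names, and `fetch_peers(url)` stands in for whatever code performs the GET on a Beacon's /peers endpoint and returns the listed URLs, since the exact response format is still under discussion here.

```python
from collections import deque


def crawl(seed_url, fetch_peers):
    """Breadth-first crawl starting from seed_url.

    fetch_peers(url) is a caller-supplied function (an assumption of this
    sketch) that returns the Beacon URLs listed by url's /peers endpoint.
    """
    seen = {seed_url}
    queue = deque([seed_url])
    while queue:
        url = queue.popleft()
        try:
            peers = fetch_peers(url)
        except (OSError, ValueError):
            continue            # unreachable or malformed Beacon: skip it
        for peer in peers:
            if peer not in seen:  # visit each Beacon once, even with cycles
                seen.add(peer)
                queue.append(peer)
    return seen
```

A Beacon could run this periodically and write the result back to the file served at its own /peers endpoint, while a curated registry could simply keep serving its static file.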
@kozbo commented on Tue Mar 28 2017
What do you think about including a protocol version string and a protocol type field along with the URL for each record? I think that adding those fields should let the peer list call work for both Beacons and GA4GH servers.
I can see a case for making these part of the response to a subsequent info query, but my counter is that the client would then be very chatty. It could get the initial list, but then it would need to ping each server on the list to find out whether the running client code will work with any of the servers reported in the peer list. Having the server type and version in the initial list will help any client quickly narrow the field to servers of interest.
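To make the idea concrete, here is a minimal sketch of the client-side filtering this would allow. The field names `type` and `version`, the example values, and the `compatible` helper are assumptions for illustration only; nothing here has been agreed for the /peers format.

```python
def compatible(peers, wanted_type, supported_versions):
    """Return the URLs of peers whose advertised type and version match.

    peers is a list of records such as
    {"url": ..., "type": ..., "version": ...} (hypothetical field names).
    Records missing either field are treated as not compatible.
    """
    return [
        peer["url"]
        for peer in peers
        if peer.get("type") == wanted_type
        and peer.get("version") in supported_versions
    ]


# Example peer list with made-up entries.
peers = [
    {"url": "http://beacon1/", "type": "beacon", "version": "0.3"},
    {"url": "http://ga4gh1/", "type": "ga4gh", "version": "0.6"},
    {"url": "http://beacon2/", "type": "beacon", "version": "0.2"},
    {"url": "http://beacon3/"},
]
urls = compatible(peers, "beacon", {"0.3"})
```

With this, a client contacts only the servers it can actually talk to, instead of sending an info request to every entry in the list.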
@juhtornr commented on Tue Mar 28 2017
I'm not against version/type strings in the /peers endpoint response, but there's a potential issue if the version string is not kept up to date.