anacrolix / dht

dht is used by anacrolix/torrent, and is intended for use as a library in other projects, both torrent-related and otherwise.
Mozilla Public License 2.0

use dht for own storage #1

Closed anacrolix closed 7 years ago

anacrolix commented 7 years ago

From @vtolstov on December 5, 2016 13:41

I need a DHT package for my own storage system, and godoc.org shows that this package has more stars than the others. Is it possible to use this package in my own program, which does not use torrents?

Copied from original issue: anacrolix/torrent#134

anacrolix commented 7 years ago

It certainly is possible. Most dht elements are implemented, though storage has not been. The dht package in this repo is used by several non-torrent projects. Could you describe the use-case for your dht storage needs so that the implementation is added appropriately?

anacrolix commented 7 years ago

From @vtolstov on December 6, 2016 9:13

I have a Volume struct that is represented as fixed-size blocks (4M each), and each block has its own "id". I have 5 nodes and need to know which node to place each block on for writing, and which nodes I can read it from. Each "Volume" has a replication factor (2, 3, 5), so with a factor of 2 I need to place each block on two nodes, and with a factor of 5, on all nodes. So I don't need announce, as I understand it? Also, the dht package is hardcoded for net.UDPAddr, but I use utp (your package) for transferring blocks (on a LAN, packet loss -> 0).

anacrolix commented 7 years ago

I'm not sure there's a provision for setting a replication factor in the DHT that torrent uses; you'd have to add that on top yourself. For your other purposes, however, replace the torrent protocol with your own. Peers are endpoints that speak your protocol. Torrent hashes are replaced with the block IDs, and node IDs are used to assign responsibility for storing peer lists for the blocks. DHT is built on a PacketConn network. Keep in mind that there are technically two different networks: DHT/UDP and yourProto/UTP. It's a coincidence that the addressing appears the same. Note this is the same with BitTorrent, where it's DHT/UDP and BitTorrent/{utp,tcp}.
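To make the "node IDs assign responsibility" idea concrete, here is a minimal Go sketch of picking the nodes closest to a block ID by XOR distance, the metric BEP 5 uses. Treating the number of nodes picked as the replication factor is an assumption layered on top, not something the DHT provides. The names `ID`, `xor`, and `closestNodes`, and the node and block names in `main`, are all made up for illustration; nothing here is the dht package's API.

```go
package main

import (
	"bytes"
	"crypto/sha1"
	"fmt"
	"sort"
)

// ID is a 160-bit identifier, the size BEP 5 uses for node IDs and infohashes.
type ID [20]byte

// xor returns the XOR distance between two IDs.
func xor(a, b ID) (d ID) {
	for i := range a {
		d[i] = a[i] ^ b[i]
	}
	return d
}

// closestNodes returns up to k node IDs ordered by XOR distance to target.
// Here k stands in for the replication factor (an assumption, not a DHT feature).
func closestNodes(target ID, nodes []ID, k int) []ID {
	sorted := append([]ID(nil), nodes...) // copy so the caller's slice is untouched
	sort.Slice(sorted, func(i, j int) bool {
		di, dj := xor(sorted[i], target), xor(sorted[j], target)
		return bytes.Compare(di[:], dj[:]) < 0
	})
	if k > len(sorted) {
		k = len(sorted)
	}
	return sorted[:k]
}

func main() {
	// Hypothetical node and block names, hashed to obtain 160-bit IDs.
	var nodes []ID
	for _, name := range []string{"node-a", "node-b", "node-c", "node-d", "node-e"} {
		nodes = append(nodes, sha1.Sum([]byte(name)))
	}
	block := ID(sha1.Sum([]byte("volume-1/block-0")))
	for _, n := range closestNodes(block, nodes, 2) {
		fmt.Printf("%x\n", n)
	}
}
```

With only 5 nodes on a LAN, every node can hold the full node list, so this selection could run locally without any DHT lookups; the DHT becomes useful once no single node knows all the others.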

You can actually use the existing BitTorrent DHT to arbitrarily start addressing your own data network, because the peer address format is compatible. The network doesn't care.

You do want to use Announce, this is your way of scraping addresses of peers that have a block you want or want a block you have.

So the storage I referred to as missing is the storing of peer addresses for blocks near a node's own ID. See https://github.com/anacrolix/torrent/blob/master/dht/server.go#L288, possibly also extended here: https://github.com/anacrolix/torrent/blob/master/dht/server.go#L314.
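For a sense of what that missing piece amounts to, here is a minimal in-memory sketch in Go: a map from a 160-bit key to the set of peer addresses that announced it. `Key`, `PeerStore`, `Announce`, and `Peers` are hypothetical names, not the package's API, and the sketch leaves out things a real node needs, such as the announce token check from BEP 5 and expiry of old entries.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Key is a 160-bit block or volume identifier (the infohash slot in BEP 5).
type Key [20]byte

// PeerStore remembers which peer addresses have announced each key.
type PeerStore struct {
	mu    sync.Mutex
	peers map[Key]map[string]struct{}
}

// NewPeerStore returns an empty store.
func NewPeerStore() *PeerStore {
	return &PeerStore{peers: make(map[Key]map[string]struct{})}
}

// Announce records addr as a peer for key, as a node would on announce_peer.
func (s *PeerStore) Announce(key Key, addr string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.peers[key] == nil {
		s.peers[key] = make(map[string]struct{})
	}
	s.peers[key][addr] = struct{}{}
}

// Peers returns the sorted addresses announced for key, as a node would
// when answering get_peers.
func (s *PeerStore) Peers(key Key) []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	out := make([]string, 0, len(s.peers[key]))
	for addr := range s.peers[key] {
		out = append(out, addr)
	}
	sort.Strings(out)
	return out
}

func main() {
	store := NewPeerStore()
	block := Key{0x01} // made-up block ID
	store.Announce(block, "10.0.0.1:4000")
	store.Announce(block, "10.0.0.2:4000")
	fmt.Println(store.Peers(block))
}
```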

anacrolix commented 7 years ago

Read http://www.bittorrent.org/beps/bep_0005.html for more, it's not bad.

anacrolix commented 7 years ago

From @vtolstov on December 6, 2016 13:38

Thanks! Also, I see that https://godoc.org/github.com/anacrolix/torrent/dht/krpc#NodeInfo mentions an IPv4 address. Does it work on an IPv6 network in my case?

anacrolix commented 7 years ago

From @vtolstov on December 6, 2016 13:41

> You do want to use Announce, this is your way of scraping addresses of peers that have a block you want or want a block you have.

But I can create new Volumes, so over time there will be new Volumes and new blocks for them. Do I need to announce each of them?

anacrolix commented 7 years ago

I'm not actually sure how complete IPv6 support is, mainly because 99% of the DHT network for torrent is still IPv4. Most messages are augmented with a "6"-suffixed key (such as "nodes6") that contains the corresponding IPv6 reply. I'd appreciate feedback if you find anything missing.
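To show what the IPv6 variant of a reply looks like on the wire, here is a small Go sketch that splits a compact node list into entries. Per BEP 5 an entry under "nodes" is 26 bytes (20-byte node ID, 4-byte IPv4 address, 2-byte port), and per BEP 32 an entry under "nodes6" is 38 bytes (the same layout with a 16-byte IPv6 address). The `NodeInfo` type here is a hypothetical stand-in, not the `krpc.NodeInfo` type, and `decodeNodes` is not a function from the package.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net"
)

// NodeInfo is a hypothetical decoded entry; it is not krpc.NodeInfo.
type NodeInfo struct {
	ID   [20]byte
	IP   net.IP
	Port uint16
}

// decodeNodes splits a compact node list into entries. ipLen is 4 for the
// "nodes" key (BEP 5) and 16 for the "nodes6" key (BEP 32).
func decodeNodes(b []byte, ipLen int) ([]NodeInfo, error) {
	if ipLen != net.IPv4len && ipLen != net.IPv6len {
		return nil, fmt.Errorf("unsupported address length %d", ipLen)
	}
	entry := 20 + ipLen + 2
	if len(b)%entry != 0 {
		return nil, fmt.Errorf("length %d is not a multiple of %d", len(b), entry)
	}
	var out []NodeInfo
	for ; len(b) > 0; b = b[entry:] {
		var n NodeInfo
		copy(n.ID[:], b[:20])
		n.IP = append(net.IP(nil), b[20:20+ipLen]...) // copy, do not alias b
		n.Port = binary.BigEndian.Uint16(b[20+ipLen:])
		out = append(out, n)
	}
	return out, nil
}

func main() {
	// One made-up "nodes6" entry: zero node ID, a documentation-range address, port 6881.
	raw := make([]byte, 38)
	copy(raw[20:36], net.ParseIP("2001:db8::1").To16())
	binary.BigEndian.PutUint16(raw[36:], 6881)

	nodes, err := decodeNodes(raw, net.IPv6len)
	if err != nil {
		panic(err)
	}
	fmt.Println(nodes[0].IP, nodes[0].Port)
}
```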

Regarding announcing, you announce the top-level data hash. I misread above and assumed your blocks were the highest-level object in your network. It would appear that Volumes are, so you announce those.

anacrolix commented 7 years ago

From @vtolstov on December 6, 2016 15:50

Sorry, I don't understand. For example, if I announce a volume, each node may hold only some of the volume's blocks rather than all of them.

anacrolix commented 7 years ago

@vtolstov That's fine. But what you're beginning to describe is just BitTorrent. Create a metainfo for each volume. Blocks are just files listed in the metainfo for a volume. The torrent implementation provided in this repo allows you to read blocks on the fly, with no concern for storage and without requiring you to download everything in advance. You can use it for replication easily enough: by making all the peers in your BitTorrent network seed liberally and monitor one another's pieces, you can determine how many peers have each block.
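The "monitor one another's pieces" step can be sketched as a simple count. The Go code below assumes you have already collected, by whatever means, a per-peer availability list (one bool per block, like a decoded bitfield); how you obtain it from the torrent client is not shown. `replicaCounts`, `underReplicated`, and the peer addresses are hypothetical.

```go
package main

import "fmt"

// replicaCounts returns, for each block index, how many peers report having it.
// have maps a peer address to a per-block availability slice (a decoded bitfield).
func replicaCounts(numBlocks int, have map[string][]bool) []int {
	counts := make([]int, numBlocks)
	for _, blocks := range have {
		for i, ok := range blocks {
			if ok && i < numBlocks {
				counts[i]++
			}
		}
	}
	return counts
}

// underReplicated lists the block indexes held by fewer than factor peers.
func underReplicated(counts []int, factor int) []int {
	var out []int
	for i, c := range counts {
		if c < factor {
			out = append(out, i)
		}
	}
	return out
}

func main() {
	// Made-up availability for a 3-block volume across three peers.
	have := map[string][]bool{
		"10.0.0.1:4000": {true, true, false},
		"10.0.0.2:4000": {true, false, false},
		"10.0.0.3:4000": {true, true, true},
	}
	counts := replicaCounts(3, have)
	fmt.Println("copies per block:", counts)
	fmt.Println("below factor 2:", underReplicated(counts, 2))
}
```

A node that sees a block below the volume's replication factor could then fetch and seed that block itself, which is one way to build the replication factor on top.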

anacrolix commented 7 years ago

From @vtolstov on December 7, 2016 6:58

Oh, fine! Thanks for the suggestions!