pubky / mainline

Simple, robust implementation of BitTorrent's Mainline DHT
MIT License

PANIC reasoning upon second put query. #19

Closed · RealAstolfo closed 3 months ago

RealAstolfo commented 3 months ago

Hi there, appreciate the work you have done so far!

I was wondering about the reasoning for the hard panic in impl PutQuery's if !self.inflight_requests.is_empty() check in src/rpc/query.rs.

Is this for compliance with a standard? Perhaps a temporary measure before it's adapted to return a Result<_, _>? Or is it intentional?

I bring this up because of multiple (seemingly random) occasions where, if I put a mutable item into the DHT, a peer will never find it unless I subsequently run an announce_peer, which then has a random chance of panicking on the double PutQuery.

If you did not intend this behaviour, I'm willing to write up a PR adapting the function to return a Result<_, _> instead, so you retain the error handling without forcing upstream programs to quit (on my fork I simply have an early return right now, out of laziness).
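For concreteness, here is a hypothetical sketch of the kind of change I have in mind. This is not the actual mainline internals; the PutError type and the start method name are illustrative, and only the self.inflight_requests check comes from the real code:

    // Hypothetical sketch of the proposed change (illustrative names, not
    // the actual mainline internals): surface the duplicate in-flight put
    // as an error instead of panicking.
    #[derive(Debug)]
    pub enum PutError {
        /// A put query for this target is already in flight.
        AlreadyInflight,
    }

    impl PutQuery {
        pub fn start(&mut self) -> Result<(), PutError> {
            if !self.inflight_requests.is_empty() {
                // Previously a panic!(); an Err lets the caller decide.
                return Err(PutError::AlreadyInflight);
            }
            // ... send the put requests as before ...
            Ok(())
        }
    }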

Nuhvi commented 3 months ago

Hi @RealAstolfo, the reason for that panic is that I never expected someone to need to make an announce_peer on the same target as a mutable put, so while this check works for mutable put queries, evidently it doesn't work for your case.

Now, I should handle the case of announce_peer on the same target, as rare as it might be. But more importantly, I should find out why you needed to do that in the first place.

if I put a mutable item into the DHT, a peer will never find it unless I subsequently run an announce_peer

I need to dig deeper into this: does the lookup peer actually do the lookup AFTER the first peer is done with the put query, or is there a race condition happening here?

RealAstolfo commented 3 months ago

I'm not actually sure why myself, but when I tested Node A submitting a mutable item, Node B was never able to find Node A from the public key when using get_peers.

Example: Node A

    use std::time::SystemTime;

    // Sequence number for the mutable item, derived from the current time in microseconds.
    let now = SystemTime::now().duration_since(SystemTime::UNIX_EPOCH).expect("time drift").as_micros() as i64;
    let item = MutableItem::new(private_key, "Hi :)".into(), now, None);
    dht.put_mutable(item).await.expect("Could not insert mutable");

Example: Node B

    loop {
        // Poll the get_peers stream for the next batch of candidate peers.
        let option = dht
            .get_peers(id)
            .expect("Unable to find peers for Id.")
            .next()
            .await;
        println!("DEBUG: peer_vec {:?}", option);
        if let Some(peer_vec) = option {
            for candidate in peer_vec {
                // Accept any candidate that is not this node's own external address.
                if external_ip.ip() != candidate.ip() {
                    println!(
                        "DEBUG: external {:?}, candidate {:?}",
                        external_ip.ip(),
                        candidate.ip()
                    );
                    return candidate;
                }

                if external_ip.port() != candidate.port() {
                    return candidate;
                }
            }
        }
    }

In the example above, I've occasionally observed Node B loop here forever.

However, if I change the code in Node A to the following:

    use std::time::{Duration, SystemTime};

    let now = SystemTime::now().duration_since(SystemTime::UNIX_EPOCH).expect("time drift").as_micros() as i64;
    let item = MutableItem::new(private_key, "Hi :)".into(), now, None);
    dht.put_mutable(item).await.expect("Could not insert mutable");
    loop {
        match dht.lock().await.announce_peer(target_id, None).await {
            Ok(_) => {
                println!("Successful announce");
            }
            Err(e) => {
                // this is where I discovered the panic!
                println!("Unsuccessful announce: {}", e);
            }
        }
        tokio::time::sleep(Duration::from_secs(5)).await;
    }

Eventually, Node B will find Node A.

Nuhvi commented 3 months ago

@RealAstolfo First, you shouldn't use get_peers; you should use get_mutable if you want to use BEP_0044. If you instead want to use BEP_0005, then call announce_peer and get_peers.
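For illustration, a minimal sketch of the BEP_0044 read path, mirroring the get_peers usage earlier in this thread; the exact get_mutable signature here is an assumption and may differ between versions:

    // Sketch, assuming get_mutable mirrors get_peers and returns a stream
    // of MutableItem (exact signature may differ per version). Salt is
    // None to match the put above.
    if let Some(item) = dht
        .get_mutable(&public_key, None)
        .expect("Unable to query mutable item.")
        .next()
        .await
    {
        println!("DEBUG: got mutable item: {:?}", item);
    }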

Secondly, you shouldn't put a mutex around the dht client; it is already safe to clone across threads.
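In other words, instead of Arc<Mutex<Dht>>, clone the handle into each task. A sketch, reusing the announce_peer call from your snippet above:

    // Clone the handle instead of locking; each clone talks to the same
    // underlying client.
    let dht2 = dht.clone();
    tokio::spawn(async move {
        // Same call as in the snippet above, without the lock().await.
        let _ = dht2.announce_peer(target_id, None).await;
    });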

RealAstolfo commented 3 months ago

Are you not allowed to treat the public key of a mutable item as an info hash? That's odd, considering it actually works on the real DHT...

Nuhvi commented 3 months ago

The hash of the public key is an info hash, but when you call get_peers, the RPC will expect the incoming message to be a list of peers; if it finds a mutable item instead, it will discard it. I'm not sure what the "real DHT" you are referring to is, but I won't complicate the return type of get_peers or get_mutable unless there is a very strong reason why you would want get_peers to return mutable items too.

RealAstolfo commented 3 months ago

So the snippet probably didn't show enough. I have Node A creating and then putting the mutable item into the network.

Then Node B will try to get_mutable for that item, and then try to get the peers that have that item, but Node B occasionally would never find the peers for the mutable item it just received without the announce_peer on Node A.

Nuhvi commented 3 months ago

@RealAstolfo it doesn't work like that. I think you are confusing multiple concepts.

  1. A mutable item is stored on a "routing node".
  2. A routing node is a DHT node that is publicly accessible and does not announce itself as read-only.
  3. A "peer" is not a routing node; a peer is just an IP address advertising that it is hosting a torrent file corresponding to an info_hash.

So when you find a mutable item, you will find it from one or more routing nodes, but these are not the same things you get when you look up get_peers for the same info hash. See the sketch below.

Try reading the BEPs again: https://www.bittorrent.org/beps/bep_0005.html and https://www.bittorrent.org/beps/bep_0044.html
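To make the distinction concrete, a sketch under the same assumptions as above (get_mutable's signature is an assumption; the other calls appear earlier in this thread). The two lookups are independent:

    // BEP_0044: the signed item is stored on, and returned by, routing nodes.
    dht.put_mutable(item).await.expect("put failed");
    let _stored = dht.get_mutable(&public_key, None).expect("query failed").next().await;

    // BEP_0005: get_peers only ever returns addresses that were explicitly
    // announced via announce_peer; put_mutable alone announces nothing.
    dht.announce_peer(info_hash, None).await.expect("announce failed");
    let _peers = dht.get_peers(info_hash).expect("query failed").next().await;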

RealAstolfo commented 3 months ago

Ooooh, I think I understand now. If I want the socket address information of a client hosting a mutable item, I must get_mutable, then use the MutableItem's target() to query the network for the peers with get_peers? Or am I still missing something?

Edit: now that I think of it, I can't see why this would not work. It just seems rather strange that getting the SocketAddr of a serving node is not straightforward.

Nuhvi commented 3 months ago

@RealAstolfo No, it is actually the "from" here ... and I don't think there is any way for you to access it as a user of the Dht struct.

Remember, there isn't just one "from"; there are multiple, as multiple routing nodes respond with the same mutable item.

Depending on what you actually need this for, the simplest solution might be a method for finding the closest routing nodes to a given info hash.
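Purely as a hypothetical shape (no such method exists in mainline at the time of this thread):

    // Hypothetical API, not an existing mainline method: resolve the
    // routing nodes closest to a target id, e.g. to learn which nodes
    // store a given item.
    // fn find_closest_nodes(&self, target: Id) -> Result<Vec<Node>, Error>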

RealAstolfo commented 3 months ago

I see. Then I suppose generating an immutable item containing data that references a mutable item's public key would be the best way to get a mutable item along with a socket address (or addresses). Thanks for the info.
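As a rough sketch of that idea (put_immutable's exact signature and its returning the item's target id are assumptions; the announce_peer pairing is what attaches socket addresses, per the discussion above):

    // Sketch of the workaround: publish an immutable item whose value is
    // the mutable item's public key, then announce on its target so
    // get_peers can resolve socket addresses for it.
    let target = dht.put_immutable(public_key.to_vec().into()).await.expect("put failed");
    dht.announce_peer(target, None).await.expect("announce failed");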