Closed dan-da closed 1 year ago
@steinerkelvin nice. yeah I wanted to display a real error message, but discovered that Err was empty, so I just left it as-is.
One thing I noticed is this doesn't ensure the main API host is included. Hum.
One todo: the --hosts
parameter.
I named the param --host because latest clap seems to strongly prefer (maybe require?) that Vec<> args are passed as --param <item1> --param <item2>
. Instead of --param item1 item2
as structopt accepted. I thought that --hosts x.x.x.x:80 --hosts:y.y.y.y:80
seems wrong and --host x.x.x.x:80 --host:y.y.y.y:80
is better. Also one can just use -h
.
One thing I noticed is this doesn't ensure the main API host is included. Hum.
I initially appended the --api host to the peer list. However, I found it was duplicated because the call to client.getpeers was returning it, so I dropped that code.
Now after I pulled the latest commits to this branch, I find that kindelia get peers
only works once after the node starts up, and after it is always an empty list. So I guess something has changed in that area...?
I initially appended the --api host to the peer list. However, I found it was duplicated because the call to client.getpeers was returning it, so I dropped that code.
We could just add if it's missing?
Now after I pulled the latest commits to this branch, I find that kindelia get peers only works once after the node starts up, and after it is always an empty list. So I guess something has changed in that area...?
Yeah, you just stumbled in the exact case in that this would be relevant: your own node isn't in the peers' list, because other you don't have other peers broadcasting that address to yourself. On a side note, I wonder if a node could be aware an address points to itself without hardcoding it.
I've bumped the network_id
on dev
to 0xCAFE0005
which is now what is running on the cloud. I had to reset that test chain because santi added code to the genesis block and that changed it's hash, which invalidates the entire chain. I wonder how we can better coordinate on that. I guess having an endpoint in which the nodes report the network id and genesis block hash they are using would help:
http https://node-ams3.testnet.kindelia.org:8000/info
{
"network_id": "0xCAFE0005",
"genesis_hash": "0xa204e26d20450f127dc16c9c7c0985f2e664afef8a2b223041f04c33df2e5b6a",
}
also, maybe a dedicated text channel for these updates / resets.
We could just add if it's missing?
done.
--hosts CLI parameter to override to which hosts to publish
implemented.
This PR is feature complete now afaict and ready for review.
One thing that still bothers me is that I see different behavior when publishing dup transactions to my local node vs remote nodes.
1st run, no dups:
$ kindelia publish ../example/example.kdl
Transaction #0: PUBLISHED to http://localhost:8000/ (tx added to mempool)
NOT PUBLISHED to http://12.88.153.42/. (Error 501 Not Implemented: <html>
<head><title>Error 501: Not Implemented</title></head>
<body>
<h1>Error 501: Not Implemented</h1>
</body>
</html>
)
Transaction #0: PUBLISHED to http://64.227.110.69/ (tx added to mempool)
NOT PUBLISHED to http://34.125.71.252/. (Error 404 Not Found: 404 page not found
)
NOT PUBLISHED to http://104.155.237.82/. (Error 404 Not Found: 404 page not found
)
Transaction #0: PUBLISHED to http://188.166.3.140/ (tx added to mempool)
2nd run, tx #0 is a dup:
$ kindelia publish ../example/example.kdl
Transaction #0: NOT PUBLISHED to http://localhost:8000/: Transaction 0x574adb292b12fc4c1c1ec845c8e79f19c7781ca3bf059172431d0c310de1f30c already included on pool
NOT PUBLISHED to http://12.88.153.42/. (Error 501 Not Implemented: <html>
<head><title>Error 501: Not Implemented</title></head>
<body>
<h1>Error 501: Not Implemented</h1>
</body>
</html>
)
Transaction #0: PUBLISHED to http://64.227.110.69/ (tx added to mempool)
NOT PUBLISHED to http://34.125.71.252/. (Error 404 Not Found: 404 page not found
)
NOT PUBLISHED to http://104.155.237.82/. (Error 404 Not Found: 404 page not found
)
Transaction #0: PUBLISHED to http://188.166.3.140/ (tx added to mempool)
So 127.0.0.1 rejects tx #0 as a dup, but 64.227.110.69 and 188.166.3.140 accept it.
I'm wondering if perhaps those peers are running older code that doesn't do any dup detection?
Running `/home/michiru/Kindelia/target/debug/kindelia publish kinguFun.kdl`
thread 'main' panicked at 'Command publish: Short option names must be unique for each argument, but '-h' is in use by
both 'host' and 'help' (call `cmd.disable_help_flag(true)` to remove the auto-generated `--help`)',
/home/michiru/.cargo/registry/src/github.com-1ecc6299db9ec823/clap-4.0.22/src/builder/debug_asserts.rs:121:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
apparently the -h
flag is conflicting with --help
@kings177 thx. I've removed the publish -h shortopt for now.
It might make sense to think of another shortopt, perhaps renaming --host
to --remote
or something, so we could then use -r
.
--node
and -n
could be another option.
@kings177 thx. I've removed the publish -h shortopt for now.
It might make sense to think of another shortopt, perhaps renaming
--host
to--remote
or something, so we could then use-r
.
--node
and-n
could be another option.
Perhaps it could be --peers
just like the --initial-peers
?
I'm wondering if perhaps those peers are running older code that doesn't do any dup detection?
I will investigate this
Perhaps it could be
--peers
just like the--initial-peers
?
I considered this.
My reasoning is that nodes are peers of eachother, but the cli is a client of a node... often the localhost node but potentially any node. So they are not peers to the cli. (except in the context of node start
, wherein the cli becomes a node).
stated differently, --initial-peers
is in the context of node start
, but --host
is in the context of publish
from cli to a node.
So that's why I thought maybe --node
or --remote
instead.
Just figured I should share my thought process.
Also, regarding plural vs singular eg --peers
vs --peer
see this comment.
Please let me know if you still want --peers
or something else.
I'm wondering if perhaps those peers are running older code that doesn't do any dup detection?
I will investigate this
From what I saw the nodes are doing this detection. The code that makes it (in the add_transaction
function, in the node.rs
file), has been inserted in the repository for a long time, and I tested using the current publish
command, publishing the same Statement
, and I got the expected error (transaction already included).
What may have happened is that nodes other than localhost have already mined its transactions and therefore removed this transaction from the mempool, thus allowing their reinsertion.
My reasoning is that nodes are peers of eachother, but the cli is a client of a node... often the localhost node but potentially any node. So they are not peers to the cli. (except in the context of
node start
, wherein the cli becomes a node).
Yes, I see what you meant and it makes sense. My preference is to stick with host
or hosts
, but I'm pretty bad with naming.
It looks like clap
is really having trouble parsing --arg opt1 opt2
. But I did a test by putting value_delimiter=','
in the clap
macro and it managed to parse something like --arg opt1,opt2
, but in the clap
help text the value_delimiter
was not shown, which can cause confusion.
Do you think it's a good idea to just leave --host
without the short option?
I did a test by putting
value_delimiter=','
Nice find, it works. I also tried value_delimiter=' '
but it doesn't parse.
I changed the help text to:
--host <host> specify host(s) to connect to. x.x.x.x:port (comma separated)
btw the space separated values used to be the default for Vec args in structopt, but when structopt was recently merged into clap, they made this change. afaict from looking at clap github issues, the reasoning is that "the unix way" is to use multi occurence args, eg: -h <host> -h <host> -h <host>
, and that this is far more common in cli programs anyway. So that seems to be the direction they are pushing.... I almost wonder if this value_delimiter workaround is an oversight they might "fix" soon.
Ok, so with that background in mind...
Do you think it's a good idea to just leave --host without the short option?
I think it depends on if we want to use/document the multi occurrence args.
In particular publish --host <x> --host <y> --host <z>
is much more annoying to type than -h <x> -h <y> -h <z>
.
So then we are back to -h
interfering with help. So I've been leaning toward using --node
and -n
for this.
But... now we have the wrinkle that --host x,y,z
also works. So we could just make that the documented way of doing it, and the multi occurrence method is hidden/undocumented.
With all that taken into consideration, my preference is for using --node, -n
combined with value_delimiter=','
and help text specify node(s) to connect to. x.x.x.x:port (comma separated)
. This enables:
publish -n x,y,z
publish -n x -n y -n z
publish --node x,y,z
publish --node x --node y --node z
I will push a commit for your consideration.
What may have happened is that nodes other than localhost have already mined its transactions and therefore removed this transaction from the mempool, thus allowing their reinsertion.
I'm not sure I understand. So is this is a situation where the network gets out of sync because the p2p layer is not finished? I would think that (non-mining) localhost would be informed by the mining nodes of the new blocks and update its state to match.
Also, I just restarted my localhost node with kindelia node start --mine
and I still see the same behavior when publishing.
Have you tried publishing to multiple peers, and if so do you see the same behavior?
So we could just make that the documented way of doing it, and the multi occurrence method is hidden/undocumented.
Looks good to me, but since we're going to use the comma and make this the standard way, I prefer it to be --nodes
and not --node
I'm not sure I understand.
What may have happened is that the two other nodes have mined a block that will compete for the end of the chain and these blocks have not yet been included by localhost, which has not mined one of its own (which would remove the transaction from the mempool) and has not received the block mined from other nodes or received and not included (to include a block it is necessary to have its prev
included, which is done through messages like GiveMeThatBlock
until you have all the necessary prev
).
Have you tried publishing to multiple peers, and if so do you see the same behavior?
I tried running kindelia publish example/example.kdl --host 64.227.110.69:80 --host 188.166.3.140:80
and got the following results:
1 -
Transaction #0: PUBLISHED to http://64.227.110.69/ (tx added to mempool)
Transaction #0: PUBLISHED to http://188.166.3.140/ (tx added to mempool)
2 -
Transaction #0: NOT PUBLISHED to http://64.227.110.69/: Transaction 0x574adb292b12fc4c1c1ec845c8e79f19c7781ca3bf059172431d0c310de1f30c already included on pool
Transaction #0: PUBLISHED to http://188.166.3.140/ (tx added to mempool)
3 -
Transaction #0: PUBLISHED to http://64.227.110.69/ (tx added to mempool)
Transaction #0: NOT PUBLISHED to http://188.166.3.140/: Transaction 0x574adb292b12fc4c1c1ec845c8e79f19c7781ca3bf059172431d0c310de1f30c already included on pool
What I think that could have happened is:
1 -
188.166.3.140 - added transaction to mempool
64.227.110.69 - added transaction to mempool
64.227.110.69 - mined a block
2 -
188.166.3.140 - not added, already included
64.227.110.69 - added transaction to mempool
188.166.3.140 - mined a block or received the `64.227.110.69` block
3 -
188.166.3.140 - added transaction to mempool
64.227.110.69 - not added, already included
To be sure of this we would have to debug these nodes; maybe we could activate the events websocket of these nodes for this purpose.
Looks good to me, but since we're going to use the comma and make this the standard way, I prefer it to be
--nodes
and not--node
done.
To be sure of this we would have to debug these nodes
Ok, I don't know how to go about that, so I think I will leave it in your hands.
Addresses #173.
modifies the publish command to first query "our" node for its peers and then concurrently publish to each peer.
This is just a temporary feature until the p2p layer properly transmits code between nodes.
notes:
Edit by @steinerkelvin
TODO:
--hosts
CLI parameter to override to which hosts to publish