Closed SvenDowideit closed 2 years ago
mmm, ok, not worknet - I'm getting the same problem from https://github.com/solana-labs/example-helloworld/blob/master/README.md
I wonder if the local validator did go pop - there's not much in my logs, but it does end in
[2022-03-01T03:04:05.947965373Z ERROR solana_core::cluster_info_vote_listener] thread Some("solana-cluster_info_bank_send") error CrossbeamRecvTimeout(Timeout)
nope, restarting it didn't do it.
local validators in docker are based on the solana CI images, which might have non-standard config, i assumed they would be OK, but apparently not for real use. currently experimenting with building a custom solana docker image to see if it will work better
blech, even after trying building Solana in Docker from rust
, and opened all the ports for UDP and all that, the upload still will not work. (this is 1.8.* too, and also tried with --no-bpf-jit
)
I wonder what is going on and how to debug. https://github.com/solana-labs/solana/issues/18033 seems related, but you also couldn't get it to work on your x86_64 linux box right?
maybe - but i didn't actually poke it hard.
so this works fine within the container itself (after installing ca-certificates
), must be network related?
root@cbe4ae8bbedd:/# solana -v program deploy mango_logs.so
RPC URL: http://localhost:8899
Default Signer Path: /root/.config/solana/id.json
Commitment: confirmed
Program Id: G1LAWAwaiFDFW6mDTFabQHnyMNNAQw96UT2Zfoud3mz8
will have to double check that these are available, i wonder what the high ports are
root@cbe4ae8bbedd:/# netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:1024 0.0.0.0:* LISTEN 1/solana-test-valid
tcp 0 0 0.0.0.0:8899 0.0.0.0:* LISTEN 1/solana-test-valid
tcp 0 0 0.0.0.0:8900 0.0.0.0:* LISTEN 1/solana-test-valid
tcp 0 0 0.0.0.0:9900 0.0.0.0:* LISTEN 1/solana-test-valid
udp 0 0 0.0.0.0:41827 0.0.0.0:* 1/solana-test-valid
udp 0 0 0.0.0.0:1024 0.0.0.0:* 1/solana-test-valid
udp 0 0 0.0.0.0:1025 0.0.0.0:* 1/solana-test-valid
udp 0 0 0.0.0.0:1026 0.0.0.0:* 1/solana-test-valid
udp 0 0 0.0.0.0:1027 0.0.0.0:* 1/solana-test-valid
udp 0 0 0.0.0.0:1028 0.0.0.0:* 1/solana-test-valid
udp 0 0 0.0.0.0:1029 0.0.0.0:* 1/solana-test-valid
udp 0 0 0.0.0.0:1030 0.0.0.0:* 1/solana-test-valid
udp 0 0 0.0.0.0:1031 0.0.0.0:* 1/solana-test-valid
udp 0 0 0.0.0.0:1032 0.0.0.0:* 1/solana-test-valid
udp 0 0 0.0.0.0:1033 0.0.0.0:* 1/solana-test-valid
udp 0 0 0.0.0.0:50689 0.0.0.0:* 1/solana-test-valid
udp 0 0 0.0.0.0:59939 0.0.0.0:* 1/solana-test-valid
udp 0 0 0.0.0.0:35869 0.0.0.0:* 1/solana-test-valid
udp 0 0 0.0.0.0:36489 0.0.0.0:* 1/solana-test-valid
udp 0 0 0.0.0.0:53648 0.0.0.0:* 1/solana-test-valid
the ports i have open are:
0.0.0.0:8001->8001/tcp, 0.0.0.0:8899-8902->8899-8902/tcp, 0.0.0.0:8000-8009->8000-8009/udp, 0.0.0.0:9900->9900/tcp
this is on D4M
ok this is weird, solana-test-validator
seems to not be respecting the --dynamic-port-range
flag
hm i think it does but i have no idea what's up with the high ports (>10000, can't find any mention in the docs) and those seem like they might be key to getting the deploy to work
bah, on the official images (and consequently, nathanleclaire/solana
), there's this intermediate run script, seems perhaps the args aren't getting percolated through correctly, will re-try with my built image
bash /usr/bin/solana-run.sh solana-test-validator --limit-ledger-size 50000000 --dynamic-port-range 11000-11020
\_ solana-faucet
\_ solana-validator --identity /usr/bin/config/solana-run/validator-identity.json --vote-account /usr/bin/config/solana-run/validator-vote-account.json --ledger /usr/bin/config/ledger --gossip-port 8001 --rpc-port 8899 --rpc-faucet-address 127.0.0.1:9900 --log - --enable-rpc-transaction-history --enable-cpi-and-log-storage --init-complete-file /usr/bin/config/solana-run/init-completed --snapshot-compression none --require-tower --no-wait-for-vote-to-start-leader
mm ok seems like progress cause now it's hung on txn confirmation, not the sending txns like it was before...
can't really quite figure out why these high ports still aren't landing in dynamic-port-range though, perhaps that's causing the hang
trying some additional logging in the client to untangle this, looks like it gets stuck on trying to confirm the transaction, the RPC call to do so seems to never return ...
index 759b4a332..bd21fc809 100644
--- a/client/src/rpc_client.rs
+++ b/client/src/rpc_client.rs
@@ -1616,10 +1616,12 @@ impl RpcClient {
signature: &Signature,
commitment_config: CommitmentConfig,
) -> ClientResult<Option<transaction::Result<()>>> {
- let result: Response<Vec<Option<TransactionStatus>>> = self.send(
- RpcRequest::GetSignatureStatuses,
- json!([[signature.to_string()]]),
- )?;
+ println!("get_signature_status_with_commitment");
+ let params = json!([[signature.to_string()]]);
+ println!("params: {:?} ", params);
+ let result: Response<Vec<Option<TransactionStatus>>> =
+ self.send(RpcRequest::GetSignatureStatuses, params)?;
+ println!("result: {:?}", result);
Ok(result.value[0]
.clone()
.filter(|result| result.satisfies_commitment(commitment_config))
well this is truly bizarre, when I add another log line in the http rpc sender, i actually do get result logged
diff --git a/client/src/http_sender.rs b/client/src/http_sender.rs
index 0a7558c6f..8531fe7e9 100644
--- a/client/src/http_sender.rs
+++ b/client/src/http_sender.rs
@@ -118,6 +118,7 @@ impl RpcSender for HttpSender {
let response = {
let client = self.client.clone();
let request_json = request_json.clone();
+ println!("url: ${:?}", self.url);
tokio::task::block_in_place(move || {
client
.post(&self.url)
seems racey
Client works fine on devnet >_>
mm, now I can't get it to run in the container either...
root@6bd4e825b143:/# strace -e trace=connect -f solana program deploy mango.so
strace: Process 576 attached
strace: Process 577 attached
[pid 577] connect(9, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
[pid 577] connect(9, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
[pid 577] connect(9, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
[pid 577] connect(9, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = -1 EADDRNOTAVAIL (Cannot assign requested address)
[pid 576] connect(9, {sa_family=AF_INET, sin_port=htons(8899), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
strace: Process 578 attached
[pid 575] connect(10, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
[pid 575] connect(10, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = -1 EADDRNOTAVAIL (Cannot assign requested address)
[pid 575] connect(10, {sa_family=AF_INET, sin_port=htons(8900), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
strace: Process 579 attached
strace: Process 580 attached
strace: Process 581 attached
⠁
[pid 581] +++ exited with 0 +++
[pid 580] +++ exited with 0 +++
[pid 579] +++ exited with 0 +++
strace: Process 582 attached
⠤ [0/1] Finalizing transaction 2vtUgzt7Rbo1nD32NGzNuH21zWkCN3xYyGXqjMpp9zdoNwbdD67iY172bz7oomhgd5ge2KGAN1BhsmcXEKSU3M7v
⠒ [0/1] Finalizing transaction 2vtUgzt7Rbo1nD32NGzNuH21zWkCN3xYyGXqjMpp9zdoNwbdD67iY172bz7oomhgd5ge2KGAN1BhsmcXEKSU3M7v
it's like this thing just gives up after a while, truly strange, after fiddling a bit with the test validator image i've been using for debugging, even a simple airdrop transaction won't go through ...
root@5ea50bafe534:/# solana airdrop 100 3FJKbjZ1XCW9FGp7H2bfm367vkdNZtdfF9aUb9oASvuR
Requesting airdrop of 100 SOL
⠈ [0/1] Finalizing transaction 26Pgx69v3LdHArfUVqsSKEQdcv6ZidTRRUpdR1JRejbnscUAYHbpmYkvoC2sZQyBmAU9cE5cJzQtjBhyJEHmXmtX
edit: OK so airdrop
works fine prior to CTRL+Cing out of the hanging program deploy
, something about that seems to screw everything up
looks like getSignatureStatuses
is returning {"value": [null]}
, but I can't say I understand why exactly
mm yea in a totally fresh container the deploy works, makes me think somehow the state gets cooked with a failed deploy
hm, --no-bpf-jit
flag for solana-test-validator
seems to play a role here... deploy within container seems to succeed with it but throws error without
this is all on ARM Macbook
Not quite fixed yet unfortunately. def think --no-bpf-jit
was progress.
With latest config I get these results. Hangs on "Waiting for the next block" / "Checking transaction status"
Seeing these warnings in logs. Think they might be relevant, esp. window service port config ...
[2022-04-06T18:27:13.819681672Z WARN solana_core::validator] encountered error removing accounts path: "test-ledger/accounts": No such file or directory (os error 2)
[2022-04-06T18:27:15.043971839Z WARN solana_sys_tuner] Failed to send tuning request, is `solana-sys-tuner` running? Os { code: 2, kind: NotFound, message: "No such file or directory" }
[2022-04-06T18:27:45.197399340Z WARN solana_core::window_service] Window does not seem to be receiving data. Ensure port configuration is correct...
[2022-04-06T18:28:15.361589881Z WARN solana_core::window_service] Window does not seem to be receiving data. Ensure port configuration is correct...
[2022-04-06T18:28:45.545952923Z WARN solana_core::window_service] Window does not seem to be receiving data. Ensure port configuration is correct...
[2022-04-06T18:29:15.711853881Z WARN solana_core::window_service] Window does not seem to be receiving data. Ensure port configuration is correct...
some notes on this given that it now seems to be fixed (?) on my M1:
Solana ports still confuse me. --dynamic-port-range
fills the ports up to 10008 (at least on this container run), then decides to go for random high ports:
$ docker exec -ti solana-test-validator netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:9900 0.0.0.0:* LISTEN 8/solana-test-valid
tcp 0 0 0.0.0.0:10000 0.0.0.0:* LISTEN 8/solana-test-valid
tcp 0 0 0.0.0.0:8899 0.0.0.0:* LISTEN 8/solana-test-valid
tcp 0 0 0.0.0.0:8900 0.0.0.0:* LISTEN 8/solana-test-valid
udp 0 0 0.0.0.0:37696 0.0.0.0:* 8/solana-test-valid
udp 0 0 0.0.0.0:54155 0.0.0.0:* 8/solana-test-valid
udp 0 0 0.0.0.0:39863 0.0.0.0:* 8/solana-test-valid
udp 0 0 0.0.0.0:49405 0.0.0.0:* 8/solana-test-valid
udp 0 0 0.0.0.0:58445 0.0.0.0:* 8/solana-test-valid
udp 0 0 0.0.0.0:58774 0.0.0.0:* 8/solana-test-valid
udp 0 0 0.0.0.0:10000 0.0.0.0:* 8/solana-test-valid
udp 0 0 0.0.0.0:10001 0.0.0.0:* 8/solana-test-valid
udp 0 0 0.0.0.0:10002 0.0.0.0:* 8/solana-test-valid
udp 0 0 0.0.0.0:10003 0.0.0.0:* 8/solana-test-valid
udp 0 0 0.0.0.0:10004 0.0.0.0:* 8/solana-test-valid
udp 0 0 0.0.0.0:10005 0.0.0.0:* 8/solana-test-valid
udp 0 0 0.0.0.0:10006 0.0.0.0:* 8/solana-test-valid
udp 0 0 0.0.0.0:10007 0.0.0.0:* 8/solana-test-valid
udp 0 0 0.0.0.0:10008 0.0.0.0:* 8/solana-test-valid
Six UDP ports on there that have me scratching my head. Plus, what I remember from before is that ports higher than 10008, like 10009 were filled too.
Still get the warning Window does not seem to be receiving data. Ensure port configuration is correct...
in the logs. Hope that doesn't come back to bite later.
yeah, me too
OK, so using the non-container
solana-test-validator
works, so yay for containers.