workbenchapp / solana-workbench

Your one-stop shop for developing on Solana
MIT License
64 stars 7 forks source link

Can't deploy programs to the localnet validator started by workbench #26

Closed SvenDowideit closed 2 years ago

SvenDowideit commented 2 years ago

yeah, me too

OK, so using the non-container solana-test-validator works, so yay for containers.

SvenDowideit commented 2 years ago

mmm, ok, not worknet - I'm getting the same problem from https://github.com/solana-labs/example-helloworld/blob/master/README.md

SvenDowideit commented 2 years ago

I wonder if the local validator did go pop - there's not much in my logs, but it does end in

[2022-03-01T03:04:05.947965373Z ERROR solana_core::cluster_info_vote_listener] thread Some("solana-cluster_info_bank_send") error CrossbeamRecvTimeout(Timeout)

nope, restarting it didn't do it.

nathanleclaire commented 2 years ago

local validators in docker are based on the solana CI images, which might have non-standard config, i assumed they would be OK, but apparently not for real use. currently experimenting with building a custom solana docker image to see if it will work better

nathanleclaire commented 2 years ago

blech, even after trying building Solana in Docker from rust, and opened all the ports for UDP and all that, the upload still will not work. (this is 1.8.* too, and also tried with --no-bpf-jit)

I wonder what is going on and how to debug. https://github.com/solana-labs/solana/issues/18033 seems related, but you also couldn't get it to work on your x86_64 linux box right?

SvenDowideit commented 2 years ago

maybe - but i didn't actually poke it hard.

nathanleclaire commented 2 years ago

so this works fine within the container itself (after installing ca-certificates), must be network related?

root@cbe4ae8bbedd:/# solana -v program deploy mango_logs.so 
RPC URL: http://localhost:8899
Default Signer Path: /root/.config/solana/id.json
Commitment: confirmed
Program Id: G1LAWAwaiFDFW6mDTFabQHnyMNNAQw96UT2Zfoud3mz8
nathanleclaire commented 2 years ago

will have to double check that these are available, i wonder what the high ports are

root@cbe4ae8bbedd:/# netstat -tulpn 
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:1024            0.0.0.0:*               LISTEN      1/solana-test-valid 
tcp        0      0 0.0.0.0:8899            0.0.0.0:*               LISTEN      1/solana-test-valid 
tcp        0      0 0.0.0.0:8900            0.0.0.0:*               LISTEN      1/solana-test-valid 
tcp        0      0 0.0.0.0:9900            0.0.0.0:*               LISTEN      1/solana-test-valid 
udp        0      0 0.0.0.0:41827           0.0.0.0:*                           1/solana-test-valid 
udp        0      0 0.0.0.0:1024            0.0.0.0:*                           1/solana-test-valid 
udp        0      0 0.0.0.0:1025            0.0.0.0:*                           1/solana-test-valid 
udp        0      0 0.0.0.0:1026            0.0.0.0:*                           1/solana-test-valid 
udp        0      0 0.0.0.0:1027            0.0.0.0:*                           1/solana-test-valid 
udp        0      0 0.0.0.0:1028            0.0.0.0:*                           1/solana-test-valid 
udp        0      0 0.0.0.0:1029            0.0.0.0:*                           1/solana-test-valid 
udp        0      0 0.0.0.0:1030            0.0.0.0:*                           1/solana-test-valid 
udp        0      0 0.0.0.0:1031            0.0.0.0:*                           1/solana-test-valid 
udp        0      0 0.0.0.0:1032            0.0.0.0:*                           1/solana-test-valid 
udp        0      0 0.0.0.0:1033            0.0.0.0:*                           1/solana-test-valid 
udp        0      0 0.0.0.0:50689           0.0.0.0:*                           1/solana-test-valid 
udp        0      0 0.0.0.0:59939           0.0.0.0:*                           1/solana-test-valid 
udp        0      0 0.0.0.0:35869           0.0.0.0:*                           1/solana-test-valid 
udp        0      0 0.0.0.0:36489           0.0.0.0:*                           1/solana-test-valid 
udp        0      0 0.0.0.0:53648           0.0.0.0:*                           1/solana-test-valid 

the ports i have open are:

0.0.0.0:8001->8001/tcp, 0.0.0.0:8899-8902->8899-8902/tcp, 0.0.0.0:8000-8009->8000-8009/udp, 0.0.0.0:9900->9900/tcp

this is on D4M

nathanleclaire commented 2 years ago

ok this is weird, solana-test-validator seems to not be respecting the --dynamic-port-range flag

nathanleclaire commented 2 years ago

hm i think it does but i have no idea what's up with the high ports (>10000, can't find any mention in the docs) and those seem like they might be key to getting the deploy to work

nathanleclaire commented 2 years ago

bah, on the official images (and consequently, nathanleclaire/solana), there's this intermediate run script, seems perhaps the args aren't getting percolated through correctly, will re-try with my built image

bash /usr/bin/solana-run.sh solana-test-validator --limit-ledger-size 50000000 --dynamic-port-range 11000-11020
\_ solana-faucet
\_ solana-validator --identity /usr/bin/config/solana-run/validator-identity.json --vote-account /usr/bin/config/solana-run/validator-vote-account.json --ledger /usr/bin/config/ledger --gossip-port 8001 --rpc-port 8899 --rpc-faucet-address 127.0.0.1:9900 --log - --enable-rpc-transaction-history --enable-cpi-and-log-storage --init-complete-file /usr/bin/config/solana-run/init-completed --snapshot-compression none --require-tower --no-wait-for-vote-to-start-leader
nathanleclaire commented 2 years ago

mm ok seems like progress cause now it's hung on txn confirmation, not the sending txns like it was before...

Screen Shot 2022-03-15 at 8 25 10 AM

can't really quite figure out why these high ports still aren't landing in dynamic-port-range though, perhaps that's causing the hang

image

nathanleclaire commented 2 years ago

trying some additional logging in the client to untangle this, looks like it gets stuck on trying to confirm the transaction, the RPC call to do so seems to never return ...

index 759b4a332..bd21fc809 100644
--- a/client/src/rpc_client.rs
+++ b/client/src/rpc_client.rs
@@ -1616,10 +1616,12 @@ impl RpcClient {
         signature: &Signature,
         commitment_config: CommitmentConfig,
     ) -> ClientResult<Option<transaction::Result<()>>> {
-        let result: Response<Vec<Option<TransactionStatus>>> = self.send(
-            RpcRequest::GetSignatureStatuses,
-            json!([[signature.to_string()]]),
-        )?;
+        println!("get_signature_status_with_commitment");
+        let params = json!([[signature.to_string()]]);
+        println!("params: {:?} ", params);
+        let result: Response<Vec<Option<TransactionStatus>>> =
+            self.send(RpcRequest::GetSignatureStatuses, params)?;
+        println!("result: {:?}", result);
         Ok(result.value[0]
             .clone()
             .filter(|result| result.satisfies_commitment(commitment_config))
Screen Shot 2022-03-16 at 11 24 57 AM
nathanleclaire commented 2 years ago

well this is truly bizarre, when I add another log line in the http rpc sender, i actually do get result logged

diff --git a/client/src/http_sender.rs b/client/src/http_sender.rs
index 0a7558c6f..8531fe7e9 100644
--- a/client/src/http_sender.rs
+++ b/client/src/http_sender.rs
@@ -118,6 +118,7 @@ impl RpcSender for HttpSender {
             let response = {
                 let client = self.client.clone();
                 let request_json = request_json.clone();
+                println!("url: ${:?}", self.url);
                 tokio::task::block_in_place(move || {
                     client
                         .post(&self.url)
Screen Shot 2022-03-16 at 11 34 37 AM

seems racey

nathanleclaire commented 2 years ago

Client works fine on devnet >_>

nathanleclaire commented 2 years ago

mm, now I can't get it to run in the container either...

root@6bd4e825b143:/# strace -e trace=connect -f solana program deploy mango.so 
strace: Process 576 attached
strace: Process 577 attached
[pid   577] connect(9, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
[pid   577] connect(9, {sa_family=AF_UNIX, sun_path="/var/run/nscd/socket"}, 110) = -1 ENOENT (No such file or directory)
[pid   577] connect(9, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
[pid   577] connect(9, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = -1 EADDRNOTAVAIL (Cannot assign requested address)
[pid   576] connect(9, {sa_family=AF_INET, sin_port=htons(8899), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
strace: Process 578 attached
[pid   575] connect(10, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
[pid   575] connect(10, {sa_family=AF_INET6, sin6_port=htons(0), sin6_flowinfo=htonl(0), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_scope_id=0}, 28) = -1 EADDRNOTAVAIL (Cannot assign requested address)
[pid   575] connect(10, {sa_family=AF_INET, sin_port=htons(8900), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
strace: Process 579 attached
strace: Process 580 attached
strace: Process 581 attached
⠁ 
[pid   581] +++ exited with 0 +++
[pid   580] +++ exited with 0 +++
[pid   579] +++ exited with 0 +++
strace: Process 582 attached
⠤ [0/1] Finalizing transaction 2vtUgzt7Rbo1nD32NGzNuH21zWkCN3xYyGXqjMpp9zdoNwbdD67iY172bz7oomhgd5ge2KGAN1BhsmcXEKSU3M7v
⠒ [0/1] Finalizing transaction 2vtUgzt7Rbo1nD32NGzNuH21zWkCN3xYyGXqjMpp9zdoNwbdD67iY172bz7oomhgd5ge2KGAN1BhsmcXEKSU3M7v
nathanleclaire commented 2 years ago

it's like this thing just gives up after a while, truly strange, after fiddling a bit with the test validator image i've been using for debugging, even a simple airdrop transaction won't go through ...

root@5ea50bafe534:/# solana airdrop 100 3FJKbjZ1XCW9FGp7H2bfm367vkdNZtdfF9aUb9oASvuR
Requesting airdrop of 100 SOL
⠈ [0/1] Finalizing transaction 26Pgx69v3LdHArfUVqsSKEQdcv6ZidTRRUpdR1JRejbnscUAYHbpmYkvoC2sZQyBmAU9cE5cJzQtjBhyJEHmXmtX

edit: OK so airdrop works fine prior to CTRL+Cing out of the hanging program deploy, something about that seems to screw everything up

nathanleclaire commented 2 years ago

looks like getSignatureStatuses is returning {"value": [null]}, but I can't say I understand why exactly

Screen Shot 2022-03-16 at 1 56 31 PM
nathanleclaire commented 2 years ago

mm yea in a totally fresh container the deploy works, makes me think somehow the state gets cooked with a failed deploy

nathanleclaire commented 2 years ago

hm, --no-bpf-jit flag for solana-test-validator seems to play a role here... deploy within container seems to succeed with it but throws error without

nathanleclaire commented 2 years ago

this is all on ARM Macbook

nathanleclaire commented 2 years ago

Not quite fixed yet unfortunately. def think --no-bpf-jit was progress.

With latest config I get these results. Hangs on "Waiting for the next block" / "Checking transaction status"

Screen Shot 2022-04-06 at 11 33 57 AM

Seeing these warnings in logs. Think they might be relevant, esp. window service port config ...

[2022-04-06T18:27:13.819681672Z WARN  solana_core::validator] encountered error removing accounts path: "test-ledger/accounts": No such file or directory (os error 2)
[2022-04-06T18:27:15.043971839Z WARN  solana_sys_tuner] Failed to send tuning request, is `solana-sys-tuner` running? Os { code: 2, kind: NotFound, message: "No such file or directory" }
[2022-04-06T18:27:45.197399340Z WARN  solana_core::window_service] Window does not seem to be receiving data. Ensure port configuration is correct...
[2022-04-06T18:28:15.361589881Z WARN  solana_core::window_service] Window does not seem to be receiving data. Ensure port configuration is correct...
[2022-04-06T18:28:45.545952923Z WARN  solana_core::window_service] Window does not seem to be receiving data. Ensure port configuration is correct...
[2022-04-06T18:29:15.711853881Z WARN  solana_core::window_service] Window does not seem to be receiving data. Ensure port configuration is correct...
nathanleclaire commented 2 years ago

some notes on this given that it now seems to be fixed (?) on my M1:

Solana ports still confuse me. --dynamic-port-range fills the ports up to 10008 (at least on this container run), then decides to go for random high ports:

$ docker exec -ti solana-test-validator netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name    
tcp        0      0 0.0.0.0:9900            0.0.0.0:*               LISTEN      8/solana-test-valid 
tcp        0      0 0.0.0.0:10000           0.0.0.0:*               LISTEN      8/solana-test-valid 
tcp        0      0 0.0.0.0:8899            0.0.0.0:*               LISTEN      8/solana-test-valid 
tcp        0      0 0.0.0.0:8900            0.0.0.0:*               LISTEN      8/solana-test-valid 
udp        0      0 0.0.0.0:37696           0.0.0.0:*                           8/solana-test-valid 
udp        0      0 0.0.0.0:54155           0.0.0.0:*                           8/solana-test-valid 
udp        0      0 0.0.0.0:39863           0.0.0.0:*                           8/solana-test-valid 
udp        0      0 0.0.0.0:49405           0.0.0.0:*                           8/solana-test-valid 
udp        0      0 0.0.0.0:58445           0.0.0.0:*                           8/solana-test-valid 
udp        0      0 0.0.0.0:58774           0.0.0.0:*                           8/solana-test-valid 
udp        0      0 0.0.0.0:10000           0.0.0.0:*                           8/solana-test-valid 
udp        0      0 0.0.0.0:10001           0.0.0.0:*                           8/solana-test-valid 
udp        0      0 0.0.0.0:10002           0.0.0.0:*                           8/solana-test-valid 
udp        0      0 0.0.0.0:10003           0.0.0.0:*                           8/solana-test-valid 
udp        0      0 0.0.0.0:10004           0.0.0.0:*                           8/solana-test-valid 
udp        0      0 0.0.0.0:10005           0.0.0.0:*                           8/solana-test-valid 
udp        0      0 0.0.0.0:10006           0.0.0.0:*                           8/solana-test-valid 
udp        0      0 0.0.0.0:10007           0.0.0.0:*                           8/solana-test-valid 
udp        0      0 0.0.0.0:10008           0.0.0.0:*                           8/solana-test-valid

Six UDP ports on there that have me scratching my head. Plus, what I remember from before is that ports higher than 10008, like 10009 were filled too.

Still get the warning Window does not seem to be receiving data. Ensure port configuration is correct... in the logs. Hope that doesn't come back to bite later.