helium / erlang-libp2p

An Erlang implementation of libp2p swarms
https://helium.github.io/erlang-libp2p
Apache License 2.0

Miner can't connect to peers. not_found errors in logs. Results in hotspot not witnessing anything. #417

Open thardie opened 2 years ago

thardie commented 2 years ago

Log lines showing failures to connect to the challenger and to proxies:

2021-12-19 20:38:40.744 1 [error] <0.1560.0>@libp2p_transport_proxy:connect_to:69 failed to dial proxy server "/p2p/1123LJMnvys68Fab5GkA32svQ9u4LdVC396nXUgeesDbbR6o9j8f" not_found

and:

2021-12-19 20:37:54.488 1 [warning] <0.889.1>@miner_onion_server:send_witness:243 failed to dial challenger "/p2p/112JU7o2BDmPYw9v7amWEtk7btSM52gHuEAyj3EcXPYqFB8xsgYj": not_found

This appears to be a bad cache entry in the local peer book, and can be resolved by calling:

docker exec miner miner peer refresh /p2p/<peerid>

I wrote a Python script that monitors the logs and issues this command as soon as the error appears; the miner then retries the connection and is able to witness again (https://github.com/HeliumDIY/helium_workarounds/blob/main/src/fix_not_found_peer.py).
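For readers who want to see the idea without opening the linked script, here is a minimal sketch of the approach described above. It is not the linked script: the function names (`extract_not_found_peer`, `refresh_command`, `handle_line`) are hypothetical, and the matching rule is only inferred from the two sample log lines in this report.

```python
import re
import subprocess

# A /p2p/<peerid> address as it appears in the sample log lines.
P2P_RE = re.compile(r"/p2p/[A-Za-z0-9]+")


def extract_not_found_peer(line):
    """Return the /p2p/ address from a not_found log line, else None."""
    if "not_found" not in line:
        return None
    match = P2P_RE.search(line)
    return match.group(0) if match else None


def refresh_command(peer, container="miner"):
    """Build the refresh command quoted above for one peer address."""
    return ["docker", "exec", container, "miner", "peer", "refresh", peer]


def handle_line(line, run=subprocess.run):
    """Issue a peer refresh if the line reports a not_found dial failure.

    `run` is injectable so the logic can be exercised without Docker.
    Returns the command that was issued, or None if the line was ignored.
    """
    peer = extract_not_found_peer(line)
    if peer is None:
        return None
    cmd = refresh_command(peer)
    run(cmd, check=False)
    return cmd
```

In practice each new line of the miner's console log would be passed to `handle_line`; how the log is followed (for example by tailing the file) is left out of the sketch.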

yooitsgreg commented 2 years ago

Hi @thardie,

I believe my original Helium hotspot has the same libp2p issue. It has been days since it last sent a beacon, and All Activity in the explorer shows "Created Challenge" but no "Challenged Beaconer" activity (I assume this is where the p2p network issue drops or loses the challenge packet). https://explorer.helium.com/hotspots/112eWyTrhGz6qYM7YSRGvmC1aeW7QUdfVeXALDiGMwCX9sn8c1gp/activity

How are you able to view your hotspot logs? Is there any information or resource you can share to help me set up something similar or implement your workaround?

Vagabond commented 2 years ago

So the code already does this for you:

https://github.com/helium/erlang-libp2p/blob/d587854c8a1f093576bbd05ab4616bcc2693e600/src/libp2p_transport_p2p.erl#L81

and I've verified this works, are you sure your script actually helps?

laztopak commented 2 years ago

How can I apply this fix?

I have a Pisces P100.

serbyxp commented 2 years ago

I believe I am having the same issue...

+---------+-------+
|  name   |result |
+---------+-------+
|connected|  yes  |
|dialable |  yes  |
|nat_type | none  |
| height  |1163251|
+---------+-------+


Miner log errors


Transaction related errors

: 2022-01-02 15:37:35.249 9 [error] <0.5546.0>@blockchain_txndialer:dial:{142,21} libp2p_framed_stream dial failed. Reason: not_found, To: "/p2p/11VJFvyQrHWM3BdeYsPeH4HGQW6PUY8Ztjwiq78gKp5sWepffpK", TxnHash: <<242,180,4,149,240,138,217,109,124,197,86,236,137,117,134,228,222,11,54,63,95,69,6,14,220,136,177,177,115,52,80,78>>
2022-01-02 15:37:35.248 9 [error] <0.5545.0>@blockchain_txndialer:dial:{142,21} libp2p_framed_stream dial failed. Reason: not_found, To: "/p2p/112vPP7YKpps2HRuA6o8AHCFrKuT4hJWCVrdFezNdM59GYsjtTPR", TxnHash: <<242,180,4,149,240,138,217,109,124,197,86,236,137,117,134,228,222,11,54,63,95,69,6,14,220,136,177,177,115,52,80,78>>
2022-01-02 15:37:35.245 9 [error] <0.5543.0>@blockchain_txndialer:dial:{142,21} libp2p_framed_stream dial failed. Reason: not_found, To: "/p2p/13oVLUq1JDzhNp9XSqXEUkELMEZaDpjbwbFeCDKL87tZzfBGVum", TxnHash: <<242,180,4,149,240,138,217,109,124,197,86,236,137,117,134,228,222,11,54,63,95,69,6,14,220,136,177,177,115,52,80,78>>
2022-01-02 15:36:57.503 9 [error] <0.5493.0>@blockchain_txndialer:dial:{142,21} libp2p_framed_stream dial failed. Reason: not_found, To: "/p2p/14fwCPzzv2pXJ9ex8gjm1vAbheRM5oSDPMz5U6MmGnWMvAjwR8g", TxnHash: <<242,180,4,149,240,138,217,109,124,197,86,236,137,117,134,228,222,11,54,63,95,69,6,14,220,136,177,177,115,52,80,78>>
2022-01-02 15:36:57.499 9 [error] <0.5492.0>@blockchain_txndialer:dial:{142,21} libp2p_framed_stream dial failed. Reason: not_found, To: "/p2p/112AqMigxhVQonXxm7yPz3fYv4KnU48YvUVcaoQQAv4U1qLsb11B", TxnHash: <<242,180,4,149,240,138,217,109,124,197,86,236,137,117,134,228,222,11,54,63,95,69,6,14,220,136,177,177,115,52,80,78>>

POC related errors

: 2022-01-02 15:51:21.326 9 [error] <0.1641.0>@miner_poc_statem:send_onion:{1003,13} failed to dial 1st hotspot ("/p2p/11q1XjwvU2zNYiPoAiYvrai3VWwibBtUYXTLoPsNTjnqTunzcWA"): not_found

General errors



********************
Miner P2P Info
********************

+---------+-------+
|  name   |result |
+---------+-------+
|connected|  yes  |
|dialable |  yes  |
|nat_type | none  |
| height  |1163255|
+---------+-------+

********************
Miner log errors
********************

***** Transaction related errors *****

: 
***** POC related errors         *****

: 2022-01-02 15:51:21.326 9 [error] <0.1641.0>@miner_poc_statem:send_onion:{1003,13} failed to dial 1st hotspot ("/p2p/11q1XjwvU2zNYiPoAiYvrai3VWwibBtUYXTLoPsNTjnqTunzcWA"): not_found

***** General errors             *****

: 2022-01-02 15:55:10.411 9 [error] <0.6894.0>@lists:unzip:406 gen_server <0.6894.0> terminated with reason: no function clause matching lists:unzip([<<10,138,164,83,10,32,156,50,230,167,27,27,239,6,36,111,207,225,7,3,67,117,232,193,64,102,180,...>>,...], [], []) line 406

madninja commented 2 years ago

Could you please (1) supply the hotspot address or name for this one and (2) see if you can get more details from the logs for the "General errors" you pasted there?

serbyxp commented 2 years ago

The miner name is Noisy Pine Seal. I will try to figure that out; I am new to this whole "dev forums" thing, so pardon my formatting. I believe we may have chatted in the Helium hotspot Discord a day or two ago, when I had the concentrator issue with the original Nebra concentrator. I pulled the latest arm64 miner image from the Quay repo (12/29/2021). The concentrator is running Lora-net sx1302_hal; I replaced the US global_conf with the US915 global_conf from the Helium packet forwarder GitHub repo. The concentrator is NOT running in Docker; the miner IS. Ports 1680 and 44158 are bridged to localhost, where the concentrator is running.

  1. /opt/miner/log, file name erlang.log.1

    option lmbcs: 5242880

    
    =====
    ===== LOGGING STARTED Sat Jan  1 09:34:22 UTC 2022
    =====
    Exec: /opt/miner/erts-12.2/bin/erlexec -boot /opt/miner/releases/2021.12.29.2/start -mode embedded -boot_var SYSTEM_LIB_DIR /opt/miner/lib -config /opt/miner/releases/2021.12.29.2/
    Root: /opt/miner
    /opt/miner
    Protocol 'inet_tcp': the name miner@127.0.0.1 seems to be in use by another Erlang node
  2. file name run_erl.log

    argv[0] = sh
    run_erl [655] Sat Jan 1 09:34:22 2022 Args before exec of shell:
    run_erl [655] Sat Jan 1 09:34:22 2022 argv[0] = sh
    run_erl [655] Sat Jan 1 09:34:22 2022 argv[1] = -c
    run_erl [655] Sat Jan 1 09:34:22 2022 argv[2] = exec "/opt/miner/bin/miner" "console" '' --relx-disable-hooks

  3. /config/, file name sys.config (this is the overlay; I deleted the ECC key pair as per the documentation). I am not sure whether the formatting is correct, or whether leaving the blockchain section as brackets with nothing in them will cause syntax issues:

    %% -*- erlang -*-
    [
      "config/sys.config",
      {lager, [
        {log_root, "/var/data/log"}
      ]},
      {blockchain, [
      ]},
      {miner, [
        {jsonrpc_ip, {0,0,0,0}}, %% bind jsonrpc to host when in docker container
        {use_ebus, false},
        {radio_device, { {0,0,0,0}, 1680, {0,0,0,0}, 31341} }
      ]}

- I don't want to keep adding things that are pointless; if I need to send better documentation, just let me know.
- I will look through the documentation again to see if I missed a step and to familiarise myself more with the logs and debug process. Below are the latest miner info and miner peer command outputs as of this post.

miner peer gossip_peers

["/p2p/13Qx7G1dyvqp9ZoSFgSNmQCooTKzwt9hQVN4TST24fGW7PP4c6q", "/p2p/11t8TE9cpnyjXB74YapgFU5Tte7vm4e7JwnL1hBL9HJb1J3qQti", "/ip4/18.195.81.30/tcp/2154","/ip4/3.38.19.15/tcp/2154", "/p2p/11uk8zp32WEZgPTebfc3w8WS5ggKA81aEcv8jAUMyLg7hd21kBS", "/p2p/1124DfhpHHcTQJbr9idVa17iQ9o99QbmTp7kUpHsDS2DYF5p6WXK", "/p2p/11FkiKc6sW5WW7cztpmDfQJCNA4iQG6kkzhUu9ES4Dduzqw3SNQ", "/p2p/112KGPHyZ4ajAC8Wi5o686qAnKx6F6LHzLuYxFQbzTTACEE9wFLy", "/p2p/112wRyypHxsZC9UkpkbANpPNLuVP7W9S9c94EJCzYGrrFAQkTWRr"]

********************
Miner General Info
********************

+----------------+-----------------------------------------------------------------------------------------------------------+
|      name      |                                                  result                                                   |
+----------------+-----------------------------------------------------------------------------------------------------------+
|   miner name   |                                              noisy-pine-seal                                              |
| mac addresses  |                                            eth0 : 0242AC110003                                            |
|                |                                             lo : 000000000000                                             |
|   block age    |                                                    88                                                     |
|     epoch      |                                                   30589                                                   |
|     height     |                                                  1163331                                                  |
|  sync height   |                                                  1163331                                                  |
|     uptime     |                                  0 Days, 3 Hours, 2 Minutes, 37 Seconds                                   |
| peer book size |                                                   37766                                                   |
|firmware version|                       cat: can't open '/etc/lsb_release': No such file or directory                       |
|gateway details |Gateway location: 631711281102285823Owner address: /p2p/14L2tasTRidrhy9Z4SjeFGANJrVeihXgCXR39kfR2MwxSssY6C9|
+----------------+-----------------------------------------------------------------------------------------------------------+

********************
Miner P2P Info
********************

+---------+-------+
|  name   |result |
+---------+-------+
|connected|  yes  |
|dialable |  yes  |
|nat_type | none  |
| height  |1163331|
+---------+-------+

********************
Miner log errors
********************

***** Transaction related errors *****

: 
***** POC related errors         *****

: 
***** General errors             *****

: 2022-01-02 17:15:03.952 9 [error] <0.11416.0>@blockchain_snapshot_handler_pb:encode_msg_blockchain_snapshot_resp_pb:122 gen_server <0.11416.0> terminated with reason: bad argument in call to erlang:iolist_size({file,<<"/var/data/saved-snaps/snap-72b1b162d99abc12d4e5244d39312383e33c9d732f221d3bf13f1898ddc6c5...">>}) in blockchain_snapshot_handler_pb:encode_msg_blockchain_snapshot_resp_pb/3 line 122
Thanks, much appreciated.

serbyxp commented 2 years ago

Well, I have attached the logs from /var/data/log in a file below: error.log, error.log.1 and error.log.0, taken from the console with cat. For Noisy Pine Seal, I finally got witnessed by another hotspot somewhat near me, but there are plenty of other hotspots that should have seen me. As I mentioned, I have the antenna in the same spot as before my concentrator issue. @madninja

thardie commented 2 years ago

You have the not_found bug. You should be able to use the python script linked above to work around your issue. It just needs to be able to read the console log and be able to find a docker container named "miner".

thardie commented 2 years ago

It depends on your specific hotspot implementation. You need shell access to it, commonly through SSH. If your miner doesn't have SSH access, you have to modify its image to allow SSH (this is what I did on my Helium OG hotspot, since it's just a Raspberry Pi under the hood).

thardie commented 2 years ago

It's very hard for me to be sure it works. Before I did anything, I'd gone down to zero witnesses. I added this and got up higher than I had before. I've given the script to a bunch of other people who also report that it helps (it certainly doesn't solve all not_found cases).

So, my answer is: Anecdotally, it helps. Not sure why, but I also can't grok Erlang. I do fine with C, C++, Python, Javascript, Golang, etc. Erlang is just incomprehensible to me for some reason.

serbyxp commented 2 years ago

@thardie could it be a versioning issue between Erlang and the hotspot OS? I originally tried to compile gateway-rs for the light hotspot (I thought it would work with my ECC key plugged into it; I was told otherwise). I compiled it successfully on a Raspberry Pi 1 B+ (32-bit) on legacy Raspberry Pi OS Buster, and also on a Raspberry Pi 4 (64-bit) on Raspberry Pi OS Bullseye, but on Bullseye I ran into an issue first. I would have to backtrack through my process, but the Erlang repo that the various walkthroughs add to the APT sources is for 64-bit Buster, and it would fail on Bullseye. I found another walkthrough written specifically for Bullseye, then did the whole Cargo thing, and it compiled. I kind of left it there. Then I found out I couldn't use gateway-rs for PoC yet, so I just followed the Helium Docker integration doc and used the prebuilt container image for the regular miner. What does this mean? No clue...

thardie commented 2 years ago

  • Could it be that the ARM64 image on Team Helium's Quay is built against the repos or dependencies of ARM64 Buster or BalenaOS, causing the "bug" on miners that are not running the same host OS as the Helium image?

This doesn't explain hotspots that are working fine and then spontaneously stop working. Also, even with my workaround, refreshing a peer doesn't always work: I can prove the peer exists, yet a refresh still doesn't find it.

I also got gateway-rs to work, but it doesn't support witnessing yet, so not much point in running it.

serbyxp commented 2 years ago

@thardie the only reason I mentioned it is that all the different manufacturers set up their OS differently. I'm sure they all run some sort of Docker container system. I have been trying to get my aftermarket sx1302 concentrator working with my original Nebra indoor miner. I didn't want to delete the OS, so I started by using my spare Raspberry Pis to get the container working. I don't know much about Docker or coding, so I just mentioned it in case it is useful to someone who does. What I noticed is that some manufacturers have several containers doing different functions, like D-Bus, gateway, networking, packet forwarder, etc. Nebra has a lot of its modules broken out into Python scripts, but others may not. Could changing just the one Helium miner container (12/29/2021) and leaving the rest as is cause some sort of conflict with Erlang, if the Erlang repo is being linked or sourced from another container?

My miner isn't using any Nebra software, just the "guts", with a separate Raspberry Pi jumper-wired into the main board. Everything works except for the "bug", so I'm not saying the Nebra software has this bug, but the official Helium miner image pulled from Quay does. That official Helium container is the only one I have running. The concentrator is running as per the Lora-net sx1302_hal documentation, which is just a normal ./lora_pktfwd -c global..... The concentrator is talking to the container fine; there is just this one bug. Now, is that bug the reason my beacons aren't being seen by anyone when I'm next to hundreds of hotspots? I doubt it. I was witnessed one time. Unless the hundreds of miners next to me all have the same "bug" and can't witness, in which case it doesn't matter whether my beacon gets sent or not, because they won't be able to report it to their peers. I'm trying to rebuild this on another system, my laptop, with the AMD version. I will update if the logs show the same results.

serbyxp commented 2 years ago

OK, so I still haven't added my ECC hardware to the AMD miner container on my laptop; I'm letting it catch up to the chain, but it's not throwing any errors. I reinstalled arm64-latest, keeping the same directory as before (the same process the Helium documentation describes for updating). I added the --mount bind option with bind-propagation=shared to see if that would change anything, but I'm still getting the same line 122 bad argument errors, versus 0 errors on the AMD version. Whoever maintains the Erlang repo for AMD might want to try compiling the ARM64 version, or compare the two. I will do it eventually, but it's all Chinese to me (no offense).

thardie commented 2 years ago

The whole idea with containers is that they are isolated from each other and from the host OS (mostly; specifically, they don't use the host OS's or other containers' libraries), so there isn't a way one container could mess up another container's Erlang environment. The big thing you'll need if you run something like miner or gateway-rs inside a container is to map /dev/i2c-1 through from the host OS (for Pisces miners, you need to map /dev/i2c-0 on the host OS to /dev/i2c-1 inside the container).

If you're running something in a container that needs to talk to the LoRaWAN module, you need to map through /dev/spidev0.0 and /dev/spidev0.1 and allow the container to do RAWIO.

Check here for details on how each of these containers is set up:

https://github.com/HeliumDIY/helium_ansible/blob/f4d6246dae44f95cdca365e20e696c97c0a7f0f0/roles/miner/templates/docker-compose.yml
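As a rough illustration of the device mappings described above, the sketch below builds the corresponding `docker run` flags. It is not taken from the linked docker-compose.yml: the helper name `device_args` and its parameters are hypothetical, and reading "allow the container to do RAWIO" as the `SYS_RAWIO` capability is an assumption.

```python
def device_args(pisces=False, lora=False):
    """Return `docker run` flags for the device mappings described above.

    pisces: the host exposes the I2C bus as /dev/i2c-0 instead of /dev/i2c-1.
    lora:   the container must talk to the LoRaWAN module over SPI.
    """
    # The container always sees the bus as /dev/i2c-1; only the host side differs.
    host_i2c = "/dev/i2c-0" if pisces else "/dev/i2c-1"
    args = ["--device", f"{host_i2c}:/dev/i2c-1"]
    if lora:
        for spi in ("/dev/spidev0.0", "/dev/spidev0.1"):
            args += ["--device", f"{spi}:{spi}"]
        # Assumption: "RAWIO" refers to the SYS_RAWIO Linux capability.
        args += ["--cap-add", "SYS_RAWIO"]
    return args
```

For a Pisces miner without a LoRa-facing container, `device_args(pisces=True)` yields only the `/dev/i2c-0:/dev/i2c-1` mapping; the same mappings can be expressed under `devices:` and `cap_add:` in a compose file.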

serbyxp commented 2 years ago

I'll check that out today. I understand the containers are "self-contained", but if one self-contained thing talks to another and expects a specific response, that could cause a conflict. I noticed there was a new validator update sometime in the past 24 hours. As for the "peer bug", maybe the issue isn't on the hotspot end; it could be on the validators' end. My concentrator has 3 lights that turn on when a beacon happens, and they do in fact turn on, corresponding with the message in the lora_pkt_fwd terminal. But I got witnessed only 1 time out of maybe 5 beacons within the past 36 hours. So the antenna is working and the concentrator is working. The issue is that if all my witnesses can't report their witnessing to the network because of the bug, it's going to keep the network bogged down. I reasserted the antenna gain and the elevation (to my first floor, not my roof) just to "refresh" the system, but still nothing. I have reasserted about 2 times since I got involved with trying to debug this.

What I'm not sure of is whether the reassert happens over LoRa (the radio) or over the internet/p2p network (the latter, I think), but we still get charged DC for it. If it is in fact over the radio, then I have successfully sent the update twice.

*** The AMD version of the image finished syncing. The only error I am getting on it is a failure to dial a proxy server, not the snapshot issue. But I don't have that one port-forwarded or with the ECC key plugged into it, so I assume that is the cause. I am going to figure out the ECC key on my laptop: I'll throw some MicroPython at it and see if I can get the key to work via a "USB debug" for the ATECC608A. I will update when/if I get it to work.

serbyxp commented 2 years ago

As for the "peer bug", I have not been seeing any peer issues recently. My latest info summary shows snapshot issues on ARM64; I'm not sure whether this is related.

General errors

: 2022-01-04 18:17:11.668 7 [error] <0.17967.1>@blockchain_snapshot_handler_pb:encode_msg_blockchain_snapshot_resp_pb:122 gen_server <0.17967.1> terminated with reason: bad argument in call to erlang:iolist_size({file,<<"/var/data/saved-snaps/snap-72b1b162d99abc12d4e5244d39312383e33c9d732f221d3bf13f1898ddc6c5...">>}) in blockchain_snapshot_handler_pb:encode_msg_blockchain_snapshot_resp_pb/3 line 122
2022-01-04 18:14:17.594 7 [error] <0.17654.1>@blockchain_snapshot_handler_pb:encode_msg_blockchain_snapshot_resp_pb:122 gen_server <0.17654.1> terminated with reason: bad argument in call to erlang:iolist_size({file,<<"/var/data/saved-snaps/snap-72b1b162d99abc12d4e5244d39312383e33c9d732f221d3bf13f1898ddc6c5...">>}) in blockchain_snapshot_handler_pb:encode_msg_blockchain_snapshot_resp_pb/3 line 122
2022-01-04 18:10:19.679 7 [error] <0.17371.1>@blockchain_snapshot_handler_pb:encode_msg_blockchain_snapshot_resp_pb:122 gen_server <0.17371.1> terminated with reason: bad argument in call to erlang:iolist_size({file,<<"/var/data/saved-snaps/snap-72b1b162d99abc12d4e5244d39312383e33c9d732f221d3bf13f1898ddc6c5...">>}) in blockchain_snapshot_handler_pb:encode_msg_blockchain_snapshot_resp_pb/3 line 122
2022-01-04 18:05:39.667 7 [error] <0.16981.1>@blockchain_snapshot_handler_pb:encode_msg_blockchain_snapshot_resp_pb:122 gen_server <0.16981.1> terminated with reason: bad argument in call to erlang:iolist_size({file,<<"/var/data/saved-snaps/snap-72b1b162d99abc12d4e5244d39312383e33c9d732f221d3bf13f1898ddc6c5...">>}) in blockchain_snapshot_handler_pb:encode_msg_blockchain_snapshot_resp_pb/3 line 122

kbh022 commented 2 years ago

Hi @thardie. I am having exactly the same errors on my hotspot (https://explorer.helium.com/hotspots/112Y2WVurWxWZsegNMCKGe5Ty7TmD4U57Xb3oRwU3zxbTiyJDLRi). I think your script may be very useful to me. Unfortunately, I am completely new to Python and have been unable to run it.

2022-04-09 20:06:52.858 7 [error] <0.1832.0>@miner_poc_statem:send_onion:{1022,13} failed to dial 1st hotspot ("/p2p/112vK3Kfc3t8WriUg14vbodjNaozphF9gL65YXUGMYSofpD59h2h"): not_found

2022-04-09 20:07:02.862 7 [error] <0.1832.0>@miner_poc_statem:send_onion:{1022,13} failed to dial 1st hotspot ("/p2p/112vK3Kfc3t8WriUg14vbodjNaozphF9gL65YXUGMYSofpD59h2h"): not_found

2022-04-09 20:07:12.870 7 [error] <0.1832.0>@miner_poc_statem:send_onion:{1022,13} failed to dial 1st hotspot ("/p2p/112vK3Kfc3t8WriUg14vbodjNaozphF9gL65YXUGMYSofpD59h2h"): not_found

2022-04-09 20:07:22.874 7 [error] <0.1832.0>@miner_poc_statem:handle_challenging:{579,29} failed to dial 1st hotspot ("/p2p/112vK3Kfc3t8WriUg14vbodjNaozphF9gL65YXUGMYSofpD59h2h"): retries_exceeded

Any further guidance from you regarding this would be highly useful to me.

serbyxp commented 2 years ago

Heh, wrong place for this, but go to the issues page of the repo where you downloaded the script; I made a small write-up there. I'm not a coder, but I changed that script to work better and wrote up the changes, since there is no documentation. If it's not in that repo, it may be in the main repo. The script is meant to run in a Docker container: build and run its Dockerfile or docker compose file and it should just start working. If you want to make my changes, read what I wrote (I'm not very good at explaining coding things). You will need to learn some basic Docker commands, which are covered on the Docker website. Typically, running "docker compose up" in the directory where you downloaded it (where the docker compose file is located) adds it to your Docker instance and should reload it. Make the changes I suggested if you like; it works much better. You can edit the file before running docker compose up, or afterwards with something like "docker exec -it <container name> vi <script name>", replacing the placeholders with the proper names.

kbh022 commented 2 years ago

Many thanks @serbyxp! Will try this...

serbyxp commented 2 years ago

The refresh approach works, but not as well as my changes: instead of doing a peer refresh, I changed the script to do a peer connect. I mentioned this to the Helium devs (not the script itself, but what I figured out by using it): the way the miners try to send messages to each other is an issue, and almost everyone has it. I suggested that instead of just sending the receipt to the /p2p/ address, a connection should be established first and then the message sent. But in about 6 days light hotspots go live, so libp2p is done and all receipts etc. are handled by validators. You might just want to wait the 6 days and avoid the headache of learning all that if you don't understand it. gRPC should work better. Keyword: should.
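To make the "connect instead of refresh" idea concrete, here is a hedged sketch. It is not the modified script mentioned above. The names `peer_command` and `connect_then_refresh` are hypothetical, the `miner peer connect <addr>` form is assumed to mirror the refresh command quoted earlier in this thread, and falling back to refresh when connect fails is an illustrative choice rather than something described here. It also assumes the CLI signals failure with a non-zero exit status.

```python
import subprocess


def peer_command(action, peer, container="miner"):
    """Build `docker exec <container> miner peer <action> <peer>`."""
    if action not in ("connect", "refresh"):
        raise ValueError(f"unsupported peer action: {action}")
    return ["docker", "exec", container, "miner", "peer", action, peer]


def connect_then_refresh(peer, run=subprocess.run):
    """Try a peer connect first and fall back to a peer refresh.

    Returns the action that exited with status 0, or None if both failed.
    `run` is injectable so the logic can be exercised without Docker.
    """
    for action in ("connect", "refresh"):
        result = run(peer_command(action, peer), check=False)
        if result.returncode == 0:
            return action
    return None
```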

kbh022 commented 2 years ago

Hi,

Just started to get into the Docker thing, which is completely new to me! I was reading its user manual while it was installing when I got your second email reminding me about the launch of light hotspots 😃. I realised that, having already waited for over a month, I should wait 6 more days, especially if the problem will most likely be solved without getting into this completely new and advanced Docker thing.

Anyway, many thanks for your response and willingness to help!
