robotology / icub-tech-support

Virtual repository that provides support requests for individual robots
GNU General Public License v2.0

iCubGenova09 (iRonCub3) S/N:000 – Xavier board (10.0.0.3) not reachable #1659

Closed gabrielenava closed 7 months ago

gabrielenava commented 11 months ago

Robot Name 🤖

iCubGenova09 (iRonCub3) S/N:000

Request/Failure description

we cannot ping or ssh the Xavier board.

cc @davidegorbani @DanielePucci

Detailed context

While preparing for the iRonCub experiments, we noticed that the COM-Express board (NOT the Xavier) suddenly stopped replying, and it was not possible to ping or ssh it. After a few minutes, and without us doing anything, the COM-Express came back and behaved normally. This strange behavior happened a couple of times, then did not happen again.

While debugging the problem, we noticed that the blue light of the wifi antenna of the Xavier was off. On the board, there was a green light. We tried to ping or ssh the Xavier but we did not get any reply from the board. Unlike the COM-Express, this behavior was permanent: we were not able to ping or ssh the Xavier for the rest of the experiment, even though we switched the robot off and on several times.

I am also reporting the strange COM-Express behavior because I don't know if it is related to the loss of communication with the Xavier. Also, it is the second time in a few days that a board in the iCub3 head has had a failure: #1643
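
To tell the intermittent drop-outs apart from the permanent one, it may help to log reachability of both boards over time. The following is only a minimal sketch, not something used in this ticket: it assumes that the icub-head (10.0.0.2) is the COM-Express, that both boards accept TCP connections on the ssh port 22, and the names `tcp_reachable`, `outage_windows` and `monitor` are hypothetical.

```python
import socket
import time
from typing import Dict, Iterable, List, Tuple

# IP addresses as quoted in this ticket. The labels and the use of
# port 22 (ssh) are assumptions made for this sketch.
HOSTS: Dict[str, str] = {"icub-head": "10.0.0.2", "xavier": "10.0.0.3"}


def tcp_reachable(host: str, port: int = 22, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outage_windows(samples: Iterable[Tuple[float, bool]]) -> List[Tuple[float, float]]:
    """Collapse (timestamp, reachable) samples into (start, end) outage windows.

    An outage starts at the first unreachable sample and ends at the next
    reachable one. An outage still open at the end is closed at the last
    timestamp seen.
    """
    windows: List[Tuple[float, float]] = []
    start = None
    last = None
    for ts, ok in samples:
        if not ok and start is None:
            start = ts
        elif ok and start is not None:
            windows.append((start, ts))
            start = None
        last = ts
    if start is not None:
        windows.append((start, last))
    return windows


def monitor(duration_s: float = 600.0, period_s: float = 2.0) -> None:
    """Probe every host periodically, then print the outage windows found."""
    history: Dict[str, List[Tuple[float, bool]]] = {name: [] for name in HOSTS}
    t_end = time.time() + duration_s
    while time.time() < t_end:
        for name, ip in HOSTS.items():
            history[name].append((time.time(), tcp_reachable(ip)))
        time.sleep(period_s)
    for name, samples in history.items():
        print(name, outage_windows(samples))


if __name__ == "__main__":
    monitor()
```

If the outage windows of the two boards overlap, that would point towards a shared cause (for example power or network), while disjoint windows would suggest independent faults.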

Additional context

No response

How does it affect you?

No response

DanielePucci commented 11 months ago

@gabrielenava concerning

> While preparing for the iRonCub experiments, we noticed that the COM-Express board (NOT the Xavier) suddenly stopped replying, and it was not possible to ping or ssh it. After a few minutes, and without us doing anything, the COM-Express came back and behaved normally. This strange behavior happened a couple of times, then did not happen again.

As @S-Dafarra was mentioning F2F, this behaviour is similar to the problem we had while preparing for the ANA Avatar XPRIZE, see https://github.com/ami-iit/component_ANA-Avatar-XPRIZE/issues/767#issuecomment-1317267421 - @S-Dafarra do not hesitate to add more precise pointers

So I wonder if we are using an image with the bridge still active, or with some old configuration @davidelasagna ?

Concerning

> While debugging the problem, we noticed that the blue light of the wifi antenna of the Xavier was off. On the board, there was a green light. We tried to ping or ssh the Xavier but we did not get any reply from the board. Unlike the COM-Express, this behavior was permanent: we were not able to ping or ssh the Xavier for the rest of the experiment, even though we switched the robot off and on several times.

I do not know. It is also true that, if we do not use the Xavier, we could physically disconnect it from the head.

davidelasagna commented 11 months ago

> So I wonder if we are using an image with the bridge still active, or with some old configuration @davidelasagna?

The bridge is not active: the two boards each have their own wireless interface. Anyway, I can check the configurations on the boards.

gabrielenava commented 11 months ago

there is also this ticket @maggia80 @DanielePucci @Fabrizio69. As far as I am concerned, the Xavier can even be removed from the robot.

The point was whether it is possible to check for problems in the power supply of the head boards, because we lost both the Xavier and the COM-Express within a few days and there might be something wrong with the power.

gabrielenava commented 10 months ago

re-tested today, I confirm we cannot reach the Xavier anymore. This is not blocking us for the moment.

DanielePucci commented 10 months ago

@davidelasagna this is the ticket we were mentioning today

gabrielenava commented 10 months ago

Together with @davidelasagna we tried to connect to the Xavier board, and it was working this time! I don't know what changed with respect to the previous tests. In the meantime we had opened the head to debug issues with the COM-Express. It is possible that a cable was not connected properly and that, while checking all the connections before closing the head, we also fixed the Xavier cables. But I am not sure this is the reason.

I think we can close the issue, if new problems with the Xavier emerge I can open a new one!

AntonioConsilvio commented 10 months ago

> I think we can close the issue, if new problems with the Xavier emerge I can open a new one!

Thanks for the feedback @gabrielenava. We will proceed to close this issue, since the problem seems to be gone.

gabrielenava commented 8 months ago

Hello everyone, I am reopening this issue to report that the problem happened again, but with the help of @davidelasagna we spotted the cause: it is this power cable in the robot head:

[image]

If you move it while the board is on, the board shuts down. The cable is probably damaged inside, so that moving it causes a short circuit.

At the moment we use only the icub-head board (10.0.0.2) for iRonCub experiments. The Xavier is necessary only because we are on an Ethernet connection and the icub-head is connected through it. Since we need to change this cable, can we also bypass the Xavier, so that the Ethernet cable that goes to the external switch is connected directly to the icub-head board?

gabrielenava commented 8 months ago

cc @DanielePucci

sgiraz commented 7 months ago

As discussed F2F, to solve this issue we are ready to remove the NVIDIA Jetson Xavier NX from iRonCub3. We are aware that, after this change, the cameras will no longer be available and that we will connect to the COM-Express board directly.

@gabrielenava do you agree?

cc @DanielePucci @maggia80 @davidelasagna @AntonioConsilvio @Gandoo @Fabrizio69

gabrielenava commented 7 months ago

> @gabrielenava do you agree?

I agree: at the moment we are not using the NX on iRonCub3 and we are not planning to use it. Moreover, it is not the first time we have had issues with this board, and this has become riskier now because we are using the Ethernet connection that passes through the NX to communicate with the COM-Express. If there is an issue with the NX, we might lose control of the robot during flight experiments.

Gandoo commented 7 months ago

Hi @gabrielenava, the requested material has been removed from the robot. I put everything in a box so that you can keep it.

cc @DanielePucci @maggia80 @davidelasagna @AntonioConsilvio @sgiraz @Fabrizio69

sgiraz commented 7 months ago

Hi @gabrielenava,

We are waiting for your feedback, then we'll proceed to close this issue.

gabrielenava commented 7 months ago

> We are waiting for your feedback, then we'll proceed to close this issue.

Hello @sgiraz, I tested the robot today: I did not have any issues and the robot worked fine for the whole day. I think we can close the issue, thank you!