Handling flaky agent connections

arjoonn-s commented 1 month ago

Is your feature request related to a problem? Please describe.

I run dozzle on my laptop and agents on multiple server machines that I have.
I've listed all the machines in DOZZLE_REMOTE_AGENT, separated by commas.
Now, some of these machines are not on wired connections so they go online / offline sometimes.
The issue is that if a machine is offline when I 'start' the laptop dozzle, it will never be shown in the list even if it is online after a while.

Describe the solution you'd like

Some way to see 'offline' or 'unreachable' agents in the UI, and to retry the connection from the UI itself.

Describe alternatives you've considered

Right now I just keep restarting dozzle until it connects to everything.

amir20 commented 1 month ago

Yup this is in my to do list.

arjoonn-s commented 1 month ago

Is there anything I can do to help? I'm new to Go, so if this is a simple thing to do I'd love to have a go at it.

amir20 commented 1 month ago

I'll take any help possible. I am realizing this project is taking a lot of time.

I would first try to pull the project, and follow directions in the README file. A lot has changed so I hope https://github.com/amir20/dozzle?tab=readme-ov-file#building is still valid.

This particular issue is a bit tricky, which is why I haven't gotten around to it yet. Dozzle needs a unique ID for each agent to work properly. It currently uses Docker's ID from docker system info, as I don't have a better solution. The agent will return an error if fetching an ID fails. You can see that here: https://github.com/amir20/dozzle/blob/4077d5ab474120a87a20c458c032bcddd0eb6671/internal/agent/client.go#L57

If the agent is not available, fetching its ID fails and Dozzle can't proceed. So the easiest solution for me was to just drop that agent. This would probably need to change to something else where Dozzle can still proceed.

I would try to learn the code first. Once you get the idea, I think some options could be:

Move failed agents to a temporary list to retry in the background
Assign a random ID and keep retrying when page reloads

I honestly don't know the best option, but if you can help out with thinking about the design and a little code writing, it would help me a lot.
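
For illustration, the first idea (keep failed agents in a temporary list and retry them in the background) could be sketched roughly as below. This is only a minimal sketch and not Dozzle's actual code: `agentPool`, `connectFunc`, `retryPending`, `retryLoop`, and `offline` are hypothetical names, the addresses are made up, and `connectFunc` stands in for whatever call dials the agent and fetches its ID.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
	"sync"
	"time"
)

// errUnreachable is a placeholder error used by the fake connector in main.
var errUnreachable = errors.New("agent unreachable")

// connectFunc is a hypothetical stand-in for dialing an agent and fetching
// its ID. It is an assumption for this sketch, not Dozzle's real API.
type connectFunc func(addr string) (id string, err error)

// agentPool tracks connected agents and the addresses still waiting for a retry.
type agentPool struct {
	mu        sync.Mutex
	connect   connectFunc
	connected map[string]string   // agent ID -> address
	pending   map[string]struct{} // addresses that have not connected yet
}

// newAgentPool marks every address as pending and makes one initial attempt.
// Addresses that fail stay in the pending set instead of being dropped.
func newAgentPool(connect connectFunc, addrs []string) *agentPool {
	p := &agentPool{
		connect:   connect,
		connected: map[string]string{},
		pending:   map[string]struct{}{},
	}
	for _, a := range addrs {
		p.pending[a] = struct{}{}
	}
	p.retryPending()
	return p
}

// retryPending tries each pending address once and returns how many connected.
func (p *agentPool) retryPending() int {
	p.mu.Lock()
	addrs := make([]string, 0, len(p.pending))
	for a := range p.pending {
		addrs = append(addrs, a)
	}
	p.mu.Unlock()

	recovered := 0
	for _, addr := range addrs {
		id, err := p.connect(addr) // network call happens outside the lock
		if err != nil {
			continue // still offline, keep it pending
		}
		p.mu.Lock()
		delete(p.pending, addr)
		p.connected[id] = addr
		p.mu.Unlock()
		recovered++
	}
	return recovered
}

// offline returns the sorted addresses that are still unreachable,
// which a UI could list as 'offline' agents.
func (p *agentPool) offline() []string {
	p.mu.Lock()
	defer p.mu.Unlock()
	addrs := make([]string, 0, len(p.pending))
	for a := range p.pending {
		addrs = append(addrs, a)
	}
	sort.Strings(addrs)
	return addrs
}

// retryLoop retries pending agents on every tick until stop is closed.
func (p *agentPool) retryLoop(interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ticker.C:
			p.retryPending()
		case <-stop:
			return
		}
	}
}

func main() {
	// Fake connector: only addresses marked online succeed.
	online := map[string]bool{"server-a:7007": true}
	connect := func(addr string) (string, error) {
		if !online[addr] {
			return "", errUnreachable
		}
		return "id-" + addr, nil
	}

	pool := newAgentPool(connect, []string{"server-a:7007", "server-b:7007"})
	fmt.Println("offline at start:", pool.offline()) // [server-b:7007]

	online["server-b:7007"] = true // the flaky machine comes back
	pool.retryPending()            // in a real setup retryLoop would do this
	fmt.Println("offline after retry:", pool.offline()) // []
}
```

In a long-running process the retry would presumably run in its own goroutine, for example `go pool.retryLoop(30*time.Second, stop)`, and a "reconnect" button in the UI could simply call `retryPending` directly.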

amir20 commented 1 month ago

Also, if you are a bit experienced, I think we can challenge the idea of a unique ID per agent. I haven't found a better way around it because the ID needs to be unique across a set of hosts. I haven't found a solution where I can find this value without pinging the agent.
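
One possible direction, offered only as a sketch: derive a placeholder ID from the configured agent address, so an offline agent can be listed before it has ever been reached. This is a deterministic variation on the "assign a random ID" idea, not something Dozzle does today. `placeholderID` and the `addr-` prefix are hypothetical, and the approach assumes each entry in the agent list is a distinct address; it would not detect the same Docker host being reachable under two different addresses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// placeholderID derives a stable ID from the configured address alone,
// without contacting the agent. Hypothetical helper for illustration only.
func placeholderID(addr string) string {
	// Normalize so "Server-A:7007 " and "server-a:7007" map to the same ID.
	normalized := strings.ToLower(strings.TrimSpace(addr))
	sum := sha256.Sum256([]byte(normalized))
	// A short hex prefix of the hash is enough for a handful of hosts.
	return "addr-" + hex.EncodeToString(sum[:6])
}

func main() {
	// Example addresses are made up for this sketch.
	for _, addr := range strings.Split("server-a:7007, server-b:7007", ",") {
		fmt.Println(strings.TrimSpace(addr), "->", placeholderID(addr))
	}
}
```

Once the agent does respond, the placeholder could be swapped for (or mapped to) the real Docker ID, though how that interacts with the rest of Dozzle is an open design question.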

amir20 / dozzle

Handling flaky agent connections #3094