timja / jenkins-gh-issues-poc-06-18

[JENKINS-68785] No log messages for inbound agents after the first message (regression in 2.310) #6072

Open timja opened 2 years ago

timja commented 2 years ago

Jenkins 2.343 and later (including 2.346.1) reject connections from unsupported remoting versions. Connections from an agent using a remoting version older than 3.14 are rejected.

Users who want to connect those older agents must set the "escape hatch" system property {{hudson.slaves.SlaveComputer.allowUnsupportedRemotingVersions=true}}.
When the connection is rejected, the implementation provides a message, but when running Jenkins 2.346.1-rc2, the message does not appear.
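
For reference, a Java system property such as the escape hatch is normally passed to the controller JVM with -D ahead of -jar. The sketch below only prints such a command line rather than running it. The helper name controller_cmd and the jenkins.war path are assumptions for illustration and are not taken from this report.

```shell
# Hypothetical helper: print (do not run) a controller launch command.
# $1: value for the escape hatch property (defaults to false)
# $2: path to the war file (assumed name; adjust to the installation)
controller_cmd() {
  allow_unsupported="${1:-false}"
  war="${2:-jenkins.war}"
  printf 'java -Dhudson.slaves.SlaveComputer.allowUnsupportedRemotingVersions=%s -jar %s\n' \
    "$allow_unsupported" "$war"
}

# Show the command line that would enable the escape hatch.
controller_cmd true
```

Printing the command instead of executing it keeps the sketch safe to try on a machine that has no controller installed.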

I've confirmed that a connection from an unsupported remoting version (3.12 in my case) is correctly rejected when the escape hatch has its default value. I've confirmed that a connection from an unsupported remoting version is accepted when the escape hatch value is set to true. The end result is functioning as expected, though the message to the administrator is missing.
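
The accept/reject behavior described above can be summarized in a small sketch. This is not the Jenkins implementation; it only restates the rule from this report (reject versions older than 3.14 unless the escape hatch is true). The function name connection_accepted is hypothetical, and the version is assumed to be in plain "major.minor" form.

```shell
# Minimum supported remoting version, as stated in this report.
MIN_REMOTING_MAJOR=3
MIN_REMOTING_MINOR=14

# Hypothetical helper: exit 0 if the connection would be accepted, 1 otherwise.
# $1: agent remoting version, assumed to look like "major.minor" (e.g. 3.12)
# $2: escape hatch value, "true" or "false" (defaults to false)
connection_accepted() {
  version="$1"
  escape_hatch="${2:-false}"
  major="${version%%.*}"   # text before the first dot
  minor="${version#*.}"    # text after the first dot
  minor="${minor%%.*}"     # drop any further components
  # The escape hatch accepts every version.
  if [ "$escape_hatch" = "true" ]; then
    return 0
  fi
  if [ "$major" -gt "$MIN_REMOTING_MAJOR" ]; then
    return 0
  fi
  if [ "$major" -eq "$MIN_REMOTING_MAJOR" ] && [ "$minor" -ge "$MIN_REMOTING_MINOR" ]; then
    return 0
  fi
  return 1
}

# The case from this report: remoting 3.12 with the default escape hatch value.
if connection_accepted 3.12; then echo accepted; else echo rejected; fi
# prints "rejected"
```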

Expected result

When an unsupported remoting version connects to the Jenkins controller, the rejected connection should be logged in a way that is available to the administrator.

Actual result

When an unsupported remoting version connects to the Jenkins controller, the rejected connection is noted on the Jenkins agent web page with the message:

This agent is offline because Jenkins failed to launch the agent process on it. See log for more details

The "See log for more details" text is a hyperlink. On my installation, the log page available through that hyperlink includes the message:

Inbound agent connected from testing-a.markwaite.net/172.16.16.113:60880

It does not include any message explaining why the connection was rejected.

It would help the administrator understand the problem if an explanatory message were displayed on the agent log page.


Originally reported by markewaite, imported from: No log messages for inbound agents after the first message (regression in 2.310)
  • assignee: basil
  • status: In Review
  • priority: Minor
  • resolution: Unresolved
  • imported: 2022/01/10
timja commented 2 years ago

basil:

https://github.com/jenkinsci/jenkins/blob/ac66b476d94a6df70bd14a180454ebd804ab70f9/test/src/test/java/jenkins/slaves/UnsupportedRemotingAgentTest.java#L48= ?

timja commented 2 years ago

markewaite:

basil I was as perplexed as anyone that the message asserted by the unit test is not visible from the user interface. It is a good message and would help the administrator if it were visible to the administrator.

I'm not sure what causes that, only that I don't see the message in the user interface.

timja commented 2 years ago

basil:

"I'm not sure what causes that, only that I don't see the message in the user interface."

I see the message in the user interface (screenshot attached). I'm not sure why you didn't include steps to reproduce the problem from scratch on a clean installation.

timja commented 2 years ago

basil:

Seems any log messages for inbound agents after the first message are no longer being printed as of jenkinsci/jenkins#5680.

timja commented 2 years ago

basil:

The unit test was testing outbound agents, but the problem manifests itself only with inbound agents. The issue report did not contain steps to reproduce and did not specify that inbound agents were in use, other than by chance in the output of the log message. In the future, please provide the steps to reproduce when reporting issues.

timja commented 2 years ago

markewaite:

Apologies that I did not explicitly state that the agent is an inbound agent. Sorry you had to deduce that rather than it being stated directly.

Here are the steps that I've used to duplicate the problem:

  1. Download JENKINS-68785.tgz to a Linux computer and unpack it into an empty directory with tar xzvf JENKINS-68785.tgz
  2. Start the controller from one terminal with the command cd JENKINS-68785/ && bash ./README
  3. Open a web browser to the nodes page of that controller at http://localhost:8080/computer/ and confirm that "an-unsupported-agent" is defined and not connected
  4. Start the inbound agent from another terminal with the command cd JENKINS-68785-agent && bash ./README
  5. Open a web browser to the nodes page and confirm that "an-unsupported-agent" is defined, not connected, and displaying the error message "This agent is offline because Jenkins failed to launch the agent process on it."

The JENKINS-68785/README file includes a comment that can confirm the unsupported agent can connect when the escape hatch is enabled.
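
The browser checks in steps 3 and 5 could also be made from a terminal. This is a minimal sketch, assuming the controller from the steps above is reachable at http://localhost:8080 and permits unauthenticated reads. The helper names are hypothetical, and the /log and /api/json paths reflect my understanding of the usual Jenkins agent URLs rather than anything stated in this report.

```shell
# Base URL of the controller from the steps above (override if needed).
JENKINS_URL="${JENKINS_URL:-http://localhost:8080}"

# Hypothetical helper: print the URL of an agent's log page.
agent_log_url() {
  printf '%s/computer/%s/log\n' "${JENKINS_URL%/}" "$1"
}

# Hypothetical helper: print the URL of an agent's JSON status.
agent_status_url() {
  printf '%s/computer/%s/api/json\n' "${JENKINS_URL%/}" "$1"
}

# Example (requires the controller to be running):
#   curl -s "$(agent_status_url an-unsupported-agent)"
#   curl -s "$(agent_log_url an-unsupported-agent)"
```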