timja / jenkins-gh-issues-poc-06-18

[JENKINS-27739] Changes to slave environment variables are ignored by master #7107

Open timja opened 9 years ago

timja commented 9 years ago

A slave's environment variables seem to be cached upon first connect and then never considered again. This prevents legitimate changes to a slave's environment from having any effect on builds. The only workaround I can find is to delete a slave and recreate it, which in a complex environment is unacceptable.

This may be a consequence of JENKINS-26755.


Originally reported by siggimoo, imported from: Changes to slave environment variables are ignored by master
  • status: Reopened
  • priority: Major
  • resolution: Unresolved
  • imported: 2022/01/10
timja commented 9 years ago

danielbeck:

Issue is missing installed plugins and their versions. Issue is missing specific steps to (easily) reproduce the issue.

timja commented 9 years ago

siggimoo:

Installed Plugins:

Ant Plugin (ant) v1.2
buildgraph-view (buildgraph-view) v1.1.1
CloudBees Build Flow plugin (build-flow-plugin) v0.16
conditional-buildstep (conditional-buildstep) v1.3.3
Copy Artifact Plugin (copyartifact) v1.32.1
Credentials Binding Plugin (credentials-binding) v1.3
Credentials Plugin (credentials) v1.21
Email Extension Plugin (email-ext) v2.39
embeddable-build-status (embeddable-build-status) v1.5
Environment Injector Plugin (envinject) v1.90
External Monitor Job Type Plugin (external-monitor-job) v1.4
GitHub API Plugin (github-api) v1.59
GitHub plugin (github) v1.10
GitHub Pull Request Builder (ghprb) v1.16-5
Green Balls (greenballs) v1.14
Javadoc Plugin (javadoc) v1.3
Jenkins Active Directory plugin (active-directory) v1.39
Jenkins build timeout plugin (build-timeout) v1.14
Jenkins Cobertura Plugin (cobertura) v1.9.6
Jenkins CVS Plug-in (cvs) v2.12
Jenkins description setter plugin (description-setter) v1.9
Jenkins Exclusion Plug-in (Exclusion) v0.10
Jenkins GIT client plugin (git-client) v1.11.1
Jenkins GIT plugin (git) v2.3
Jenkins Gradle plugin (gradle) v1.24
Jenkins Gravatar plugin (gravatar) v2.1
Jenkins Job Configuration History Plugin (jobConfigHistory) v2.10
Jenkins jQuery plugin (jquery) v1.7.2-1
Jenkins Mailer Plugin (mailer) v1.12
Jenkins Multijob plugin (jenkins-multijob-plugin) v1.15
Jenkins Parameterized Trigger plugin (parameterized-trigger) v2.25
Jenkins Priority Sorter Plugin (PrioritySorter) v2.9
Jenkins promoted builds plugin (promoted-builds) v2.19
Jenkins SSH Slaves plugin (ssh-slaves) v1.9
Jenkins Subversion Plug-in (subversion) v2.4.5
Jenkins TextFinder plugin (text-finder) v1.10
Jenkins Throttle Concurrent Builds Plug-in (throttle-concurrents) v1.8.4
Jenkins Translation Assistance plugin (translation) v1.12
Job Import Plugin (job-import-plugin) v1.2
Join plugin (join) v1.15
JUnit Plugin (junit) v1.2
LDAP Plugin (ldap) v1.11
MapDB API Plugin (mapdb-api) v1.0.6.0
Matrix Authorization Strategy Plugin (matrix-auth) v1.2
Matrix Project Plugin (matrix-project) v1.4
Maven Integration plugin (maven-plugin) v2.7.1
Monitoring (monitoring) v1.53.1
Next Build Number Plugin (next-build-number) v1.1
Node and Label parameter plugin (nodelabelparameter) v1.5.1
NodeJS Plugin (nodejs) v0.2.1
OWASP Markup Formatter Plugin (antisamy-markup-formatter) v1.3
PAM Authentication plugin (pam-auth) v1.2
Performance plugin (performance) v1.11
Plain Credentials Plugin (plain-credentials) v1.1
Plot plugin (plot) v1.8
Publish Over SSH (publish-over-ssh) v1.12
Run Condition Plugin (run-condition) v1.0
SCM API Plugin (scm-api) v0.2
Script Security Plugin (script-security) v1.13
Semantic Versioning Plugin (semantic-versioning-plugin) v1.7
Simple Theme Plugin (simple-theme-plugin) v0.3
SSH Agent Plugin (ssh-agent) v1.5
SSH Credentials Plugin (ssh-credentials) v1.10
Token Macro Plugin (token-macro) v1.10
Windows Slaves Plugin (windows-slaves) v1.0
Workflow: Step API (workflow-step-api) v1.2
xUnit plugin (xunit) v1.92
youtrack-plugin (youtrack-plugin) v0.6.3

Reproduction:

1) Add a slave node to a master connecting via JNLP.
2) Start slave.jar on the slave machine.
3) On the master, view the new node's system information and note the PATH environment variable.
4) Kill slave.jar.
5) Change the PATH on the slave machine.
6) Restart slave.jar.
7) Look again at the node's system information on the master and note the PATH variable has not changed.
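
The comparison between steps 3 and 7 can be made mechanical by diffing two environment snapshots. The following is a minimal sketch, not part of Jenkins: the class name `EnvSnapshotDiff` and its `diff` method are hypothetical, and the sample values are made up for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: compares two environment snapshots taken before and after a change. */
class EnvSnapshotDiff {

    /** Returns one entry per variable that was added, removed, or changed, sorted by name. */
    static Map<String, String> diff(Map<String, String> before, Map<String, String> after) {
        Map<String, String> changes = new TreeMap<>();
        for (Map.Entry<String, String> e : before.entrySet()) {
            String newValue = after.get(e.getKey());
            if (newValue == null) {
                changes.put(e.getKey(), "removed");
            } else if (!newValue.equals(e.getValue())) {
                changes.put(e.getKey(), "changed: " + e.getValue() + " -> " + newValue);
            }
        }
        for (String name : after.keySet()) {
            if (!before.containsKey(name)) {
                changes.put(name, "added");
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        // Illustrative values only; in practice these would be copied from the
        // node's system-information page before and after restarting slave.jar.
        Map<String, String> before = Map.of("PATH", "/usr/bin:/bin", "SOME_ENVVAR", "value");
        Map<String, String> after = Map.of("PATH", "/usr/bin:/bin:/nonexistent");
        diff(before, after).forEach((name, change) -> System.out.println(name + " " + change));
    }
}
```

An empty result after step 7 would mean the master still reports the old environment, which is the symptom described in this issue.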

timja commented 9 years ago

danielbeck:

Weird. For me the only affected variable seems to be PATH.

timja commented 9 years ago

danielbeck:

Could you confirm the above on your environment?

Is this a regression (i.e. did it ever work for you)? If so, what is the newest version known to be unaffected?

timja commented 9 years ago

siggimoo:

I just used PATH as an example. In my actual case I had three custom variables set on the slave machine. When I removed them and restarted slave.jar, those variables remained. I made other changes to test my suspicions (e.g. altering PATH) and they too remained unaltered.

It's not a regression to my knowledge. We've never needed to change slave variables before.

timja commented 9 years ago

danielbeck:

My basic 1.607 without any custom plugins had different behavior: Only PATH didn't change (unfortunately I didn't confirm it was set correctly in the shell, so may have been user error), the new variables I defined were correctly updated.

Could you test whether the same issue occurs in the following configurations:

There was a specific change in 1.600 related to slave caching, but it should clear slave environment after restarting.

As there are, and were, numerous issues with env-inject being weird, knowing whether it affects things either depending on or independent of Jenkins version would be very helpful.

timja commented 9 years ago

siggimoo:

Looks like the problem might have occurred around v1.606. For each of the tests I conducted the following procedure:

  1. Started with the following environment variables on the slave machine
    • PATH = /usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin
    • SOME_ENVVAR = value
  2. Launched java -jar jenkins.war on the master machine
  3. Added a new node to the master
  4. Started slave.jar on the slave
  5. Ensured PATH and SOME_ENVVAR showed up on master
  6. Killed slave.jar
  7. Appended :/nonexistent to PATH
  8. Removed SOME_ENVVAR
  9. Ensured variables changed on slave machine
  10. Restarted slave.jar
  11. Refreshed node's system-information page on master

The following were the results:

timja commented 9 years ago

funeeldy:

We are seeing the same behavior, although I did not try deleting and recreating the slave. We keep having to restart the Jenkins server. I also tried to uninstall the EnvInject plugin, but we have a Multijob plugin that depends on it. When I restarted the Jenkins server, none of my jobs would load because the EnvInject plugin was missing, so I had to reinstall it.

timja commented 9 years ago

sumdumgai:

Verified the same behavior on Jenkins 1.611 with EnvInject.

Changing a slave node's PATH has no effect on the PATH that Jenkins sets during a build, even after a slave service restart. Restarting the slave operating system has no effect either; the PATH is not refreshed. Only restarting the master Jenkins server finally refreshed the PATH.

The server is definitely caching an old PATH and then re-assigning it (via EnvInject?) to the slave node process. This is very bad.

Since changing environment variables necessitates at least a slave process restart, it would seem the proper fix is to refresh the master server's cache of a node's environment during slave-service startup.

timja commented 9 years ago

danielbeck:

Could I get confirmation from someone that this issue only occurs when envinject is installed? Milo, could you try "v1.606 without EnvInject" as well?

timja commented 9 years ago

siggimoo:

Just tried the above procedure on v1.606 without EnvInject. Problem still occurred.

timja commented 9 years ago

siggimoo:

I've only just started looking at the source, so I may be off base here. But it seems to me that hudson.model.Computer is taking a snapshot upon first request and then never querying the slave again:

public EnvVars getEnvironment() throws IOException, InterruptedException {
    EnvVars cachedEnvironment = this.cachedEnvironment;
    if (cachedEnvironment != null) {
        return new EnvVars(cachedEnvironment);
    }

    cachedEnvironment = EnvVars.getRemote(getChannel());
    this.cachedEnvironment = cachedEnvironment;
    return new EnvVars(cachedEnvironment);
}

Looking at the context in which this code was introduced, I understand there was a desire to minimize network calls, but I would think there are times when the cost is worth it: at the start of a build, for instance, or when the system information is requested. These are relatively rare events that would not unduly suffer from a second or two of delay.
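
The snapshot behaviour described above can be reproduced in isolation. The following is a simplified, self-contained model and not Jenkins code: `CachingNode`, its `refresh` flag, and the `Supplier` standing in for the remote call are all assumptions made for illustration. It shows the cache hiding a later change, and how an explicit refresh at selected points (build start, system-information page) would pick the change up at the cost of one extra round trip.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Simplified, hypothetical model of a node whose environment is cached on first request. */
class CachingNode {
    private final Supplier<Map<String, String>> remote; // stands in for the remote call to the agent
    private volatile Map<String, String> cachedEnvironment;
    int remoteCalls = 0; // counts round trips, for illustration only

    CachingNode(Supplier<Map<String, String>> remote) {
        this.remote = remote;
    }

    /** Mirrors the quoted logic: fetch once, then always answer from the snapshot. */
    Map<String, String> getEnvironment() {
        return getEnvironment(false);
    }

    /** With refresh=true the snapshot is fetched again (hypothetical extension). */
    Map<String, String> getEnvironment(boolean refresh) {
        Map<String, String> env = cachedEnvironment;
        if (env == null || refresh) {
            env = new HashMap<>(remote.get());
            remoteCalls++;
            cachedEnvironment = env;
        }
        return new HashMap<>(env); // defensive copy, as in the quoted method
    }

    public static void main(String[] args) {
        Map<String, String> agentEnv = new HashMap<>(Map.of("PATH", "/usr/bin:/bin"));
        CachingNode node = new CachingNode(() -> agentEnv);

        System.out.println(node.getEnvironment().get("PATH"));
        agentEnv.put("PATH", "/usr/bin:/bin:/nonexistent");        // the agent's environment changes
        System.out.println(node.getEnvironment().get("PATH"));     // still the old snapshot
        System.out.println(node.getEnvironment(true).get("PATH")); // refreshed value
    }
}
```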

timja commented 9 years ago

wilcobt:

I just ran into this issue as well after updating one of our tools to a new version that required an update of an environment variable to find it. For now, as a workaround, I have overridden the variable in the agent settings, and that seems to work.

timja commented 9 years ago

david_rubio:

We have the same problem with Jenkins 1.613 (and older versions too) with EnvInject 1.90. This is a pretty bad issue for us because we keep changing the variables of one of our slaves when installing new libraries. Hopefully it'll be fixed soon.

timja commented 9 years ago

bullhornrelease:

We have EnvInject 1.91.2 and Jenkins 1.606 here (CentOS master, Windows 7 slave) and I see the same behavior. The path the node has is clearly one from an earlier configuration, and doesn't include some new items from the system path. I have changed the configuration to NOT unset the system environment variables at the global and node/slave level, to no avail. I have not tried recreating the slave (I will try that now).

Actually, I just tried to reset the node configuration settings, and I see under "Prepare jobs environment" that the "Unset System Environment Variables" setting is checked. Same for the global "Prepare jobs environment" -> "Unset System Environment Variables" check box.

timja commented 9 years ago

bullhornrelease:

Recreating the slave picked up the new PATH settings.

timja commented 9 years ago

sumdumgai:

This issue is independent of EnvInject. I consistently reproduce it on Jenkins 1.609+ without EnvInject.

This is a very annoying bug indeed.

timja commented 9 years ago

giladba:

I have the same issue.
Adding PATH=;%PATH% to a batch build step fixed it.
Uninstalling the slave using:
    installUtil /u c:\jenkins\jenkins-slave.exe
    sc delete jenkinsslave-c__jenkins
and then recreating the slave also fixed it.

This makes it very hard to manage changes to PATH in the machine, I hope it will be resolved soon.

timja commented 9 years ago

scm_issue_link:

Code changed in jenkins
User: Daniel Beck
Path:
core/src/main/java/hudson/model/Computer.java
http://jenkins-ci.org/commit/jenkins/c569036fca43f286ebbb80498f3f0937766e44c5
Log:
[FIX JENKINS-27739] Clear cached env vars when node goes online

timja commented 9 years ago

scm_issue_link:

Code changed in jenkins
User: Daniel Beck
Path:
core/src/main/java/hudson/model/Computer.java
http://jenkins-ci.org/commit/jenkins/c6d4204af1db78a6b1ba8cdd5f40f61756a1d009
Log:
JENKINS-27739 Ensure cache clearing listener runs first
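
The idea named in the two commit messages above (clear the cached environment when the node comes online, and make sure the clearing step runs before any other listener reads the environment) can be illustrated with a small standalone model. This sketch does not reproduce the actual commits: `OnlineNode`, `addListener`, the numeric ordinal, and `clearCachedEnvironment` are hypothetical names chosen for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical model: a node that notifies ordered listeners when it comes online. */
class OnlineNode {
    /** A listener with a priority; higher ordinals run earlier. */
    private static final class Listener {
        final double ordinal;
        final Consumer<OnlineNode> action;

        Listener(double ordinal, Consumer<OnlineNode> action) {
            this.ordinal = ordinal;
            this.action = action;
        }
    }

    final Map<String, String> agentEnv = new HashMap<>(); // what the agent process really has
    private Map<String, String> cachedEnvironment;        // snapshot held on the master side
    private final List<Listener> listeners = new ArrayList<>();

    void addListener(double ordinal, Consumer<OnlineNode> action) {
        listeners.add(new Listener(ordinal, action));
        // Stable sort, descending by ordinal: the highest ordinal is notified first.
        listeners.sort((a, b) -> Double.compare(b.ordinal, a.ordinal));
    }

    Map<String, String> getEnvironment() {
        if (cachedEnvironment == null) {
            cachedEnvironment = new HashMap<>(agentEnv);
        }
        return new HashMap<>(cachedEnvironment);
    }

    void clearCachedEnvironment() {
        cachedEnvironment = null;
    }

    /** Called whenever the agent connects or reconnects. */
    void goOnline() {
        for (Listener l : new ArrayList<>(listeners)) {
            l.action.accept(this);
        }
    }

    public static void main(String[] args) {
        OnlineNode node = new OnlineNode();
        node.agentEnv.put("PATH", "/usr/bin:/bin");
        List<String> seen = new ArrayList<>();
        // An ordinary listener that reads the environment as soon as the node is online.
        node.addListener(0, n -> seen.add(n.getEnvironment().get("PATH")));
        // The cache-clearing listener gets the highest ordinal so it runs before any reader.
        node.addListener(Double.MAX_VALUE, OnlineNode::clearCachedEnvironment);

        node.goOnline();
        node.agentEnv.put("PATH", "/usr/bin:/bin:/nonexistent"); // changed while the agent was down
        node.goOnline();
        System.out.println(seen);
    }
}
```

If the clearing listener were registered with a lower ordinal than the reader, the reader would repopulate nothing but would still see the stale snapshot on reconnect, which is why the ordering matters in this model.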

timja commented 9 years ago

scm_issue_link:

Code changed in jenkins
User: Daniel Beck
Path:
core/src/main/java/hudson/model/Computer.java
http://jenkins-ci.org/commit/jenkins/31208c3dc2682fe04f5008dcd794fb39e41d6b13
Log:
Merge pull request #1728 from daniel-beck/JENKINS-27739

[FIX JENKINS-27739] Clear cached env vars when node goes online

Compare: https://github.com/jenkinsci/jenkins/compare/9b3039295edb...31208c3dc268

timja commented 9 years ago

danielbeck:

Merged towards 1.617.

timja commented 9 years ago

dogfood:

Integrated in jenkins_main_trunk #4168

Result = UNSTABLE

timja commented 9 years ago

scm_issue_link:

Code changed in jenkins
User: Daniel Beck
Path:
core/src/main/java/hudson/model/Computer.java
http://jenkins-ci.org/commit/jenkins/aaf0afc54d4e320e2d75a4af6a263745d9de9928
Log:
[FIX JENKINS-27739] Clear cached env vars when node goes online

(cherry picked from commit c569036fca43f286ebbb80498f3f0937766e44c5)

timja commented 9 years ago

scm_issue_link:

Code changed in jenkins
User: Daniel Beck
Path:
core/src/main/java/hudson/model/Computer.java
http://jenkins-ci.org/commit/jenkins/ba69511c4775bb72feada0b2c0d3fd299f5179a7
Log:
JENKINS-27739 Ensure cache clearing listener runs first

(cherry picked from commit c6d4204af1db78a6b1ba8cdd5f40f61756a1d009)

timja commented 9 years ago

danielbeck:

jwinch 1.609.2-fixed means it has already been picked up for LTS inclusion.

timja commented 9 years ago

julrich:

Restarting the master also updated the variables for me with Jenkins 1.615.
Depending on the setup, this might be the easier workaround than recreating the slave.

timja commented 9 years ago

siggimoo:

julrich: Restarting the master impacts ALL slaves, not just one. It brings the entire system to a screeching halt.

timja commented 9 years ago

moshe_zvi:

Still happening on 1.619. Restarting the master indeed fixed it, but that's a bad resolution.
It would be better for this to remain in status Opened for others who encounter the same issue.

timja commented 9 years ago

funeeldy:

I agree with Moshe.

timja commented 9 years ago

danielbeck:

moshe_zvi How is the slave connected to Jenkins? JNLP, SSH slave, …? Does the issue occur with the Env-Inject plugin not installed?

timja commented 9 years ago

guss77:

I have the same problem with EC2 on-demand Windows slaves. The slaves are installed and launched through WinRM (I believe as a standard Java process).

timja commented 9 years ago

giladba:

Update: it works if I delete and recreate the slaves, but that is a real pain.
Currently I just add an inject-variable stage in which I add path=;%path%

timja commented 9 years ago

kmleinen:

My team has also been experiencing this issue. Our server is running from RedHat Ent. Linux 6.5/Tomcat7 and we are seeing this on nodes that are both WinXP or Win7. We are not seeing this on nodes that are linux-based.

One important item we saw and have confirmed is that new environment variables are added, updated, and removed just fine, but environment variables that existed when the node was first connected seem to be locked in and cannot be updated or removed. Our current workaround is to delete the node in Jenkins and recreate it.

timja commented 9 years ago

danielbeck:

Issue needs to occur without envinject installed to be a proven core issue (as was the problem that was fixed here a while back).

Please include much more information and don't just reopen without providing any. This guideline may help. And please make sure that your issue is identical to what was originally reported.

timja commented 9 years ago

dogfood:

Integrated in jenkins_main_trunk #4292
[FIX JENKINS-27739] Clear cached env vars when node goes online (Revision aaf0afc54d4e320e2d75a4af6a263745d9de9928)
JENKINS-27739 Ensure cache clearing listener runs first (Revision ba69511c4775bb72feada0b2c0d3fd299f5179a7)

Result = UNSTABLE
ogondza : aaf0afc54d4e320e2d75a4af6a263745d9de9928
Files :

ogondza : ba69511c4775bb72feada0b2c0d3fd299f5179a7
Files :

timja commented 9 years ago

jwinch:

I have a simple job that reports the slave’s environment variable. I have run this on our production server and can confirm that the environment variable never changes unless the slave is deleted and then re-added.

This is being seen on LTS 1.609.1. We know this was working in LTS 1.565.1; the issue started occurring when we upgraded directly to 1.609.1.

Using the test system I have confirmed that the issue is still reproducible in the same manner as on the live system.
On the test system I have upgraded the EnvInject plugin from 1.91.3 to the latest version 1.92.1 and that did not fix it.
On the test system I upgraded to the latest LTS version 1.609.3, that did not fix it.
On the test system I removed the EnvInject plugin and that did not fix it.
On the test system I have rolled back to 1.596.3 by replacing the Jenkins.war file, at this point the environment variable is still not being picked up correctly.
On the test system I have upgraded to the latest LTS release 1.609.3 with the EnvInject plugin removed and that did not fix it.
On the test system I have upgraded to the latest release 1.630 with the EnvInject plugin removed and that did not fix it.

The master is an Ubuntu 12.04 64-bit VM, with Jenkins installed from the Ubuntu package via apt. The slave used in this test is a Windows 7 x64 VM. I am configuring the slave and master and starting the builds via the latest version of Firefox.

I have installed the Support Core Plugin and the support zip from this is attached.

timja commented 9 years ago

danielbeck:

This fix was merged into 1.609.2 and 1.617. That's why this issue is labeled '1.609.2-fixed'. Experiments using 1.609.1 are irrelevant.

(Not sure why the LTS changelog didn't pick this up, but Kohsuke uses some weird generator for that, so I cannot even modify it to report the truth afterwards at the moment – this is being tracked as INFRA-264.)

timja commented 9 years ago

dynite:

With Jenkins ver. 1.631 and 1.617, I see this issue.

Slave has been deleted / recreated a few times but it makes no difference.
I have envinject plugin version 1.92.1.

I have tried unticking the 'Unset environment variables' on the node setup page - makes no difference.
Killed master and restarted.
Disconnected slave and restarted.

This was working a couple of weeks ago and there's been no changes, to my recollection.

If I ssh in to the box and do echo $myvar, the value appears set, but when the job runs, the value is unset (running as the same user).

EDIT - slave is a mac, with the value set in ~/.bash_profile

timja commented 9 years ago

danielbeck:

If this issue only occurs when envinject is installed, it's not this bug, and likely not a bug in core at all. Please look for a bug filed against envinject.

timja commented 8 years ago

zionyx:

This issue still persists on my 1.606 instance with EnvInject uninstalled. I rebooted both the server and the slaves, to no avail.
I am more inclined to think that this issue is with Jenkins' core than with EnvInject.

timja commented 8 years ago

danielbeck:

still persists on my 1.606 instance

This is to be expected, as this issue was fixed in 1.617.

timja commented 8 years ago

batmat:

FWIW, a (very) similar issue was seen on Jenkins 1.625.2. PATH was modified in the agent /configure, but wasn't taken into account until disconnecting/reconnecting the agent. Other variables were immediately propagated.

Not sure I'll have time to get to the bottom of it in the next few days to file a complete new issue, but in case someone comes here, maybe that'll help.

EnvInject 1.92.1.

Agent connected using ssh-slaves 1.10. IBM AIX agent.

timja commented 8 years ago

kobihk:

I updated an environment variable but didn't see the updated value just after disconnecting and reconnecting the agent.

We use Jenkins 1.656

timja commented 7 years ago

nehaljwani:

Please note that bringing the slave online/offline doesn't help. One has to disconnect the slave and relaunch the slave agent. This worked for me for Jenkins version: 1.651.2

timja commented 7 years ago

paris_bp2s_ci_project:

Hello,

Same as Nehal, I have the problem with Jenkins 1.651.1.

timja commented 7 years ago

cryptomatt:

I am seeing this behaviour with Jenkins 2.46.3.

It does not disappear even if the slave is disconnected and reconnected.

It does not go away even if the slave is deleted and set up again with a different name.

timja commented 7 years ago

oleg_nenashev:

If somebody hits this issue, please consider reopening it or creating a new one. The latter may be preferable from the triaging perspective.

timja commented 4 years ago

markoned:

This behavior still occurs in Jenkins ver. 2.190.1.

I tried deleting and recreating the node and restarting Jenkins. Nothing helped.

timja commented 4 years ago

markoned:

This issue still exists in Jenkins ver. 2.190.1.