jenkinsci / configuration-as-code-plugin

Jenkins Configuration as Code Plugin
https://plugins.jenkins.io/configuration-as-code
MIT License
2.69k stars 718 forks source link

Don't let unknown keywords stop Jenkins startup #1581

Closed proski closed 3 years ago

proski commented 3 years ago

Your checklist for this issue

Description

After I removed a plugin (Stash Pull Request Builder) and restarted Jenkins, the top level Jenkins URL showed a page with a stack trace:

io.jenkins.plugins.casc.ConfiguratorException: Invalid configuration elements for type class jenkins.model.GlobalConfigurationCategory$Unclassified : stashBuildTrigger.
Available attributes : administrativeMonitorsConfiguration, artifactManager, bitbucketEndpointConfiguration, buildDiscarders, casCGlobalConfig, defaultFolderConfiguration, defaultView, email-ext, envVarsFilter, fingerprints, gitHubConfiguration, gitHubPluginConfig, gitSCM, globalConfigFiles, globalDefaultFlowDurabilityLevel, globalLibraries, jiraGlobalConfiguration, junitTestResultStorage, location, lockableResourcesManager, mailer, masterBuild, myView, nodeProperties, pipeline-model-docker, plugin, pollSCM, projectNamingStrategy, quietPeriod, resourceRoot, scmRetryCount, shell, timestamper, usageStatistics, viewsTabBar
    at io.jenkins.plugins.casc.BaseConfigurator.handleUnknown(BaseConfigurator.java:376)
    at io.jenkins.plugins.casc.BaseConfigurator.configure(BaseConfigurator.java:365)
    at io.jenkins.plugins.casc.BaseConfigurator.check(BaseConfigurator.java:287)
    at io.jenkins.plugins.casc.ConfigurationAsCode.lambda$checkWith$8(ConfigurationAsCode.java:753)
    at io.jenkins.plugins.casc.ConfigurationAsCode.invokeWith(ConfigurationAsCode.java:689)
Caused: io.jenkins.plugins.casc.ConfiguratorException: unclassified: error configuring 'unclassified' with class io.jenkins.plugins.casc.impl.configurators.GlobalConfigurationCategoryConfigurator configurator
    at io.jenkins.plugins.casc.ConfigurationAsCode.invokeWith(ConfigurationAsCode.java:695)
    at io.jenkins.plugins.casc.ConfigurationAsCode.checkWith(ConfigurationAsCode.java:753)
    at io.jenkins.plugins.casc.ConfigurationAsCode.configureWith(ConfigurationAsCode.java:738)
    at io.jenkins.plugins.casc.ConfigurationAsCode.configureWith(ConfigurationAsCode.java:614)
    at io.jenkins.plugins.casc.ConfigurationAsCode.configure(ConfigurationAsCode.java:298)
    at io.jenkins.plugins.casc.ConfigurationAsCode.init(ConfigurationAsCode.java:290)
Caused: java.lang.reflect.InvocationTargetException
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.base/java.lang.reflect.Method.invoke(Method.java:566)
    at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:104)
Caused: java.lang.Error
    at hudson.init.TaskMethodFinder.invoke(TaskMethodFinder.java:110)
    at hudson.init.TaskMethodFinder$TaskImpl.run(TaskMethodFinder.java:175)
    at org.jvnet.hudson.reactor.Reactor.runTask(Reactor.java:296)
    at jenkins.model.Jenkins$5.runTask(Jenkins.java:1130)
    at org.jvnet.hudson.reactor.Reactor$2.run(Reactor.java:214)
    at org.jvnet.hudson.reactor.Reactor$Node.run(Reactor.java:117)
    at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
Caused: org.jvnet.hudson.reactor.ReactorException
    at org.jvnet.hudson.reactor.Reactor.execute(Reactor.java:282)
    at jenkins.InitReactorRunner.run(InitReactorRunner.java:49)
    at jenkins.model.Jenkins.executeReactor(Jenkins.java:1163)
    at jenkins.model.Jenkins.<init>(Jenkins.java:961)
    at hudson.model.Hudson.<init>(Hudson.java:85)
    at hudson.model.Hudson.<init>(Hudson.java:81)
    at hudson.WebAppMain$3.run(WebAppMain.java:295)
Caused: hudson.util.HudsonFailedToLoad
    at hudson.WebAppMain$3.run(WebAppMain.java:312)

I could upload the config without "stashBuildTrigger", but I had to log in to the Jenkins system to restart it. The URL with "/safeRestart" would just show the same stack trace.

I believe it would be much better if the Configuration ad Code Plugin would just record the issue and continue. There are other ways to alert the user of obsolete parts of the configuration. It's possible to log an error and show it in the administrative monitor.

The issue is not new, I hit it a few times in the past.

Removing plugins should not be a minefield. Jenkins itself is much more gracious when some code is missing to handle the a configuration file.

jetersen commented 3 years ago

@proski You can configure this. :) Although I disagree with the option, removing a plugin should remind you that you have unused config. Perhaps setup a docker test instance for maintaining a working copy before applying changes?

https://github.com/jenkinsci/configuration-as-code-plugin/blob/25b26febfc4aa01ec0abde0295763475f7ffee4b/test-harness/src/test/resources/io/jenkins/plugins/casc/validSelfConfig.yml

https://github.com/jenkinsci/configuration-as-code-plugin/blob/223ac074c0dec8e6c5812acefd02c97c518a3903/plugin/src/main/java/io/jenkins/plugins/casc/ConfigurationContext.java#L171-L184

proski commented 3 years ago

Good to know. Then the issue is about poor defaults. I don't see a reasonable justification for the current default behavior.

jetersen commented 3 years ago

I see no reason to change the default behavior. User is free to change them at anytime. You ought to have a test instance or a dev setup where you can apply changes before breaking the production setup. It is good practice to know the necessary steps ahead of time.

The way we use JCASC may be different but again JCASC is an opinionated way of how to run your Jenkins instance.

WDYT @oleg-nenashev @timja

proski commented 3 years ago

It's hard to argue against the "opinionated", of course. There are different use cases. In my case, Jenkins is used by a very small team (one developer and three observers who don't write code and don't touch Jenkins configuration), and the downtime is permissible. Having an extra server would add extra work for that small team for very little benefit. It's a common situation that a plugin is updated and the new version cannot read some of the data created by the old version. Jenkins doesn't stop loading because of that. It's a similar situation here. A plugin is removed. It could also be a plugin upgrade that changes one of the Jenkins-wide settings. I would expect a similar behavior. Instead, I'm left with a server that cannot even be rebooted over the web interface. Also, the configuration-as-code section doesn't appear in the downloaded configuration. It's hard to discover. I would have changed the default for my Jenkins server if I as aware of the settings.

jetersen commented 3 years ago

No one said an extra server, spin your Jenkins instance into a container :)

proski commented 3 years ago

I have an experimental Jenkins install in a container, but I don't want to do everything twice, especially routine plugin updates and cleanups. Anyway, we digress. My point is that Jenkins failing to load because an unused plugin has been removed is:

oleg-nenashev commented 3 years ago

I support keeping the current default behavior as-is. System configuration may contain mission-critical bits, e.g. security settings. Skipping such settings may lead to instances not working as expected.

There is no way to distinguish critical and non-critical settings at the moment. XStream has such engine for serialization, but I doubt it can be reused in JCasC.

What we could do is improving developer tools so that users can verify configs before applying them

proski commented 3 years ago

Also, it would be nice to make JCasC own settings discoverable. They are not part of the downloaded configuration even if they are set to non-default values.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.