cloudfoundry / bosh

Cloud Foundry BOSH is an open source tool chain for release engineering, deployment and lifecycle management of large scale distributed services.
https://bosh.io
Apache License 2.0
2.03k stars 658 forks source link

Failing to render template when running cloud check #2175

Closed mariash closed 5 years ago

mariash commented 5 years ago

Describe the bug When running bosh cck we are seeing the following error:

bosh -d cf cck                                                                                                                                                                                                                                                                       
Using environment 'https://10.0.0.6:25555' as client 'admin'                                                                                                                                                                                                                                         

Using deployment 'cf'                                                                                                                                                                                                                                                                                

Task 38325                                                                                                                                                                                                                                                                                           

Task 38325 | 19:08:16 | Scanning 33 VMs: Checking VM states (00:00:14)                                                                                                                                                                                                                               
Task 38325 | 19:08:30 | Scanning 33 VMs: 32 OK, 1 unresponsive, 0 missing, 0 unbound (00:00:00)                                                                                                                                                                                                      
Task 38325 | 19:08:30 | Scanning 4 persistent disks: Looking for inactive disks (00:00:01)                                                                                                                                                                                                           
Task 38325 | 19:08:31 | Scanning 4 persistent disks: 4 OK, 0 missing, 0 inactive, 0 mount-info mismatch (00:00:00)                                                                                                                                                                                   

Task 38325 Started  Mon Apr 15 19:08:16 UTC 2019                                                                                                                                                                                                                                                     
Task 38325 Finished Mon Apr 15 19:08:31 UTC 2019                                                                                                                                                                                                                                                     
Task 38325 Duration 00:00:15                                                                                                                                                                                                                                                                         
Task 38325 done                                                                                                                                                                                                                                                                                      

#    Type                Description                                                                                                                                                                                                                                                                 
125  unresponsive_agent  VM for 'scheduler/8c75d419-e428-4f44-9b08-f1865e87e016 (0)' with cloud ID 'vm-06a5d414-c846-4a42-5e7f-f6207e9df60c' is not responding.                                                                                                                                      

1 problems                                                                                                                                                                                                                                                                                           

1: Skip for now
2: Reboot VM
3: Recreate VM without waiting for processes to start
4: Recreate VM and wait for processes to start
5: Delete VM
6: Delete VM reference (forceful; may need to manually delete VM from the Cloud to avoid IP conflicts)
VM for 'scheduler/8c75d419-e428-4f44-9b08-f1865e87e016 (0)' with cloud ID 'vm-06a5d414-c846-4a42-5e7f-f6207e9df60c' is not responding. (1): 4

Continue? [yN]: y

Task 38326

Task 38326 | 19:08:42 | Applying problem resolutions: VM for 'scheduler/8c75d419-e428-4f44-9b08-f1865e87e016 (0)' with cloud ID 'vm-06a5d414-c846-4a42-5e7f-f6207e9df60c' is not responding. (unresponsive_agent 1593): Recreate VM and wait for processes to start (00:02:18)
                     L Error: - Unable to render jobs for instance group 'scheduler'. Errors are:
  - Unable to render templates for job 'cc_deployment_updater'. Errors are:
    - Error filling in template 'cloud_controller_ng.yml.erb' (line 38: undefined method `yaml_escape' for #<Bosh::Template::EvaluationContext:0x000055dff4b8dee8>)
Task 38326 | 19:11:00 | Error: Error resolving problem '125': - Unable to render jobs for instance group 'scheduler'. Errors are:
  - Unable to render templates for job 'cc_deployment_updater'. Errors are:
    - Error filling in template 'cloud_controller_ng.yml.erb' (line 38: undefined method `yaml_escape' for #<Bosh::Template::EvaluationContext:0x000055dff4b8dee8>)

Task 38326 Started  Mon Apr 15 19:08:42 UTC 2019
Task 38326 Finished Mon Apr 15 19:11:00 UTC 2019
Task 38326 Duration 00:02:18
Task 38326 error

Resolving problems for deployment 'cf':
  Expected task '38326' to succeed but state is 'error'

Exit code 1

To Reproduce

  1. Deploy capi-release https://github.com/cloudfoundry/capi-release
  2. Make scheduler to fail (e.g. kill the agent process on VM)
  3. Run bosh cck and choose Recreate VM

Expected behavior

VM gets recreated with no errors.

Logs

D, [2019-04-15T19:11:00.096625 #12609] [task:38326] DEBUG -- DirectorJobRunner: (0.000593s) (conn: 47210196052720) UPDATE "tasks" SET "event_output" = ("event_output" || '{"time":1555355460,"error":{"code":100,"message":"Error resolving problem ''125'': - Unable to render jobs for instance group ''scheduler''. Errors are:\n  - Unable to render templates for job ''cc_deployment_updater''. Errors are:\n    - Error filling in template ''cloud_controller_ng.yml.erb'' (line 38: undefined method `yaml_escape'' for #<Bosh::Template::EvaluationContext:0x000055dff4b8dee8>)"}}              
') WHERE ("id" = 38326)
D, [2019-04-15T19:11:00.098451 #12609] [task:38326] DEBUG -- DirectorJobRunner: (0.001686s) (conn: 47210196052720) COMMIT
E, [2019-04-15T19:11:00.098706 #12609] [task:38326] ERROR -- DirectorJobRunner: Error resolving problem '125': - Unable to render jobs for instance group 'scheduler'. Errors are:                                                                                                                   
  - Unable to render templates for job 'cc_deployment_updater'. Errors are:
    - Error filling in template 'cloud_controller_ng.yml.erb' (line 38: undefined method `yaml_escape' for #<Bosh::Template::EvaluationContext:0x000055dff4b8dee8>)                                                                                                                                  
/var/vcap/data/packages/director/b99294ed5b639acac4dbc3338c8373de4b7b7e05/gem_home/ruby/2.4.0/gems/bosh-director-0.0.0/lib/bosh/director/jobs/cloud_check/apply_resolutions.rb:39:in `block in perform'                                                                                              
/var/vcap/data/packages/director/b99294ed5b639acac4dbc3338c8373de4b7b7e05/gem_home/ruby/2.4.0/gems/bosh-director-0.0.0/lib/bosh/director/lock_helper.rb:13:in `block in with_deployment_lock'                                                                                                        
/var/vcap/data/packages/director/b99294ed5b639acac4dbc3338c8373de4b7b7e05/gem_home/ruby/2.4.0/gems/bosh-director-0.0.0/lib/bosh/director/lock.rb:79:in `lock'                                                                                                                                        
/var/vcap/data/packages/director/b99294ed5b639acac4dbc3338c8373de4b7b7e05/gem_home/ruby/2.4.0/gems/bosh-director-0.0.0/lib/bosh/director/lock_helper.rb:13:in `with_deployment_lock'                                                                                                                 
/var/vcap/data/packages/director/b99294ed5b639acac4dbc3338c8373de4b7b7e05/gem_home/ruby/2.4.0/gems/bosh-director-0.0.0/lib/bosh/director/jobs/cloud_check/apply_resolutions.rb:35:in `perform'                                                                                                       
/var/vcap/data/packages/director/b99294ed5b639acac4dbc3338c8373de4b7b7e05/gem_home/ruby/2.4.0/gems/bosh-director-0.0.0/lib/bosh/director/job_runner.rb:99:in `perform_job'                                                                                                                           
/var/vcap/data/packages/director/b99294ed5b639acac4dbc3338c8373de4b7b7e05/gem_home/ruby/2.4.0/gems/bosh-director-0.0.0/lib/bosh/director/job_runner.rb:34:in `block in run'                                                                                                                          
/var/vcap/data/packages/director/b99294ed5b639acac4dbc3338c8373de4b7b7e05/gem_home/ruby/2.4.0/gems/bosh_common-0.0.0/lib/common/thread_formatter.rb:52:in `with_thread_name'                                                                                                                         
/var/vcap/data/packages/director/b99294ed5b639acac4dbc3338c8373de4b7b7e05/gem_home/ruby/2.4.0/gems/bosh-director-0.0.0/lib/bosh/director/job_runner.rb:34:in `run'                                                                                                                                   
/var/vcap/data/packages/director/b99294ed5b639acac4dbc3338c8373de4b7b7e05/gem_home/ruby/2.4.0/gems/bosh-director-0.0.0/lib/bosh/director/jobs/base_job.rb:10:in `perform'                                                                                                                            
/var/vcap/data/packages/director/b99294ed5b639acac4dbc3338c8373de4b7b7e05/gem_home/ruby/2.4.0/gems/bosh-director-0.0.0/lib/bosh/director/jobs/db_job.rb:36:in `block in perform'                                                                                                                     
/var/vcap/data/packages/director/b99294ed5b639acac4dbc3338c8373de4b7b7e05/gem_home/ruby/2.4.0/gems/bosh-director-0.0.0/lib/bosh/director/jobs/db_job.rb:83:in `block (3 levels) in run'                                                                                                              
/var/vcap/data/packages/director/b99294ed5b639acac4dbc3338c8373de4b7b7e05/gem_home/ruby/2.4.0/gems/eventmachine-1.2.5/lib/eventmachine.rb:1076:in `block in spawn_threadpool'                                                                                                                        
/var/vcap/data/packages/director/b99294ed5b639acac4dbc3338c8373de4b7b7e05/gem_home/ruby/2.4.0/gems/logging-2.2.2/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'                                                                                                     
D, [2019-04-15T19:11:00.099657 #12609] [task:38326] DEBUG -- DirectorJobRunner: (0.000296s) (conn: 47210193977240) SELECT * FROM "tasks" WHERE "id" = 38326                                                                                                                                          
D, [2019-04-15T19:11:00.101054 #12609] [task:38326] DEBUG -- DirectorJobRunner: (0.000054s) (conn: 47210196052720) BEGIN
D, [2019-04-15T19:11:00.102198 #12609] [task:38326] DEBUG -- DirectorJobRunner: (0.000486s) (conn: 47210196052720) UPDATE "tasks" SET "state" = 'error', "timestamp" = '2019-04-15 19:11:00.100569+0000', "description" = 'apply resolutions', "result" = 'Error resolving problem ''125'': - Unable to render jobs for instance group ''scheduler''. Errors
  - Unable to render...', "output" = '/var/vcap/store/director/tasks/38326', "checkpoint_time" = '2019-04-15 19:10:42.736665+0000', "type" = 'cck_apply', "username" = 'admin', "deployment_name" = 'cf', "started_at" = '2019-04-15 19:08:42.710522+0000', "event_output" = '{"time":1555355322,"stage":"Applying problem resolutions","tags":[],"total":1,"task":"VM for ''scheduler/8c75d419-e428-4f44-9b08-f1865e87e016 (0)'' with cloud ID ''vm-06a5d414-c846-4a42-5e7f-f6207e9df60c'' is not responding. (unresponsive_agent 1593): Recreate VM and wait for processes to start","index":1,"state":"started","progress":0}
{"time":1555355460,"stage":"Applying problem resolutions","tags":[],"total":1,"task":"VM for ''scheduler/8c75d419-e428-4f44-9b08-f1865e87e016 (0)'' with cloud ID ''vm-06a5d414-c846-4a42-5e7f-f6207e9df60c'' is not responding. (unresponsive_agent 1593): Recreate VM and wait for processes to start","index":1,"state":"failed","progress":100,"data":{"error":"- Unable to render jobs for instance group ''scheduler''. Errors are:\n  - Unable to render templates for job ''cc_deployment_updater''. Errors are:\n    - Error filling in template ''cloud_controller_ng.yml.erb'' (line 38: undefined method `yaml_escape'' for #<Bosh::Template::EvaluationContext:0x000055dff4b8dee8>)"}}
{"time":1555355460,"error":{"code":100,"message":"Error resolving problem ''125'': - Unable to render jobs for instance group ''scheduler''. Errors are:\n  - Unable to render templates for job ''cc_deployment_updater''. Errors are:\n    - Error filling in template ''cloud_controller_ng.yml.erb'' (line 38: undefined method `yaml_escape'' for #<Bosh::Template::EvaluationContext:0x000055dff4b8dee8>)"}}
', "result_output" = '', "context_id" = '' WHERE ("id" = 38326)
D, [2019-04-15T19:11:00.126983 #12609] [task:38326] DEBUG -- DirectorJobRunner: (0.024578s) (conn: 47210196052720) COMMIT
I, [2019-04-15T19:11:00.127404 #12609] []  INFO -- DirectorJobRunner: Task took 2 minutes 17.425171610000007 seconds to process.

Versions (please complete the following information):

bosh env
Using environment 'https://10.0.0.6:25555' as client 'admin'

Name               bosh-electron-cannon
UUID               68bdb453-daaa-40ac-a97e-15fdcac82187
Version            268.6.0 (00000000)
Director Stemcell  ubuntu-xenial/250.9
CPI                google_cpi
Features           compiled_package_cache: disabled
                   config_server: enabled
                   local_dns: enabled
                   power_dns: disabled
                   snapshots: disabled
User               admin

Succeeded

Deployment info:

Using cf-deployment-concourse task: https://github.com/cloudfoundry/cf-deployment-concourse-tasks/blob/master/bosh-deploy-with-created-release/task.yml

With ops files:

../cf-deployment/operations/use-compiled-releases.yml                                                                                                                                                                                                                                       
../cf-deployment/operations/experimental/use-compiled-releases-windows.yml                                                                                                                                                                                                                  
../release/operations/add-vizzini-errand.yml                                                                                                                                                                                                                                                
../cf-deployment/operations/test/add-datadog-firehose-nozzle.yml                                                                                                                                                                                                                            
../cf-deployment/operations/scale-database-cluster.yml                                                                                                                                                                                                                                      
../cf-deployment/operations/use-external-blobstore.yml                                                                                                                                                                                                                                      
../cf-deployment/operations/use-gcs-blobstore-service-account.yml                                                                                                                                                                                                                           
../cf-deployment/operations/windows2012R2-cell.yml                                                                                                                                                                                                                                          
../cf-deployment/operations/community/change-metron-agent-deployment.yml                                                                                                                                                                                                                    
../cf-deployment/operations/addons/enable-component-syslog.yml                                                                                                                                                                                                                              
../cf-deployment/operations/addons/component-syslog-custom-ca.yml                                                                                                                                                                                                                           
../cf-deployment/operations/experimental/windows-enable-component-syslog.yml                                                                                                                                                                                                                
../cf-deployment/operations/experimental/windows-component-syslog-ca.yml 

Additional context

cf-gitbot commented 5 years ago

We have created an issue in Pivotal Tracker to manage this:

https://www.pivotaltracker.com/story/show/165366531

The labels on this github issue will be updated when the story is started.

h4xnoodle commented 5 years ago

Hi @mariash it seems there could be an error in the template:

Error filling in template 'cloud_controller_ng.yml.erb' (line 38: undefined methodyaml_escape' for #)`

Is there a yaml_escape being called in cloud_controller_ng.yml.erb?

Looking at the current template https://github.com/cloudfoundry/capi-release/blob/develop/jobs/cloud_controller_ng/templates/cloud_controller_ng.yml.erb I see that there is a yaml_escape defined. What version of the CAPI release are you using?

h4xnoodle commented 5 years ago

Closing due to inactivity. Please reopen if this issue persists or is seen on a later version of the director.

charleshansen commented 5 years ago

Looks like it really is a subtle issue in the template. I filed an issue with capi to fix it

voelzmo commented 5 years ago

Thanks for digging this up, @charleshansen! I was really curious about what happened here.