trimble-oss / dba-dash

DBA Dash - SQL Server Monitoring Tool
MIT License
260 stars 61 forks source link

Strange job name on Agent Jobs tab #373

Closed acappelletti closed 2 years ago

acappelletti commented 2 years ago

Hi, I want to point out that under the jobs TAB I do not see the list of real names that is in the SQL SERVER AGENT but simply the JOB_IDs (which I understand to be the reports) but none of those in place that run regularly.

For example I see job name like "!!NEW!! 9FDD56EF-2BCB-4CFC-8403-F4ADF54F92ED" insted of "DBA: index routine".

I placed 2 instances on my VM (one default and the other named). In the second I see the names correctly.

image

Alen

DavidWiseman commented 2 years ago

Hi,

The agent job history is collected every 1min with this query. The job metadata including the job name is collected every hour. The "!!NEW!! {Guid}" names are expected if the jobs have recently been created. You can check when the collection for the job metadata ran by navigating to the instance, expanding the node in the tree, and clicking the "Checks" node. Then select the "Collection Dates" tab. The collection is the "Jobs" collection. Most collections will run when the service starts so stopping/starting the service is an option to trigger the collections.

image

Hopefully, this explains the situation and the names will populate soon. The Jobs collection uses SMO which can be a bit chatty. I noticed this collection was VERY slow when pulling job data over the VPN for an instance with a large number of SSRS jobs. The same collection worked fine running on a server locally - but the network latency over the VPN was a problem. I've got an issue to look at options to improve this.

If you think you have a problem with the Jobs collection, let me know. Check the "DBA Dash ErrorLog" tab for any errors reported for this collection.

Hope this helps,

David

acappelletti commented 2 years ago

Thanks David for you response... I try to give you some details in addition to the contents of your post.

I got on my local notebook (my lab) 2 instance. One default and another named. I see on the first that I haven't "JOB" but only "JOB History" while in the second I have both... and this explain to me 'cause always have "!!NEW!!" (are sorted on that column naturally)

image

Now it's clear to me the explanation about one hour late for the name of the job in the second they appeared, but for the default instance no!

I just checked into "Dash Error log" and always got 4 line for every restert

image

Could be something related to this 4 lines?

-------- Collect error ----------

Microsoft.SqlServer.Management.Smo.FailedOperationException: Script failed for Job '129EFE82-0209-4678-9B9D-711B2BF34C4A'. 
 ---> System.NullReferenceException: Object reference not set to an instance of an object.
   at Microsoft.SqlServer.Management.Smo.Agent.Job.GetJobScriptingScript(StringBuilder createQuery, ScriptingPreferences sp)
   at Microsoft.SqlServer.Management.Smo.Agent.Job.ScriptCreate(StringCollection queries, ScriptingPreferences sp)
   at Microsoft.SqlServer.Management.Smo.SqlSmoObject.ScriptCreateInternal(StringCollection query, ScriptingPreferences sp, Boolean skipPropagateScript)
   at Microsoft.SqlServer.Management.Smo.ScriptMaker.ScriptCreateObject(Urn urn, ScriptingPreferences sp, ObjectScriptingType& scriptType)
   at Microsoft.SqlServer.Management.Smo.ScriptMaker.ScriptCreate(Urn urn, ScriptingPreferences sp, ObjectScriptingType& scriptType)
   at Microsoft.SqlServer.Management.Smo.ScriptMaker.ScriptCreateObjects(IEnumerable`1 urns)
   at Microsoft.SqlServer.Management.Smo.ScriptMaker.ScriptUrns(List`1 orderedUrns)
   at Microsoft.SqlServer.Management.Smo.ScriptMaker.DiscoverOrderScript(IEnumerable`1 urns)
   at Microsoft.SqlServer.Management.Smo.ScriptMaker.ScriptWorker(List`1 urns, ISmoScriptWriter writer)
   at Microsoft.SqlServer.Management.Smo.Scripter.ScriptWithListWorker(DependencyCollection depList, SqlSmoObject[] objects, Boolean discoveryRequired)
   at Microsoft.SqlServer.Management.Smo.Scripter.ScriptWithList(DependencyCollection depList, SqlSmoObject[] objects, Boolean discoveryRequired)
   --- End of inner exception stack trace ---
   at Microsoft.SqlServer.Management.Smo.Scripter.ScriptWithList(DependencyCollection depList, SqlSmoObject[] objects, Boolean discoveryRequired)
   at Microsoft.SqlServer.Management.Smo.Scripter.Script(Urn[] urns, SqlSmoObject[] objects)
   at Microsoft.SqlServer.Management.Smo.Scripter.Script(SqlSmoObject[] objects)
   at Microsoft.SqlServer.Management.Smo.Scripter.Script(SqlSmoObject sqlSmoObject)
   at Microsoft.SqlServer.Management.Smo.Agent.Job.Script(ScriptingOptions scriptingOptions)
   at DBADash.SchemaSnapshotDB.SnapshotJobs(DataSet& ds) in D:\a\dba-dash\dba-dash\DBADash\SchemaSnapshotDB.cs:line 203
   at DBADash.DBCollector.ExecuteCollection(CollectionType collectionType) in D:\a\dba-dash\dba-dash\DBADash\DBCollector.cs:line 898
   at DBADash.DBCollector.<>c__DisplayClass53_0.<Collect>b__0(Context context) in D:\a\dba-dash\dba-dash\DBADash\DBCollector.cs:line 508
   at Polly.Policy.<>c__DisplayClass110_0.<Execute>b__0(Context ctx, CancellationToken _)
   at Polly.Policy.<>c__DisplayClass138_0.<Implementation>b__0(Context ctx, CancellationToken token)
   at Polly.Retry.RetryEngine.Implementation[TResult](Func`3 action, Context context, CancellationToken cancellationToken, ExceptionPredicates shouldRetryExceptionPredicates, ResultPredicates`1 shouldRetryResultPredicates, Action`4 onRetry, Int32 permittedRetryCount, IEnumerable`1 sleepDurationsEnumerable, Func`4 sleepDurationProvider)
   at Polly.Retry.RetryPolicy.Implementation[TResult](Func`3 action, Context context, CancellationToken cancellationToken)
   at Polly.Policy.Implementation(Action`2 action, Context context, CancellationToken cancellationToken)
   at Polly.Policy.Execute(Action`2 action, Context context, CancellationToken cancellationToken)
   at DBADash.DBCollector.Collect(CollectionType collectionType) in D:\a\dba-dash\dba-dash\DBADash\DBCollector.cs:line 505

and add again, I also searched into msdb DB the Job with [job_id] = '129EFE82-0209-4678-9B9D-711B2BF34C4A' but I haven't found anything... if the GUID refers to those.

Thanks ALEN

DavidWiseman commented 2 years ago

The error is happening when it attempts to script the job. The latest release (2.23.1) has some improved error handling so if the scripting fails it won't fail the whole job collection.
If you upgrade to the new version it should be able to collect the metadata. You will see an error logged for the job(s) that it failed to script which will include the job name for easier debugging.

If it's a particular job that is causing the issue it might be worth trying to script the job in SSMS and see if you get a similar error. Let me know what you find.

acappelletti commented 2 years ago

Hi David and thanks for your support. I've just install last version as you can see below image

I see a log more details...

System.AggregateException: One or more errors occurred. (Error scripting agent job `129EFE82-0209-4678-9B9D-711B2BF34C4A` on HP-PROBOOK) (Error scripting agent job `21CF0F1A-7930-4CF2-BA54-5921772A7550` on HP-PROBOOK) (Error scripting agent job `79FDDC95-4D86-4B99-8E26-821237E66E28` on HP-PROBOOK) (Error scripting agent job `88CF4444-AC07-4CF3-8E1D-DAB96A443A62` on HP-PROBOOK) (Error scripting agent job `C2917A49-E00D-4EE6-89D9-B1B7307F0E2E` on HP-PROBOOK) (Error scripting agent job `C2F3A4EA-DB5E-4A18-876C-ED7726E8FECF` on HP-PROBOOK) (Error scripting agent job `F3B79DA6-EDDB-4A8F-BB0B-F4CDBED249CA` on HP-PROBOOK)
 ---> System.Exception: Error scripting agent job `129EFE82-0209-4678-9B9D-711B2BF34C4A` on HP-PROBOOK
 ---> Microsoft.SqlServer.Management.Smo.FailedOperationException: Script failed for Job '129EFE82-0209-4678-9B9D-711B2BF34C4A'. 
 ---> System.NullReferenceException: Object reference not set to an instance of an object.
   at Microsoft.SqlServer.Management.Smo.Agent.Job.GetJobScriptingScript(StringBuilder createQuery, ScriptingPreferences sp)
   at Microsoft.SqlServer.Management.Smo.Agent.Job.ScriptCreate(StringCollection queries, ScriptingPreferences sp)
   at Microsoft.SqlServer.Management.Smo.SqlSmoObject.ScriptCreateInternal(StringCollection query, ScriptingPreferences sp, Boolean skipPropagateScript)
   at Microsoft.SqlServer.Management.Smo.ScriptMaker.ScriptCreateObject(Urn urn, ScriptingPreferences sp, 
...............

I try with some query on system table... not only the one below but to all related in some way to JOB execution... I didn't find any record... this is the strange thing. Can you kindly tell me the query contained in the JOB collection step?

Alen

SELECT TOP (1000) *
  FROM [msdb].[dbo].[sysjobs]
  WHERE job_id IN (
N'129EFE82-0209-4678-9B9D-711B2BF34C4A',
N'21CF0F1A-7930-4CF2-BA54-5921772A7550',
N'79FDDC95-4D86-4B99-8E26-821237E66E28',
N'88CF4444-AC07-4CF3-8E1D-DAB96A443A62',
N'C2917A49-E00D-4EE6-89D9-B1B7307F0E2E',
N'C2F3A4EA-DB5E-4A18-876C-ED7726E8FECF',
N'F3B79DA6-EDDB-4A8F-BB0B-F4CDBED249CA'
)
acappelletti commented 2 years ago

Opsss David, I can confirm that now I see the list and the names of the JOBs ... I hadn't checked ... still focusing on the error page. Today I keep it monitored to see if eventually at the next data collection (one hour from now) I will still have some errors. The update fixed job list problem!

Thanks for your support and have a nice day., ALEN

DavidWiseman commented 2 years ago

I was expecting the System.AggregateException. The new behaviour is to log the errors that occur scripting the jobs and continue rather than failing the collection.

This should be listing the job name. Does this query return any results:

SELECT TOP (1000) *
  FROM [msdb].[dbo].[sysjobs]
  WHERE name IN (
N'129EFE82-0209-4678-9B9D-711B2BF34C4A',
N'21CF0F1A-7930-4CF2-BA54-5921772A7550',
N'79FDDC95-4D86-4B99-8E26-821237E66E28',
N'88CF4444-AC07-4CF3-8E1D-DAB96A443A62',
N'C2917A49-E00D-4EE6-89D9-B1B7307F0E2E',
N'C2F3A4EA-DB5E-4A18-876C-ED7726E8FECF',
N'F3B79DA6-EDDB-4A8F-BB0B-F4CDBED249CA'
)

If the jobs are there it would be interesting to try to script them out from SSMS. I'm guessing it might have the same bug.

The Jobs collection is using SMO unlike most of the other collections. So the jobs are enumerated using:

foreach(Microsoft.SqlServer.Management.Smo.Agent.Job job in instance.JobServer.Jobs)

I might change this at some point to just query the system tables and use SMO only for scripting the jobs. It should be more efficient than SMO. I might also add an option to turn off job scripting.

acappelletti commented 2 years ago

Ok David... I found the list now using "name" field (thanks).... and I translate you from Italian description...

"This is a process of a report server process. Any modification of this process could cause database-level incompatibility. To update the process, use Report Manager or Management Studio."

I think could be some ex reporting report or something that sincerely I didn't understood exactly... Anyway I don't see this jobs on my list under SQL SERVER AGENT...

I try also with

EXEC dbo.sp_help_category  
    @type = N'LOCAL' ;  
GO  

for the list of category_id.. ZERO no one 2101

image

Thanks ALEN

DavidWiseman commented 2 years ago

So it looks like a SSRS subscription job, but the category doesn't exists. That might explain why the scripting is failing. That gives me something to test. You could try changing the category for the jobs to see if it fixes the problem.

Note: For the job collection, it's scheduled to run every 1hr, but it will only execute if there have been any changes or if it's been over 24hrs since the last collection. So if you create a new job or modify a job, the change should be picked up within an hour. Some changes like changes to schedules only won't get detected but it will run anyway after 1 day.

acappelletti commented 2 years ago

David... after the update...

image

now see the list into SQL SERVER AGENT...

image

and opening one now I see the contents:

image

So... Ill wait next run... and after I'll give you a feedback. Alen

p.s.: I removed reporting on my local instance many months ago.

DavidWiseman commented 2 years ago

This issue can be re-created by:

I would advise against modifying the system tables directly as you might run into problems like this. If you delete a job category via 'sp_delete_category' it will automatically set any jobs in that category to uncategorized. It also has a few other checks and balances in it. Similar if you need to update a job, it's better to call the SP to do this: EXEC msdb.dbo.sp_update_job @job_name='testjob',@category_name='test'

acappelletti commented 2 years ago

Thanks David for your advise... but I did fast and dirty only in my LAB... on other env surely I'll execute the SP that your post. Now seems all fine!

Thanks Alen....

DavidWiseman commented 2 years ago

Great! 😊