Closed ghukill closed 6 years ago
Traced the slowdown under high Mongo I/O to the check for validation results on each Job, which queries Mongo's Record collection. From Job.get_lineage():
# get validation results for self #
validation_results = self.validation_results()
This is used solely to determine whether the Job is valid, in order to colorize the edges of the node. Is there another way to determine this? How does the Jobs DT table infer validity for a Job?
Even the Jobs DT table uses the method validation_results() when looping through Jobs.
If this is not the only cause of the lineage slowdown, it certainly contributes. It also appears to run twice each time a Record Group page is shown: once for the lineage and once for the Jobs DT table.
It would probably make sense to save this calculation to Job.job_details at the tail end of a Job run, and then update it if validations are run or removed.
Fixed. Storing validation results to validation_results, and detailed record count to detailed_record_count, in Job.job_details.
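For anyone reading later, the caching approach described above might look roughly like the sketch below. This is not the actual Combine code: the Job class here is a stripped-down stand-in, job_details is assumed to be a JSON string, and count_failures, force_recount, and the keys inside the cached dict are hypothetical names used only for illustration.

```python
import json


class Job:
    """Hypothetical stand-in for the real Job model; only the caching logic is shown."""

    def __init__(self, count_failures):
        # count_failures: callable standing in for the expensive Mongo Record query
        self._count_failures = count_failures
        # assumption: job_details is stored on the Job as a JSON string
        self.job_details = "{}"

    def validation_results(self, force_recount=False):
        """Return cached results from job_details, querying only when needed."""
        details = json.loads(self.job_details)
        cached = details.get("validation_results")
        if cached is not None and not force_recount:
            return cached

        # cache miss (or forced refresh): run the expensive query once and store it
        failures = self._count_failures()
        results = {"verdict": failures == 0, "failure_count": failures}
        details["validation_results"] = results
        self.job_details = json.dumps(details)
        return results


# usage: the fake query records how many times it was actually run
calls = []

def fake_mongo_count():
    calls.append(1)
    return 3

job = Job(fake_mongo_count)
first = job.validation_results()   # runs the query and caches the result
second = job.validation_results()  # served from job_details, no query
```

With something like this, both the lineage code and the Jobs DT table can call validation_results() without touching Mongo, and passing force_recount=True refreshes the stored value after validations are run or removed.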
Methods in question:
get_all_jobs_lineage()
get_job_lineage()
This is very noticeable when viewing Jobs table with all Jobs, or attempting to retrieve the lineage for a Job while the system is working hard.