streamsets / tutorials

StreamSets Tutorials
Apache License 2.0

Update job label based on job name using sdk - slow #130

Open wshamim1 opened 3 years ago

wshamim1 commented 3 years ago

I am trying to update labels based on job name. Right now I have 19k jobs, and I notice that it takes around 5 to 8 minutes to update the labels for 2 jobs. Is this expected? By contrast, when I update the label using tag_name, it's pretty quick.

Second, how can I filter the jobs? Say I want to update the labels of all jobs whose name is like 'some name'.

kirtiv1 commented 3 years ago

Hi Wilson,

1) How long it takes depends on which criterion you use to fetch the job you want to update. With 19k jobs, you are right that the time difference will be significant. My guess is that your former attempt fetches all jobs, whereas the latter approach filters and fetches only the jobs needed, so it is much faster.

2) "Second, how can I filter the jobs? Say I want to update the labels of all jobs whose name is like 'some name'."

--> Here is a good tutorial that shows how to do this: ways-to-fetch-jobs
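
To illustrate point 1, here is a minimal sketch of updating the labels of a single job that is looked up by its exact name, so that only that one job is requested. It assumes `sch` is an already-authenticated `ControlHub` object from the StreamSets SDK for Python, and that `sch.jobs.get(job_name=...)`, the `data_collector_labels` attribute, and `sch.update_job(job)` behave as described in the SDK documentation for your version. The function name `update_labels_by_exact_name` is hypothetical and not part of the SDK.

```python
def update_labels_by_exact_name(sch, job_name, labels):
    """Replace the labels of the one job whose name equals job_name.

    Assumptions (verify against your SDK version):
      - sch.jobs.get(job_name=...) returns a single matching job
      - job.data_collector_labels holds the job's labels as a list
      - sch.update_job(job) saves the change in Control Hub
    """
    # Ask for exactly one job instead of loading the whole job list.
    job = sch.jobs.get(job_name=job_name)

    # Overwrite the labels, then persist the change.
    job.data_collector_labels = list(labels)
    sch.update_job(job)
    return job
```

Calling `update_labels_by_exact_name(sch, 'my job', ['prod'])` would then touch only the job named `my job`. Whether this is faster in practice depends on how your SDK version resolves the lookup, so it is worth timing against your own 19k jobs.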

wshamim1 commented 3 years ago

Thanks Kirti!

Regarding the second use case, I want to get jobs whose name contains something like "test" instead of giving the full job name, so that I can fetch all the jobs that have "test" in their name. I don't see that in "ways-to-fetch-jobs". Right now I am fetching all the jobs and then applying the filter, which does work, but as I said, it takes a really long time.
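
For reference, the fetch-then-filter approach described above can be sketched as follows. It makes a single pass over the job list, keeps the jobs whose name contains the fragment (case-insensitively), and updates only those. It assumes `sch` is an authenticated `ControlHub` object, that `sch.jobs` can be iterated, and that each job exposes `job_name` and `data_collector_labels`. The function name and the `dry_run` flag are hypothetical helpers, not SDK features. This sketch does not assume any server-side partial-name search, so it still pays the cost of listing every job once.

```python
def update_labels_by_name_fragment(sch, fragment, labels, dry_run=False):
    """Update labels on every job whose name contains `fragment`.

    Assumptions (verify against your SDK version):
      - iterating sch.jobs yields job objects
      - each job has job_name and data_collector_labels
      - sch.update_job(job) saves the change in Control Hub
    Returns the names of the matching jobs.
    """
    needle = fragment.lower()

    # Walk the job list once and keep only the matches,
    # rather than re-fetching the list for each job.
    matches = [job for job in sch.jobs if needle in job.job_name.lower()]

    if not dry_run:
        for job in matches:
            job.data_collector_labels = list(labels)
            sch.update_job(job)

    return [job.job_name for job in matches]
```

Running it first with `dry_run=True` returns the names that would be changed without saving anything, which is a cheap way to check the match before touching a large number of jobs.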