apache / shardingsphere-elasticjob

Distributed scheduled job
Apache License 2.0

Job doesn't execute after scheduling #1995

Open Wjcccccccccc opened 2 years ago

Wjcccccccccc commented 2 years ago

Bug Report

I create many jobs with the cron expression */5 * * * * ? like this:

        EventBus.fireEvent(new Event(EventBus.ASYNC_EVENT, "CreateElasticjob: " + System.currentTimeMillis(), (Runnable) () -> {
            new ScheduleJobBootstrap(
                    registryCenter,
                    this,
                    JobConfiguration
                            .newBuilder(jobName, 1)
                            .jobParameter(jobParameter)
                            .overwrite(true)
                            .monitorExecution(true)
                            .failover(true)
                            .jobErrorHandlerType("LOG")
                            .addExtraConfigurations(tracingConfiguration)
                            .cron(cron)
                            .build()
            ).schedule();
        }));

Some jobs are scheduled normally, but others never run after the scheduler prints its startup log:

 org.quartz.core.QuartzScheduler.start(QuartzScheduler.java:547) - Scheduler pk_status_push_17124_$_NON_CLUSTERED started.

After I enabled DEBUG-level logging, every job that failed to run had printed a log line like this:

2021-10-09 06:47:31,145 | pk_status_push_17124_Worker-1 |  | DEBUG | org.apache.shardingsphere.elasticjob.reg.exception.RegExceptionHandler.handleException(RegExceptionHandler.java:44) - Elastic job: ignored exception for: KeeperErrorCode = NoNode for /jato-socket_prod/pk_status_push_17124/servers/172.31.43.84
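
When many jobs are affected, it may help to pull the missing znode path out of each such log line and group the failures by node type. The following is only a sketch: the class and method names are hypothetical, and the assumption that paths look like /namespace/jobName/category/... is inferred from the log lines in this issue, not taken from ElasticJob documentation.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the missing znode path from a "NoNode" log line. */
final class NoNodeLogParser {

    // Matches the message text shown in the log lines of this issue.
    private static final Pattern NO_NODE =
            Pattern.compile("KeeperErrorCode = NoNode for (/\\S+)");

    /** Returns the znode path named in the log line, if the line is a NoNode message. */
    static Optional<String> missingPath(String logLine) {
        Matcher m = NO_NODE.matcher(logLine);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    /**
     * Returns the segment after "/namespace/jobName", e.g. "servers".
     * Assumes the path layout /namespace/jobName/category/... seen in the logs.
     */
    static Optional<String> nodeCategory(String path) {
        String[] parts = path.split("/");
        // parts[0] is empty because the path starts with "/".
        return parts.length > 3 ? Optional.of(parts[3]) : Optional.empty();
    }

    public static void main(String[] args) {
        String line = "Elastic job: ignored exception for: KeeperErrorCode = NoNode for "
                + "/jato-socket_prod/pk_status_push_17124/servers/172.31.43.84";
        missingPath(line).ifPresent(p ->
                System.out.println(p + " -> " + nodeCategory(p).orElse("unknown")));
    }
}
```

Counting the categories across all failed jobs would show whether the missing nodes are always the same kind or vary between runs.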

Which version of ElasticJob did you use?

ElasticJob: 3.0.0 curator: 5.1.0

Which project did you use? ElasticJob-Lite or ElasticJob-Cloud?

ElasticJob-Lite

Reason analyze (If you can)

Our test environment doesn't have this issue, but the production environment does, so I suspect the number of jobs has some effect.

TeslaCN commented 2 years ago

Hi @Wjcccccccccc For English only. Please translate your issue into English.

Wjcccccccccc commented 2 years ago

Sorry about that. I have edited the issue.

sunkai-cai commented 2 years ago

Is the registryCenter missing? Please check it.

Wjcccccccccc commented 2 years ago

Additional information: I start two kinds of jobs, A and B, at the same time. A runs every 5 seconds with the cron */5 * * * * ?. B is scheduled to shut down A after 5 minutes, at a specific point in time, with a cron such as 00 05 12 12 12 ?. OneOffJobBootstrap is not suitable here because it has to be triggered by an event, so I have to use ScheduleJobBootstrap with a specific time point.
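
For reference, a cron like the one used for B can be derived from a target time with java.time. This is a minimal sketch, not the code from my project: the class and method names are hypothetical, and it assumes the Quartz field order (second, minute, hour, day of month, month, day of week).

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: builds a Quartz-style cron that matches one point in time. */
final class OneShotCron {

    // Quartz field order: second minute hour day-of-month month day-of-week.
    // '?' means "no specific value" for day-of-week.
    private static final DateTimeFormatter QUARTZ_FIELDS =
            DateTimeFormatter.ofPattern("ss mm HH dd MM '?'");

    /** Cron expression matching the given date and time (no year field). */
    static String at(LocalDateTime fireTime) {
        return fireTime.format(QUARTZ_FIELDS);
    }

    /** Cron expression matching the time that lies `minutes` after `now`. */
    static String afterMinutes(LocalDateTime now, long minutes) {
        return at(now.plusMinutes(minutes));
    }

    public static void main(String[] args) {
        // Job A starts now; job B should fire five minutes later.
        System.out.println(afterMinutes(LocalDateTime.now(), 5));
    }
}
```

Because the expression has no year field, it matches the same date and time every year, so B has to be shut down after it fires if it must run only once.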

I ran a stress test that starts A and B 100 times. Sometimes A and B are not scheduled, and I find this log:

Elastic job: ignored exception for: KeeperErrorCode = NoNode for /jato-web_test/pk_end_2487/sharding/0/instance

My guess is that ElasticJob's sharding is too slow, so the job cannot yet find its instance when the trigger time arrives.
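
One way to test this guess would be to measure how long the sharding node takes to appear after schedule() is called. The sketch below only shows the polling part, and every name in it is hypothetical. The exists predicate is a stand-in for a real registry lookup, for example one that wraps Curator's checkExists().forPath(path) and checks for a non-null result.

```java
import java.util.function.Predicate;

/** Hypothetical diagnostic helper: measures how long a node takes to appear. */
final class NodeWait {

    /**
     * Polls `exists` until it returns true for `path` or the timeout elapses.
     * Returns the elapsed milliseconds on success, or -1 on timeout or interrupt.
     */
    static long awaitNode(String path, Predicate<String> exists,
                          long timeoutMillis, long pollMillis) {
        long start = System.nanoTime();
        long timeoutNanos = timeoutMillis * 1_000_000L;
        while (true) {
            if (exists.test(path)) {
                return (System.nanoTime() - start) / 1_000_000L;
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return -1L;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag for the caller and give up.
                Thread.currentThread().interrupt();
                return -1L;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in predicate: pretends the node appears 200 ms after start.
        long appearsAt = System.currentTimeMillis() + 200;
        long waited = awaitNode("/jato-web_test/pk_end_2487/sharding/0/instance",
                p -> System.currentTimeMillis() >= appearsAt, 2_000, 20);
        System.out.println("node appeared after ~" + waited + " ms");
    }
}
```

If the measured delay under load is regularly longer than the gap between schedule() and the first trigger, that would support the guess above.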