dbader / schedule

Python job scheduling for humans.
https://schedule.readthedocs.io/
MIT License

"at" with timezone parameter cause strange behavior #532

Closed scku208 closed 11 months ago

scku208 commented 2 years ago
import time
import schedule
import threading

def job():
    print("I'm running on thread %s" % threading.current_thread())

def run_threaded(job_func, *args, **kwargs):
    job_thread = threading.Thread(target=job_func, args=args, kwargs=kwargs)
    job_thread.start()

# run once at 19:34:59 UTC
schedule.every().day.at("19:34:59").do(run_threaded, job)

# should also run once, at 03:34:59 UTC+8, but instead runs the job repeatedly, once every second
schedule.every().day.at("03:34:59", 'Asia/Taipei').do(run_threaded, job)

# also has strange behavior
# schedule.every(1).minute.at(":10", 'Asia/Taipei').do(run_threaded, job)

# it works
# schedule.every(1).minute.at(":10", 'UTC').do(run_threaded, job)

while 1:
    schedule.run_pending()
    time.sleep(1)
InfernalAzazel commented 2 years ago

mine too

mew905 commented 1 year ago

You need to be more specific with "strange behavior".

I believe I encountered it as well, trying to run a script at 12:15am UTC and 12:15pm UTC. I can't find any reason for it: there are no errors or exceptions, and the application seems to believe it is running as intended. With the timezone set, however, it runs the job every second (on every pass of the time.sleep in the run_pending loop). I can work around it by using an if statement to verify the run time, but at that point I may as well just run it in an infinite loop and do my own checks.

scku208 commented 1 year ago

Thanks for the more specific description. For now I have just reported this behavior, and I convert between UTC and UTC+8 myself.
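A minimal sketch of that manual conversion, assuming the machine running the scheduler uses UTC as its local time and the target zone is a fixed UTC+8 offset with no DST. The helper name convert_at_time is hypothetical and not part of schedule; the result is a string that could be passed to .at() without the timezone argument.

```python
import datetime

def convert_at_time(hms, from_tz, to_tz, on_date=None):
    """Convert a wall-clock "HH:MM:SS" string from one zone to another.

    Hypothetical helper, not part of the schedule library.
    """
    on_date = on_date or datetime.date.today()
    wall = datetime.datetime.combine(
        on_date, datetime.time.fromisoformat(hms), tzinfo=from_tz
    )
    # astimezone() does the offset arithmetic; only the time of day is kept.
    return wall.astimezone(to_tz).strftime("%H:%M:%S")

# Fixed-offset stand-in for UTC+8 (assumption: no DST in the target zone).
UTC_PLUS_8 = datetime.timezone(datetime.timedelta(hours=8))

print(convert_at_time("03:34:59", UTC_PLUS_8, datetime.timezone.utc))  # 19:34:59
```

This matches the two times in the original report: 03:34:59 in UTC+8 is 19:34:59 UTC. For a zone that observes DST, the converted time would shift during the year, so a fixed string like this would not be enough.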

mew905 commented 1 year ago

Honestly, unless DST is a consideration, UTC is probably the best option to use. Python's time.time() function returns seconds since the epoch (midnight, Jan 1, 1970 UTC). Convert it to int (it returns a float with fractional seconds) and just take it modulo the frequency you want.

For example, I needed something to run at 12:15am and 12:15pm UTC. 12:15am is 15 minutes past midnight, so I launched it in a separate thread and did: if (int(time.time()) - 900) % 43200 == 0. The 900 is 15 minutes in seconds, and 43200 is 12 hours in seconds. Every time the modulo is 0 (a clean division by 12 hours), it runs. I then just set it in a loop that repeats every 0.9 seconds. There's a small chance it'll run twice in a second, but honestly I'm just zeroing values so it's not a big deal for me.
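The modulo check described above could look roughly like this. It is only a sketch: the names is_due and run_loop are made up for illustration, and the last_fired guard is an addition to avoid the double run within one second that the comment mentions.

```python
import time

OFFSET_S = 15 * 60       # 900 s: 15 minutes past the 12-hour boundary
PERIOD_S = 12 * 60 * 60  # 43200 s: 12 hours

def is_due(epoch_s, offset_s=OFFSET_S, period_s=PERIOD_S):
    """True when epoch_s falls exactly on 00:15:00 or 12:15:00 UTC."""
    return (epoch_s - offset_s) % period_s == 0

def run_loop(task, poll_s=0.9, max_polls=None):
    """Poll the clock and call task() once per matching second.

    max_polls is only there so the loop can be stopped in a test;
    leave it as None to run forever.
    """
    last_fired = None
    polls = 0
    while max_polls is None or polls < max_polls:
        now_s = int(time.time())
        # Remember the second we fired on so two polls landing in the
        # same second do not run the task twice.
        if is_due(now_s) and now_s != last_fired:
            task()
            last_fired = now_s
        time.sleep(poll_s)
        polls += 1
```

Polling slightly faster than once per second (0.9 s) means no whole second can be skipped, which is why the duplicate guard is needed at all.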

It's unfortunate that such an essential function also breaks essential functionality (running as scheduled in the first place), but considering the last update appears to have been 10 months ago, it seems a lost cause.

dshetyo commented 1 year ago

I am facing a similar issue: https://github.com/dbader/schedule/issues/592

takumi2786 commented 1 year ago

This function is not good.

https://github.com/dbader/schedule/blob/a3c2074dcd8f39168525f0fc2b009bc7119f0796/schedule/__init__.py#L702

takumi2786 commented 1 year ago

This works well in my situation.

def _schedule_next_run(self) -> None:
    """
    Compute the instant when this job should run next.
    """
    if self.unit not in ("seconds", "minutes", "hours", "days", "weeks"):
        raise ScheduleValueError(
            "Invalid unit (valid units are `seconds`, `minutes`, `hours`, "
            "`days`, and `weeks`)"
        )

    if self.latest is not None:
        if not (self.latest >= self.interval):
            raise ScheduleError("`latest` is greater than `interval`")
        interval = random.randint(self.interval, self.latest)
    else:
        interval = self.interval

    self.period = datetime.timedelta(**{self.unit: interval})

    self.next_run = datetime.datetime.now() + self.period
    if self.start_day is not None:
        if self.unit != "weeks":
            raise ScheduleValueError("`unit` should be 'weeks'")
        weekdays = (
            "monday",
            "tuesday",
            "wednesday",
            "thursday",
            "friday",
            "saturday",
            "sunday",
        )
        if self.start_day not in weekdays:
            raise ScheduleValueError(
                "Invalid start day (valid start days are {})".format(weekdays)
            )
        weekday = weekdays.index(self.start_day)
        days_ahead = weekday - self.next_run.weekday()
        if days_ahead <= 0:  # Target day already happened this week
            days_ahead += 7
        self.next_run += datetime.timedelta(days_ahead) - self.period
    if self.at_time is not None:
        if self.unit not in ("days", "hours", "minutes") and self.start_day is None:
            raise ScheduleValueError("Invalid unit without specifying start day")
        kwargs = {"second": self.at_time.second, "microsecond": 0}
        if self.unit == "days" or self.start_day is not None:
            kwargs["hour"] = self.at_time.hour
        if self.unit in ["days", "hours"] or self.start_day is not None:
            kwargs["minute"] = self.at_time.minute

        if self.at_time_zone is None:
            self.next_run = self.next_run.replace(**kwargs)  # type: ignore
        else:
            # ↓↓↓ Convert to the specified time zone before replace ↓↓↓
            self.next_run = self.next_run.astimezone(tz=self.at_time_zone)
            self.next_run = self.next_run.replace(**kwargs)
            self.next_run = self.next_run.astimezone().replace(tzinfo=None)
            # ↑↑↑

        # Make sure we run at the specified time *today* (or *this hour*)
        # as well. This accounts for when a job takes so long it finished
        # in the next period.
        if not self.last_run or (self.next_run - self.last_run) > self.period:
            if self.at_time_zone is None:
                now = datetime.datetime.now()
            else:
                # ↓↓↓ Obtain current time with timezone ↓↓↓
                now = datetime.datetime.now(tz=self.at_time_zone)
                # ↑↑↑

            if (
                self.unit == "days"
                and self.at_time > now.time()
                and self.interval == 1
            ):
                self.next_run = self.next_run - datetime.timedelta(days=1)
            elif self.unit == "hours" and (
                self.at_time.minute > now.minute
                or (
                    self.at_time.minute == now.minute
                    and self.at_time.second > now.second
                )
            ):
                self.next_run = self.next_run - datetime.timedelta(hours=1)
            elif self.unit == "minutes" and self.at_time.second > now.second:
                self.next_run = self.next_run - datetime.timedelta(minutes=1)
    if self.start_day is not None and self.at_time is not None:
        # Let's see if we will still make that time we specified today
        if (self.next_run - datetime.datetime.now()).days >= 7:
            self.next_run -= self.period

With this change, my_task is called once a day:

import schedule
import time
def my_task():
    print("This is a scheduled task")

schedule.every(1).day.at("03:52", tz='Asia/Tokyo').do(my_task)

while True:
    schedule.run_pending()
    time.sleep(1)
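The core idea of the patch (convert to the target zone, replace the clock fields there, then convert back) can be shown in isolation with plain datetime. This is a simplified sketch, not the library's implementation: the function name next_wall_clock_run is hypothetical, and a fixed UTC+9 offset stands in for 'Asia/Tokyo' on the assumption that the zone has no DST.

```python
import datetime

def next_wall_clock_run(now_utc, at_time, tz):
    """Next UTC instant at which the wall clock in tz reads at_time.

    Hypothetical helper for illustration only.
    """
    # 1. Move "now" into the target zone first.
    local_now = now_utc.astimezone(tz)
    # 2. Replace the clock fields there, not in the machine's local zone.
    candidate = local_now.replace(
        hour=at_time.hour,
        minute=at_time.minute,
        second=at_time.second,
        microsecond=0,
    )
    # 3. If that moment has already passed today, take tomorrow's.
    if candidate <= local_now:
        candidate += datetime.timedelta(days=1)
    # 4. Convert back so the result is comparable with other UTC times.
    return candidate.astimezone(datetime.timezone.utc)

UTC = datetime.timezone.utc
UTC_PLUS_9 = datetime.timezone(datetime.timedelta(hours=9))  # stand-in for Tokyo

now = datetime.datetime(2024, 1, 1, 12, 0, tzinfo=UTC)
print(next_wall_clock_run(now, datetime.time(3, 52), UTC_PLUS_9))
# 2024-01-01 18:52:00+00:00
```

Replacing the fields before converting, as the unpatched code path effectively did, sets 03:52 in the wrong zone and yields a next run that can sit in the past, which would be consistent with the job firing on every run_pending pass.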
SijmenHuizenga commented 11 months ago

Thank you everyone for reporting and investigating. The timezone bug described here has now been resolved with #583 and released in 1.2.1.

Closing this for now. Feel free to reach out if you have any more issues or concerns. 😄