earthcube / scheduler

Scheduling approaches related to gleaner tooling
Apache License 2.0

Revise pygen.py cron code #65

Closed: fils closed this issue 1 month ago

fils commented 11 months ago

I think the logic here for trying to do a default distribution of the sources is wrong.

I want to say that, given 21 days and 42 sources, a source would run every 12 hours (21 days is 504 hours; 504 / 42 = 12).

Around lines 23-24 I do:

        hours = int(days) * 24  # days is the cli param for the number of days to work over
        inc = round(hours / len(c["sources"]))  # divide the hours we want to run over by the number of sources to get the increment

Then, around line 52, I do (where days is from the command line):

                    di = int(days)
                    q = (((i * inc) / 24) % di) // 1
                    r = (i * inc) % 24
                    new_cron_schedule = "0 {} {} * *".format(r, int(q)+1)

trying to get this to align with cron logic. This is now hacked a bit to make it work for the large number of sources in IoW, and I hope it doesn't affect ECO.

I'll try to get this resolved to something working and logical soon, but comments or better solutions are welcome.
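
One possible way to get the even spread described above, offered as a sketch and not as the actual pygen.py code: work in minutes instead of rounded hours, so the increment never rounds down to 0 when there are more sources than hours. The function name `spread_cron` and the 28-day cap are assumptions made for illustration; the cap is there because a day-of-month cron field above 28 would not fire in every month.

```python
def spread_cron(n_sources: int, days: int) -> list[str]:
    """Return one cron expression per source, evenly spaced over `days` days.

    Hypothetical helper, not part of pygen.py.
    """
    if n_sources <= 0:
        return []
    if not 1 <= days <= 28:
        # Assumption: keep day-of-month within 1..28 so every month has that day.
        raise ValueError("days must be between 1 and 28")

    total_minutes = days * 24 * 60
    schedules = []
    for i in range(n_sources):
        # Integer offset in minutes from midnight on day 1 of the window.
        offset = (i * total_minutes) // n_sources
        day, remainder = divmod(offset, 24 * 60)
        hour, minute = divmod(remainder, 60)
        # Cron fields: minute hour day-of-month month day-of-week
        schedules.append(f"{minute} {hour} {day + 1} * *")
    return schedules


if __name__ == "__main__":
    for line in spread_cron(42, 21)[:4]:
        print(line)
```

With 21 days and 42 sources this gives one start every 12 hours, beginning at `0 0 1 * *`. Because the offset is computed in minutes, a large source list still gets distinct start times as long as there are fewer sources than minutes in the window.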

valentinedwv commented 11 months ago

I'm seeing runs only in the first 7 days of the month. That's fine for now. We can refine later.

valentinedwv commented 11 months ago

Rather than doing days, maybe say:

        parser.add_argument("-w", "--weekly", help="Spread the run over a week")
        parser.add_argument("-m", "--monthly", help="Spread the run over a month")
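
A minimal sketch of how those two flags might be wired up, assuming they are meant as on/off switches. The `action="store_true"` setting, the mutually exclusive group, the helper names `build_parser` and `days_for`, and the choice of 28 days for "monthly" are all assumptions added here; none of them come from the scheduler code.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the two proposed flags (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Generate cron schedules for sources")
    # Assumption: a run is spread over a week or a month, never both.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-w", "--weekly", action="store_true",
                       help="Spread the run over a week")
    group.add_argument("-m", "--monthly", action="store_true",
                       help="Spread the run over a month")
    return parser


def days_for(args: argparse.Namespace) -> int:
    """Map the flags to a window length in days (hypothetical helper)."""
    if args.weekly:
        return 7
    # Assumption: monthly (and the no-flag default) uses 28 days,
    # so the day-of-month value exists in every month.
    return 28


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(days_for(args))
```

Without `action="store_true"`, argparse would expect each flag to be followed by a value, which is probably not what is wanted for a simple switch.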

And if we have only 4 runs at a time, can we just say @weekly or @monthly? See https://docs.dagster.io/concepts/partitions-schedules-sensors/schedules#basic-schedules

valentinedwv commented 1 month ago

This will no longer be needed with the changes to run dynamically.