dagster-io / dagster

An orchestration platform for the development, production, and observation of data assets.
https://dagster.io
Apache License 2.0
11.3k stars 1.43k forks source link

Proper business day partition support #17464

Open sheldonrong opened 11 months ago

sheldonrong commented 11 months ago

What's the use case?

Support business day partitions from Monday to Friday.

What is the current issue: There is a workaround solution floating around the Internet that uses TimeWindowPartitions with the setup of "0 0 1-5", the problem with this approach is that the Friday's partition is extended to Sunday, which means the users won't see the Friday's partition until Monday hits and thus the Friday's partition cannot be materialized over the weekend.

See the screenshot posted below that captured one of our asset over the weekend, our partition definition is defined as

AWESOME_PARTITION = TimeWindowPartitionsDefinition(
    start=datetime(2016, 1, 5), fmt="%Y-%m-%d", cron_schedule="0 0 * * 1-5", timezone="Etc/GMT-2"
)

and over the weekend, the screenshot for one of our asset is shown below image

As you might see, 2023-10-27 which is Friday is not available in the list.

Ideas of implementation

No response

Additional information

Looking at the source code, the reason Friday is not showing up is because the Friday's partition is labled as

[Friday - Sunday], and given Sunday's isn't over, Friday's partition is not available to dagster. Atm, I have to do a ad-hoc patch in the TimeWindowPartitions's get_current_time function to bypass this issue, but obviously, that is not nice.

def get_current_timestamp(self, current_time: Optional[datetime] = None) -> float:
    dt = (
        pendulum.instance(current_time, tz=self.timezone)
        if current_time
        else pendulum.now(self.timezone)
    )
    if dt.weekday() in (5, 6):  # adhoc patch for BusinessDay partition only, not applicable to other TimeWindowPartitions
         dt = dt + timedelta(days=7 - dt.weekday())
    return dt.timestamp()

Would like to hear your thoughts and what's the best way of fixing this.

Message from the maintainers

Impacted by this issue? Give it a 👍! We factor engagement into prioritization.

tacastillo commented 11 months ago

Hi Sheldon! Thanks for raising this issue. Agreed, having to patch into it is not a good solution. We've shared this with the team and will triage it.