chezou / tdworkflow

Unofficial Treasure Workflow Client
Apache License 2.0

Proposal to enhance tdworkflow with schedule_from parameter #30

Closed ehaupt closed 6 months ago

ehaupt commented 6 months ago

I noticed that the PUT /api/projects endpoint in digdag allows passing a schedule_from parameter to specify a future time for starting the scheduling of new workflows. This feature could be really useful for planning in advance.

Would you consider adding support for this parameter in the client.create_project() method? I believe it could enhance the functionality of your library, and many users including myself would find it extremely beneficial.

Thank you for considering this addition.

chezou commented 6 months ago

Added in https://github.com/chezou/tdworkflow/pull/31. Can you try it before the release?

ehaupt commented 6 months ago

Awesome, thank you! I'll report back.

ehaupt commented 6 months ago

Test scenario

import os
from datetime import datetime, timedelta

import requests
import tdworkflow
import yaml

session = requests.Session()
client = tdworkflow.client.Client(
    endpoint="localhost:65432", apikey="", _session=session, scheme="http"
)

# current time
current_time = datetime.now()

# sample workflow
workflow = {
    "timezone": "Europe/Zurich",
    "schedule": {
        # one hour from now, zero-padded; wraps past midnight instead of yielding hour 24
        "daily>": (current_time + timedelta(hours=1)).strftime("%H:%M:%S")
    },
    "+Hello": {"echo>": "Hello"},
}

# create work/<project_name>/ and write the workflow there as <project_name>.dig
def mkproject(project_name, workflow):
    project_dir = f"work/{project_name}"
    os.makedirs(project_dir, exist_ok=True)
    with open(f"{project_dir}/{project_name}.dig", "w") as f:
        f.write(yaml.dump(workflow))
    return project_dir

# normal
project_name = "normal"
project_update = client.create_project(
    project_name=project_name, target_dir=mkproject(project_name, workflow)
)

# schedule_from
time_2_hours_from_now = current_time + timedelta(hours=2)
project_name = "schedulefrom"
project_update = client.create_project(
    project_name=project_name,
    target_dir=mkproject(project_name, workflow),
    schedule_from=time_2_hours_from_now,
)
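One caveat with the script above: `datetime.now()` returns a naive local time, while the workflow declares `Europe/Zurich`. A minimal sketch of building the two time values with an explicit timezone follows. The helper names `hours_from_now` and `daily_time` are hypothetical, and whether `create_project()` accepts a timezone-aware `datetime` for `schedule_from` is an assumption that this thread does not confirm.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Optional


def hours_from_now(
    hours: float,
    tz: tzinfo = timezone.utc,
    now: Optional[datetime] = None,
) -> datetime:
    """Return a timezone-aware datetime `hours` after `now`, expressed in `tz`.

    Hypothetical helper, not part of tdworkflow.
    """
    if now is None:
        now = datetime.now(tz)
    elif now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    # Add the offset in UTC so the result is exact elapsed time,
    # then convert back to the requested timezone.
    later_utc = now.astimezone(timezone.utc) + timedelta(hours=hours)
    return later_utc.astimezone(tz)


def daily_time(moment: datetime) -> str:
    """Format a datetime as a zero-padded HH:MM:SS string for a daily> schedule."""
    return moment.strftime("%H:%M:%S")
```

For the test scenario, one could pass `ZoneInfo("Europe/Zurich")` from the standard `zoneinfo` module as `tz`, use `daily_time(hours_from_now(1, tz))` for the `daily>` value, and `hours_from_now(2, tz)` for `schedule_from`.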

Expected behavior

Result

Normal

image

Schedulefrom

image

Conclusion

The schedule_from parameter implementation works as expected.

I noticed that you've also implemented two undocumented (but existing) parameters:

        :param clear_schedules: Clear schedules for the given workflow names
        :param clear_schedule_all: Clear all schedules

Having those available will come in handy. Thank you very very much.

ehaupt commented 6 months ago

Do you happen to know what setting clear_schedule_all actually does? I've created a project with a schedule, then created the same project again with a different schedule but this time set clear_schedule_all to True while watching:

digdag schedules

All I observed was that the "next scheduled to run at" field changed to the new time I had set.

chezou commented 6 months ago

I don't know much about it, but it looks like it affects the handling of last_session_time. My description in the docstring may be wrong. https://github.com/treasure-data/digdag/pull/1800/files#diff-2f645f181008bcd7e23844c3fd9b4a0f8d07f6430ba63aefdcb7c50f12329cfeR505

chezou commented 6 months ago

I've updated the docstring accordingly. 7e0ba331688100226d9a2a21f884b21755e28cc0

Judging from https://github.com/treasure-data/digdag/pull/1800/files#diff-afc0d65ad308597007aeb580d7e11432a12ff7080cc795d2e7306f8f931b03dbR133-R141 and https://github.com/treasure-data/digdag/pull/1800/files#diff-2f645f181008bcd7e23844c3fd9b4a0f8d07f6430ba63aefdcb7c50f12329cfeR547-R570, the clear parameters force digdag to forget the last_session_time of the affected schedules.

I think it's a good time to close this issue. I will release a new version soon. Thanks for your evaluation @ehaupt!

chezou commented 6 months ago

Released https://pypi.org/project/tdworkflow/0.9.0/

ehaupt commented 6 months ago

Thank you for the explanation and the new release.