GreenScheduler / cats

CATS: the Climate-Aware Task Scheduler :cat2: :tiger2: :leopard:
https://greenscheduler.github.io/cats/
MIT License

Documented command to apply with `at` scheduler broken #33

Closed sadielbartholomew closed 1 year ago

sadielbartholomew commented 1 year ago

Hi, a quick note that the command quoted in the README for using the output of cats with the at scheduling command, namely:

command | at `python -m cats -d --loc

is not working. Clearly there is a missing or spurious backtick in there, but even when one is added in a logical place, namely "<trivial job command to test> | at `python -m cats -d <job_duration> --loc <postcode>`" (I also tried it without, just to be sure, though logically that shouldn't work to my knowledge), the command won't work, e.g.:

$ mkdir mydir | at `python -m cats -d 120 --loc "RG6 6ES"`
{'timestamp': datetime.datetime(2023, 5, 13, 12, 30), 'carbon_intensity': 102.0, 'est_total_carbon': 102.0}
syntax error. Last token seen: B
Garbled time

I am not sure of the context, but I imagine this command might have worked before the latest changes to include the estimated carbon intensity. With the output now being a Python-like dict, it is not a valid at timestamp input (unless processed further, e.g. by piping it through another command to extract the timestamp), whereas if the output was just the timestamp before, that could have worked, given the extra backtick.

Solution for now

Until the CLI input and output format is tidied and we can provide a means to grab the timestamp only, to pass to at etc., we could either:

abhidg commented 1 year ago

It is not an issue with the dictionary being displayed as that is output to stderr, and at should grab the timestamp directly. There is a Best job start time printed that might cause an issue (maybe that is the last token seen 'B' error), but even after removing that, it does not work on my Mac. It appears that macOS and Linux have different default formats for at time specification (@andreww ran into this as well).

Debian accepts YY-MM-DD (https://manpages.debian.org/bullseye/at/at.1.en.html)

You can also say what day the job will be run, by giving a date in the form month-name day with an optional year, or giving a date of the form MMDD[CC]YY, MM/DD/[CC]YY, DD.MM.[CC]YY or [CC]YY-MM-DD.

macOS does not (https://ss64.com/osx/at.html)

The day on which the job is to be run may also be specified by giving a date in the form month-name day with an optional year, or giving a date of the forms DD.MM.YYYY, DD.MM.YY, MM/DD/YYYY, MM/DD/YY, MMDDYYYY, or MMDDYY.

I am not sure how the implementations distinguish between the ambiguous MM/DD and DD/MM forms. We should probably switch to using the -t flag, which is specified in POSIX and should behave identically on both:

-t: Specify the job time using the POSIX time format. The argument should be in the form [[CC]YY]MMDDhhmm[.SS]
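As a rough illustration of the -t suggestion, here is a minimal sketch of formatting a Python datetime into the [[CC]YY]MMDDhhmm form. The function name `to_posix_at_time` is hypothetical and is not part of cats; the sketch also assumes the datetime is already in the local time zone that at will use.

```python
from datetime import datetime


def to_posix_at_time(start: datetime) -> str:
    """Format a datetime as CCYYMMDDhhmm, the full-year form of
    [[CC]YY]MMDDhhmm[.SS] accepted by `at -t`.

    Hypothetical helper for illustration only; seconds are omitted.
    """
    return start.strftime("%Y%m%d%H%M")


# Example using the start time reported earlier in this thread.
print(to_posix_at_time(datetime(2023, 5, 13, 12, 30)))
```

The resulting string could then be passed as the argument to at -t, for example via command substitution in the shell.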

sadielbartholomew commented 1 year ago

Ah, I see. Thanks for the clarifications, Abhishek! I'm on Linux Mint (an Ubuntu variant), if that's useful to know.

Glad to see you're working on getting the command to work, too. I'll try to review the PR tomorrow.

colinsauze commented 1 year ago

@sadielbartholomew what does the output look like if you run cats without substituting it into at? e.g. just doing python -m cats -d 120 --loc "RG6 6ES"

You should get something like:

13:30 May 12 2023

I'd chosen this format because at on my system seemed to accept it, and it needed to include a time and not just a date. at (at least on Linux; I need to check the BSD/Mac manpage) also supports being given a relative start time, so we could get cats to calculate a relative start time instead of an absolute one.
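A minimal sketch of the relative-start-time idea, assuming the "now + count time-units" syntax documented in the Linux at manpage. The function name `relative_at_spec` is hypothetical and not part of cats, and whether other at implementations accept the same string would need checking.

```python
import math
from datetime import datetime


def relative_at_spec(start: datetime, now: datetime) -> str:
    """Build a relative time specification such as 'now + 90 minutes'.

    Hypothetical helper for illustration only. The delay is rounded up
    to whole minutes, and a start time in the past is clamped to zero.
    """
    delay_seconds = (start - now).total_seconds()
    minutes = max(0, math.ceil(delay_seconds / 60))
    unit = "minute" if minutes == 1 else "minutes"
    return f"now + {minutes} {unit}"


# Example: the best start time is 90 minutes after the current time.
print(relative_at_spec(datetime(2023, 5, 12, 14, 30), datetime(2023, 5, 12, 13, 0)))
```

A relative specification avoids the date-format differences entirely, at the cost of drifting slightly if there is a delay between running cats and running at.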

sadielbartholomew commented 1 year ago

Hi Colin, with regards to:

what does the output look like if you run cats without substituting it into at?

I get:

$ python -m cats -d 120 --loc "RG6 6ES"
{'timestamp': datetime.datetime(2023, 5, 12, 14, 30), 'carbon_intensity': 72.0, 'est_total_carbon': 72.0}
Best job start time: 2023-05-12 14:30:00

So that works well enough. Based on the evidence in this thread, especially the message "syntax error. Last token seen: B", is it not the "Best job start time: " context string that is confusing the at scheduler, when it expects the datetime input immediately? I am not sure whether that is something new that has been introduced since the hack day (I don't have time to check the code or git blame, etc.).

andreww commented 1 year ago

Yes - that 'Best job start time' was messing things up for me yesterday, but I hadn't given it much thought (I just blindly applied the change which is now in #35, and held my breath).

Anyway, looking now (OSX 13.3), this time format is a mess!

echo 'hi' | at -t 2023-05-13 16:00 schedules a job for Sat Oct 2 23:13:00 2021
echo 'hi' | at -t 05131600 schedules a job for Sat May 13 16:00:00 2023
echo 'hi' | at 05131600 gives at: garbled time
echo 'hi' | at 16:00 schedules a job for Sat May 13 16:00:00 2023 (tomorrow, it's 16:20 now)
echo 'hi' | at 1600 schedules a job for Sat May 13 16:00:00 2023 (tomorrow, it's 16:20 now)
echo 'hi' | at 1600 13.05.23 schedules a job for Sat May 13 16:00:00 2023
echo 'hi' | at 1600 05/13/23 schedules a job for Sat May 13 16:00:00 2023 (note the month comes first)
echo 'hi' | at -t 1600 05/13/23 gives out of range or illegal time specification: [[CC]YY]MMDDhhmm[.SS]

I think I need to lie down in a dark room.

That [[CC]YY]MMDDhhmm[.SS] format is supposed to be the POSIX time standard, so I think we should probably use that. What I'm not clear about is if -t (or, indeed the at command) is in the POSIX standard, or if we need to choose how we call at depending on the OS...

colinsauze commented 1 year ago

either I didn't push something or a subsequent commit has overwritten it, but I had it just outputting the time when we did the demo.

sadielbartholomew commented 1 year ago

On this note, can we have a command-line argument to toggle between a 'verbose' output, such as the Best job start time: <datetime> and the basic datetime-only format of <datetime> that is suitable to pipe to at or other scheduling tools?

I am not sure which is best for the default, but we could, say, have a -v and equivalent --verbose option if the latter is the default, or a -o or -m or similar option to indicate datetime only (-d is already taken I guess, so o for only or m for minimal, unless we decide to revamp the CLI structure).
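To make the proposal concrete, here is a minimal sketch of such a toggle using argparse. Everything in it is an assumption for discussion and not the current cats CLI: the `-v`/`--verbose` flag, the `build_parser` and `report` names, the choice to send the context line to stderr, and the timestamp format written to stdout.

```python
import argparse
import sys
from datetime import datetime


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser with a verbose toggle (not the real cats CLI)."""
    parser = argparse.ArgumentParser(prog="cats-sketch")
    parser.add_argument(
        "-v", "--verbose", action="store_true",
        help="also print a human-readable context line to stderr",
    )
    return parser


def report(start: datetime, verbose: bool, out=sys.stdout, err=sys.stderr) -> None:
    """Write only the timestamp to `out`, so it can be piped or substituted.

    With verbose=True, the context line goes to `err` instead, keeping
    `out` clean for at or other scheduling tools.
    """
    if verbose:
        print(f"Best job start time: {start}", file=err)
    # Assumed output format: CCYYMMDDhhmm, as accepted by `at -t`.
    print(start.strftime("%Y%m%d%H%M"), file=out)


# Usage sketch:
#   args = build_parser().parse_args()
#   report(best_start_time, args.verbose)
```

In this sketch the datetime-only output is the default, so the plain command stays safe to substitute into at, and the verbose text never reaches stdout.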

colinsauze commented 1 year ago

On this note, can we have a command-line argument to toggle between a 'verbose' output

are you suggesting that -v would enable/disable what we currently put on stderr or that this output would go to stdout when verbose is enabled?

colinsauze commented 1 year ago

either I didn't push something or a subsequent commit has overwritten it, but I had it just outputting the time when we did the demo.

@abhidg had half reintroduced my fix in PR #35, but I get some odd behaviour with the format he put in; see the pull request for more details.