NOAA-GFDL / fre-cli

Python-based command line interface for FRE (FMS Runtime Environment) to compile and run FMS-based models and post-process their output.
GNU Lesser General Public License v3.0
3 stars 11 forks source link

Add option for overwriting/cloning over old experiments and better error messages #44

Open cwhitlock-NOAA opened 9 months ago

cwhitlock-NOAA commented 9 months ago

--overwrite option for checkoutSscript.py that allows for removing and re-checking-out over a preexisting experiment. Cleans up prior directories via cylc clean if needed, gives a warning message with a pause when overwriting experiments, and provides messages about the process.

ceblanton commented 9 months ago

This is great, thank you. There are two sorts of overwriting here, though.

There's the ~/cylc-src/NAME location, which is the workflow template checkout.

If that location exists, what does overwrite do? It seems a little destructive to do a "rm -rf" on the user's behalf. You could also consider renaming it instead to ~/cylc-src/NAME.old.blah, where blah might be some timestamp. (Bronx fremake renames the old directory as opposed to deleting it when --force-checkout is used).

Then there's the ~/cylc-run/NAME location, which is the installed workflow.

If that location exists, then yes, I think a cylc clean NAME on the users's behalf is what is needed.

cwhitlock-NOAA commented 9 months ago

If that location exists, what does overwrite do? It seems a little destructive to do a "rm -rf" on the user's behalf. You could also consider renaming it instead to ~/cylc-src/NAME.old.blah, where blah might be some timestamp. (Bronx fremake renames the old directory as opposed to deleting it when --force-checkout is used).

Good point! I'm used to thinking of that as two different behaviors - a "move" option, to keep the old experiment and files under a new (pre-determined) name, and an "erase"/"overwrite" option that deletes the prior experiment. The rm-rf is the erase/overwrite and the renaming is the move option. I'm biased towards keeping both options, but if --force-checkout corresponds to a move-like behavior in bronx fremake, it makes sense to keep that connection and keep the more destructive rm -rf as a different option (--force-clean? --force-rerun?).

On Fri, Mar 1, 2024 at 5:35 PM Chris Blanton @.***> wrote:

This is great, thank you. There are two sorts of overwriting here, though.

There's the ~/cylc-src/NAME location, which is the workflow template checkout.

If that location exists, what does overwrite do? It seems a little destructive to do a "rm -rf" on the user's behalf. You could also consider renaming it instead to ~/cylc-src/NAME.old.blah, where blah might be some timestamp. (Bronx fremake renames the old directory as opposed to deleting it when --force-checkout is used).

Then there's the ~/cylc-run/NAME location, which is the installed workflow.

If that location exists, then yes, I think a cylc clean NAME on the users's behalf is what is needed.

— Reply to this email directly, view it on GitHub https://github.com/NOAA-GFDL/fre-cli/issues/44#issuecomment-1974026191, or unsubscribe https://github.com/notifications/unsubscribe-auth/AC362WS5LPCSTRVPK3KQVNTYWD7EFAVCNFSM6AAAAABECPRBRCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNZUGAZDMMJZGE . You are receiving this because you authored the thread.Message ID: @.***>

-- Carolyn Whitlock

bcc2761 commented 9 months ago

I second this proposal with two options. I think the erase/overwrite option should definitely be implemented, but I'm wondering if we give users the ability to simply just rename the old one (how --force-checkout in Bronx works right now), that users might use this command more than once and it would then have the same issue again where the pre-determined 'old experiment run' name would now have the 'already exists' error message. Do we want to allow them to move/rename multiple older experiment runs of the same name? This would be done with a counter of some sort, or else it would probably be better to not give them this option at all. Thoughts?

ceblanton commented 9 months ago

It's a little confusing to have the "workflow definition" configuration in ~/cylc-src/NAME and the "installed workflow" in ~/cylc-run/NAME, but we need to help make this as understandable as possible.

We could start with no extra options, but try to document the usages anyway. i.e.

If ~/cylc-src/NAME exists, the message could be:

"The workflow definition specified by -e/-p/-t already exists at the location ~/cylc-src/NAME. If you would like to recheck-out a fresh workflow definition, move or remove the ~/cylc-src/NAME location, and run this command again."

cwhitlock-NOAA commented 9 months ago

I'm wondering if we give users the ability to simply just rename the old one (how --force-checkout in Bronx works right now), that users might use this command more than once and it would then have the same issue again where the pre-determined 'old experiment run' name would now have the 'already exists' error message. Do we want to allow them to move/rename multiple older experiment runs of the same name? This would be done with a counter of some sort, or else it would probably be better to not give them this option at all. Thoughts?

You're on track with the counter - the way that I've seen that dealt with is to number the moved experiments sequentially - either by numbering the moves (experiment moves to experiment-001, then re-running moves to experiment-002) or by date-and-timestamps (experiment moved to experiment-today's date, re-running moves to experiment-later-date)

On Mon, Mar 4, 2024 at 11:42 AM Bennett Chang @.***> wrote:

I second this proposal with two options. I think the erase/overwrite option should definitely be implemented, but I'm wondering if we give users the ability to simply just rename the old one (how --force-checkout in Bronx works right now), that users might use this command more than once and it would then have the same issue again where the pre-determined 'old experiment run' name would now have the 'already exists' error message. Do we want to allow them to move/rename multiple older experiment runs of the same name? This would be done with a counter of some sort, or else it would probably be better to not give them this option at all. Thoughts?

— Reply to this email directly, view it on GitHub https://github.com/NOAA-GFDL/fre-cli/issues/44#issuecomment-1977016336, or unsubscribe https://github.com/notifications/unsubscribe-auth/AC362WUKF2A4GEAD7AVT3LLYWSP7PAVCNFSM6AAAAABECPRBRCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNZXGAYTMMZTGY . You are receiving this because you authored the thread.Message ID: @.***>

-- Carolyn Whitlock

cwhitlock-NOAA commented 9 months ago

Of course, the problem with the sequential counter is that you need to move all of the files generated by the experiment as well - which is less of an issue for the files in cylc-run/experiment, but also should probably apply to any files in /archive (and /ptmp for diagnostic work on the developer side?).

On Mon, Mar 4, 2024 at 12:02 PM Carolyn Whitlock - NOAA Affiliate < @.***> wrote:

I'm wondering if we give users the ability to simply just rename the old

one (how --force-checkout in Bronx works right now), that users might use this command more than once and it would then have the same issue again where the pre-determined 'old experiment run' name would now have the 'already exists' error message. Do we want to allow them to move/rename multiple older experiment runs of the same name? This would be done with a counter of some sort, or else it would probably be better to not give them this option at all. Thoughts?

You're on track with the counter - the way that I've seen that dealt with is to number the moved experiments sequentially - either by numbering the moves (experiment moves to experiment-001, then re-running moves to experiment-002) or by date-and-timestamps (experiment moved to experiment-today's date, re-running moves to experiment-later-date)

On Mon, Mar 4, 2024 at 11:42 AM Bennett Chang @.***> wrote:

I second this proposal with two options. I think the erase/overwrite option should definitely be implemented, but I'm wondering if we give users the ability to simply just rename the old one (how --force-checkout in Bronx works right now), that users might use this command more than once and it would then have the same issue again where the pre-determined 'old experiment run' name would now have the 'already exists' error message. Do we want to allow them to move/rename multiple older experiment runs of the same name? This would be done with a counter of some sort, or else it would probably be better to not give them this option at all. Thoughts?

— Reply to this email directly, view it on GitHub https://github.com/NOAA-GFDL/fre-cli/issues/44#issuecomment-1977016336, or unsubscribe https://github.com/notifications/unsubscribe-auth/AC362WUKF2A4GEAD7AVT3LLYWSP7PAVCNFSM6AAAAABECPRBRCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNZXGAYTMMZTGY . You are receiving this because you authored the thread.Message ID: @.***>

-- Carolyn Whitlock

-- Carolyn Whitlock

bcc2761 commented 9 months ago

How does Bronx deal with this right now? Does this ever happen? @ceblanton

cwhitlock-NOAA commented 8 months ago

As of March 8, chat with @ceblanton we seem to have the following consensus: