DeltaRCM / pyDeltaRCM

Delta model with a reduced-complexity approach
https://deltarcm.org/pyDeltaRCM/
MIT License

add longer tests to the test_consistent file #51

Open amoodie opened 4 years ago

amoodie commented 4 years ago

Longer tests will help make sure more cases are covered by the consistency checks.

Use a few different model configurations and random seeds in different checks to increase redundancy in the consistency checking.

Original suggestion here
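
Something along these lines is what I have in mind (just a sketch; the parameter names follow the documented YAML options, but the seeds, values, and reference statistics are placeholders rather than tested fixtures):

```python
# Hypothetical sketch: parametrize consistency checks over several
# configurations and random seeds. The reference values would need to be
# generated from a trusted run and stored alongside the tests.
import numpy as np
import pytest
import pyDeltaRCM


@pytest.mark.parametrize(
    ("seed", "f_bedload", "n_steps"),
    [(0, 0.25, 50), (42, 0.5, 50), (7, 0.25, 200)],
)
def test_longer_run_consistency(tmp_path, seed, f_bedload, n_steps):
    delta = pyDeltaRCM.DeltaModel(
        out_dir=str(tmp_path), seed=seed, f_bedload=f_bedload
    )
    for _ in range(n_steps):
        delta.update()
    delta.finalize()

    # Check the bed elevation field is sane, and compare summary statistics
    # against stored reference values (placeholder shown commented out).
    assert np.isfinite(delta.eta).all()
    # assert delta.eta.mean() == pytest.approx(REFERENCE_MEANS[(seed, f_bedload, n_steps)])
```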

amoodie commented 4 years ago

Per discussion on #54:

Or were you thinking of a long, serious model run? The latter could be configured as another "job" on Travis, to call some Python script that runs the model for a few hundred timesteps (jobs time out at some point...).

I just can't wrap my head around how we would go about testing the ability of the model (as we continue to alter it) to transition from the initial jet to a reasonable channel network. This (at least to me) is one of the critical things pyDeltaRCM has to do, and there is no obvious or intuitive way to test it without a model run of many timesteps.

I don't know how we get around it, but that is my concern: if we break the model and it no longer forms a channel network, then we have to revert to an older version of the code with no real knowledge of where the 'breaking' change was made.

Maybe initializing tests from various pre-developed topographies will help get around running super long tests; not sure, just an early thought.
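
Roughly what I am imagining (a sketch only; it assumes the bed elevation array `eta` can simply be overwritten after instantiation, which may really need a proper checkpoint/restart mechanism, and `developed_eta.npy` is a placeholder for a stored snapshot):

```python
# Hypothetical sketch: start a consistency test from a saved, already-developed
# topography rather than from the initial jet, so only a handful of timesteps
# are needed to check that a channel network is maintained.
import numpy as np
import pyDeltaRCM

delta = pyDeltaRCM.DeltaModel(out_dir="output", seed=42)

# 'developed_eta.npy' is a placeholder for a bed-elevation snapshot saved
# from a long reference run on the same grid.
saved_eta = np.load("developed_eta.npy")
assert saved_eta.shape == delta.eta.shape
delta.eta[:] = saved_eta

# A short run from the developed state is much cheaper to check than
# growing a channel network from scratch.
for _ in range(25):
    delta.update()
delta.finalize()
```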

amoodie commented 4 years ago

Hey @elbeejay

I had a thought about this and wanted to run it by you. With GitHub Actions, we can schedule a job to run at a specified time, say weekly at 1 am EST on Saturday, when loads would otherwise be low. The job times out at 6 hours.

We could schedule a job to run the model with some config known to produce a "good" channel network, and let it run weekly. When the job finishes (we should target a 4-5 hour run time), we can use this actions call to post the resulting "eta" map to a Slack channel, and upload the artifacts (images, logs, etc.) so we can check them out if there are any problems.
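
Something like the following script is what the scheduled job could call (a sketch assuming only the documented `DeltaModel` interface plus matplotlib; the step count, seed, and file names are placeholders to be tuned to the 4-5 hour target):

```python
# Hypothetical sketch of the script a weekly scheduled Actions job could run:
# a long model run with a configuration known to produce a good channel
# network, saving the final bed elevation ("eta") map as an image so it can
# be uploaded as a workflow artifact and posted to Slack.
import matplotlib
matplotlib.use("Agg")  # no display on a CI runner
import matplotlib.pyplot as plt

import pyDeltaRCM

delta = pyDeltaRCM.DeltaModel(out_dir="weekly_run", seed=0)

for _ in range(500):  # "a few hundred timesteps"; tune to the time budget
    delta.update()
delta.finalize()

fig, ax = plt.subplots()
im = ax.imshow(delta.eta, cmap="cividis")
fig.colorbar(im, ax=ax, label="bed elevation (eta)")
ax.set_title("weekly develop-branch run: final eta")
fig.savefig("weekly_run/final_eta.png", dpi=200)
```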

I'm pretty sure this is fine and in line with Actions terms of service, but I would double check before doing anything. Thoughts?

elbeejay commented 4 years ago

That sounds pretty neat. I guess the ultimate goal would be to tie in a set of tests and accomplish the longer-run validation this way? With the aim of eventually moving away from a manual weekly check of the results toward a more automated approach (maybe only running the long run if there was a push that week, or something)?

It seems like a good find and would give us a way to do these longer tests. The Actions documentation and terms of use seem pretty open about the types of workflows you run so long as you're under the usage limits, so I too am thinking this would be okay.

amoodie commented 4 years ago

Yeah, I wasn't really even thinking about explicit tests, but just thought it would be better than nothing to get a notification once a week with an image of a delta run from the latest build on develop. It would at least help us catch issues in a reasonable time frame.

That said, I guess there's really no reason we can't run a 4-hour job each time someone opens a PR, and somehow automate posting a comment to the PR with the delta...

I checked on the ToS, and I think this use would be fine, because it is part of our "testing".

Anyway, just some thoughts, I'd like to get to this eventually but it's probably not high priority.