hubverse-org / hubValidations

Testing framework for hubverse hub validations
https://hubverse-org.github.io/hubValidations/
Other
1 stars 3 forks source link

Add finer control of file mods/deletion checks. Resolves #65 #66

Closed annakrystalli closed 6 months ago

annakrystalli commented 6 months ago

Hey @LucieContamin & @elray1 !

I've taken all your comments on board and introduced two new arguments tovalidate_pr() for controlling modification/deletions checks performed on model output and model metadata files.

They main issue I hit that ended up in some more time and complicated code than I hoped for is summarised as a note in the function docs too:

Note that to establish relative submission windows when performing modification/deletion checks and allow_submit_window_mods is TRUE, the reference date is taken as the round_id extracted from the file path (i.e. submit_window_ref_date_from is always set to "file_path"). This is because we cannot extract dates from columns of deleted files. If hub submission window reference dates do not match round IDs in file paths, currently allow_submit_window_mods will not work correctly and is best set to FALSE. This only relates to hubs/rounds where submission windows are determined relative to a reference date and not when explicit submission window start and end dates are provided in the config.

Up to now and by default in ths PR, relative submission windows have been determined using contents of the file (which follows how the config files are set up to give the reference column name relative to which the window is determined). So far this date always matched the round ID in the file path so I opted for the option above to get round the problem described. This does mean that two submission checks (one when we check whether the submission falls within the submission window as part of validate_submission (which is also used separately, locally by users) and once now when allow_submit_window_mods = TRUE and mods/deletions are detected) use two different sources of reference date to calculate the window (the former uses the file contents to follow how config is structured more faithfully and allow more flexibility while the later the file path to allow for checking deleted file windows).

I obviously don't love this duplication and difference in ref date source but I dislike the thought of introducing irrelevant arguments to validate_submission in order to pass upstream arguments from validate_pr as well introducing functionality to handle deleted files in validate_submission more.

I feel the separation of the task of checking and alerting about mods & deletions to validate_pr() is correct and I've tried to handle the idiosyncrasies of checking for submission windows on deleted files as best I could, especially through documentation but if you have a better suggestion or can envisage a big problem I've overlooked let me know!

gioco commented 6 months ago

Hi, we are experiencing an issue related to this. Contributing teams shall be allowed to modify their submission while within the submission window. Is there a way to allow it with the former version of the code (the one in the main branch)? If not, can you provide an estimate about when the PR will be merged and the allow_submit_window_mods parameter enabled? Thank you very much!

annakrystalli commented 6 months ago

Hi @gioco ! Hopefully this will be merged within the next couple of days.

Just to clarify, the issue your team's are having is they are submitting files, the files are getting merged and they are committing modified versions after that but within the submission window? And you want that to be allowed?

That is not possible with the current version of the code (it is quite strict) but will be with this PR. You can just ignore those errors for the time being though.

gioco commented 6 months ago

Hi Anna, yes exactly. We have a workflow that automatically merges the submission if all validations are passed. A team submitted forecasts yesterday, and today wants to update them. Submission window closes tomorrow, and we want updates to be allowed while the submission window is still "open". Thank you very much for your prompt reply.

annakrystalli commented 6 months ago

So yes, sadly, checks will fail currently so might need to do some manual merging just for a couple of days.

You could also see whether you could briefly set your Github Action to install the hubValidations package from this branch instead of the main branch and change it back when the PR is merged. A little risky though as this PR hasn't been reviewed yet.

gioco commented 6 months ago

Will go with the manual merges for this couple of days! Thanks!

LucieContamin commented 6 months ago

I like this new functionality, we often also have updates during a "round" of submission. It seems to be all good, the function documentation is clear. Thanks @annakrystalli !

annakrystalli commented 6 months ago
annakrystalli commented 6 months ago

Hello again @elray1 !

As we're planning to push this through quickly, I spent some time thinking ahead to exceptions and refactored validate_pr and internal functions to be more robust, specifically by handling errors in calculating submission windows at the individual file level, so that it doesn't cause an execution error at validate_pr level. I also added more better/tests.

I made some other minor improvements and added more comments so hopefully, between that and the refactoring, it should be much easier to follow what's going on.

If you could take a quick look at just this file: https://github.com/Infectious-Disease-Modeling-Hubs/hubValidations/blob/474f8b66c661696945310535dbb542ec6fa7437c/R/validate_pr.R

I'd really appreciate it. Then we can go ahead and merge in! 🎉