hubverse-org / hubValidations

Testing framework for hubverse hub validations
https://hubverse-org.github.io/hubValidations/

hubValidations::validate_pr() fails with generic error [Error: The operation was canceled] #115

Open M-7th opened 3 weeks ago

M-7th commented 3 weeks ago

With our current, fairly complex tasks.json configuration, we are running into problems in the validation phase. After running for a variable time (from a few minutes up to 4 to 6 hours, depending on the input file), the validation process, which runs inside a GitHub Actions workflow, fails with a generic error: "Error: The operation was canceled. The runner has received a shutdown signal." Running the same validation manually on a local machine, we see high memory consumption (> 25 GB). In a couple of tests we also got a more specific error:

── Individual check results ──

✔ ScenarioModellingHub: All hub config files are valid.
✔ 2024_2025_1_FLU-ISI-TestModel.parquet: File exists at path model-output/ISI-TestModel/2024_2025_1_FLU-ISI-TestModel.parquet.
✔ 2024_2025_1_FLU-ISI-TestModel.parquet: File name "2024_2025_1_FLU-ISI-TestModel.parquet" is valid.
✔ 2024_2025_1_FLU-ISI-TestModel.parquet: File directory name matches `model_id` metadata in file name.
✔ 2024_2025_1_FLU-ISI-TestModel.parquet: `round_id` is valid.
✔ 2024_2025_1_FLU-ISI-TestModel.parquet: File is accepted hub format.
✔ 2024_2025_1_FLU-ISI-TestModel.parquet: Metadata file exists at path model-metadata/ISI-TestModel.yml.
✔ 2024_2025_1_FLU-ISI-TestModel.parquet: File could be read successfully.
✔ 2024_2025_1_FLU-ISI-TestModel.parquet: `round_id_col` name is valid.
✔ 2024_2025_1_FLU-ISI-TestModel.parquet: `round_id` column "round_id" contains a single, unique round ID value.
✔ 2024_2025_1_FLU-ISI-TestModel.parquet: All `round_id_col` "round_id" values match submission `round_id` from file name.
✔ 2024_2025_1_FLU-ISI-TestModel.parquet: Column names are consistent with expected round task IDs and std column names.
✔ 2024_2025_1_FLU-ISI-TestModel.parquet: Column data types match hub schema.
✖ 2024_2025_1_FLU-ISI-TestModel.parquet: EXEC ERROR: Error in purrr::map2(task_id_l, output_type_l, ~expand_output_type_grid(task_id_values = .x, : ℹ
  In index: 1. Caused by error: ! vector memory exhausted (limit reached?)
✔ 2024_2025_1_FLU-ISI-TestModel.parquet: Submission time is within accepted submission window for round.

── Overall validation result ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
✖ 2024_2025_1_FLU-ISI-TestModel.parquet: EXEC ERROR: Error in purrr::map2(task_id_l, output_type_l, ~expand_output_type_grid(task_id_values = .x, : ℹ
  In index: 1. Caused by error: ! vector memory exhausted (limit reached?)
Error in `hubValidations::check_for_errors()`:
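
The failing check stops inside expand_output_type_grid() with "vector memory exhausted". One plausible reading, assumed here and not confirmed by the log, is that this step builds the full cross product of the allowed task ID values and output type IDs. If so, the row count is the product of the number of values per task ID, which grows very quickly. The sketch below estimates that size without building the grid. The function name estimate_grid, the task ID names, the value counts, and the 8 bytes per cell are all hypothetical and are not taken from our tasks.json; the arithmetic is language-independent and is written in Python only for brevity.

```python
from math import prod


def estimate_grid(task_id_values, n_output_type_ids=1, bytes_per_cell=8):
    """Estimate rows and bytes of a full cross product without building it.

    task_id_values: mapping of task ID name -> collection of allowed values.
    n_output_type_ids: number of output type IDs crossed with each combination.
    bytes_per_cell: rough per-cell cost (assumption: 8 bytes, like a double).
    """
    n_rows = prod(len(values) for values in task_id_values.values()) * n_output_type_ids
    n_cols = len(task_id_values) + 1  # task ID columns plus one output type ID column
    return n_rows, n_rows * n_cols * bytes_per_cell


# Hypothetical task IDs and counts, for illustration only.
task_ids = {
    "origin_date": range(1),
    "scenario_id": range(4),
    "location": range(30),
    "target": range(2),
    "horizon": range(52),
    "age_group": range(5),
}

rows, size = estimate_grid(task_ids, n_output_type_ids=100)
print(f"{rows:,} rows, about {size / 1e9:.2f} GB")
```

Running the same estimate with the real counts from tasks.json would show whether a single model task can plausibly account for the > 25 GB observed locally. Adding one more task ID with 10 values multiplies the estimate by 10.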

We created a test repo, with a configuration mirroring the one used in Respicompass and the same validation workflow, here: https://github.com/Testing-Forecast-Actions/TestingValidations

A file for testing the validation process through a pull request is available in the folder Testing-Forecast-Actions/TestingValidations//submission-test-files

You can trigger the validation process by forking the hub, copying the test file into the model-output/ISI-TestModel folder, and opening a pull request.

annakrystalli commented 3 weeks ago

Test repo: https://github.com/Testing-Forecast-Actions/TestingValidations