hubverse-org / hubValidations

Testing framework for hubverse hub validations
https://hubverse-org.github.io/hubValidations/

Feature/ Handle V3 sample specification #82

Closed annakrystalli closed 2 weeks ago

annakrystalli commented 1 month ago

This PR implements, and adds new tests for, checks on the validity of sample submissions under the v3 schema sample spec. See v3 sample validation spec for details.

Specific Sample validation tests implemented (#80)

The key to the new functions check_tbl_spl_n(), check_tbl_spl_non_compound_tid() and check_tbl_spl_compound_tid() is a table of hashes on model output data joined to the output of the new hubData::expand_model_out_val_grid(include_sample_ids = TRUE), where the output type id column for v3 samples effectively contains the compound_idx. The hashes are calculated on the relevant subsets of values of each sample and aggregated/counted at the relevant level for each check, i.e.:

These checks are performed separately for each round modeling task item, allowing for differences between compound task id sets between round modeling tasks.
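To make the hashing idea above concrete, here is a minimal, language-agnostic sketch in Python. It is not the package's R implementation and does not use hubData::expand_model_out_val_grid(); the helper names (`hash_rows`, `summarise_samples`, `samples_per_compound_unit`), the toy task IDs (`location`, `horizon`) and the choice of `location` as the only compound task ID are all assumptions made for illustration. Each sample's task ID values are reduced to hashes, which are then compared or counted at the level each check needs.

```python
import hashlib
from collections import defaultdict


def hash_rows(rows):
    """Order-independent hash of a collection of value tuples (hypothetical helper)."""
    canonical = repr(sorted(set(rows)))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


def summarise_samples(rows, compound_taskids, non_compound_taskids):
    """Build one summary record per sample ID (the output_type_id column)."""
    by_sample = defaultdict(list)
    for row in rows:
        by_sample[row["output_type_id"]].append(row)

    summary = {}
    for sample_id, sample_rows in by_sample.items():
        compound = [tuple(r[k] for k in compound_taskids) for r in sample_rows]
        non_compound = [tuple(r[k] for k in non_compound_taskids) for r in sample_rows]
        summary[sample_id] = {
            # A valid sample covers exactly one compound task ID combination.
            "n_compound_combos": len(set(compound)),
            "compound_hash": hash_rows(compound),
            # Valid samples should all share the same non-compound hash.
            "non_compound_hash": hash_rows(non_compound),
        }
    return summary


def samples_per_compound_unit(summary):
    """Count samples per compound task ID combination (for a min/max sample check)."""
    counts = defaultdict(int)
    for info in summary.values():
        counts[info["compound_hash"]] += 1
    return dict(counts)


# Toy data: sample s3 is missing horizon 2, so its non-compound hash differs.
rows = [
    {"location": "US", "horizon": 1, "output_type_id": "s1"},
    {"location": "US", "horizon": 2, "output_type_id": "s1"},
    {"location": "US", "horizon": 1, "output_type_id": "s2"},
    {"location": "US", "horizon": 2, "output_type_id": "s2"},
    {"location": "CA", "horizon": 1, "output_type_id": "s3"},
]
summary = summarise_samples(rows, ["location"], ["horizon"])
counts = samples_per_compound_unit(summary)
```

Comparing hashes rather than full sets of rows keeps the per-sample summary small, which is presumably part of the appeal of the approach; in this sketch a mismatching `non_compound_hash` flags an incomplete sample and the per-combination counts can be compared against configured minimum and maximum sample numbers.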

Still to do:

github-actions[bot] commented 1 month ago

🚀 Deployed on https://66728738769adaca9558e175--hubvalidations-pr-previews.netlify.app

annakrystalli commented 3 weeks ago

I was also debating if it makes sense to have somewhere in the documentation (function documentation and/or vignette) a warning saying that large files with a lot of samples might take time to validate. However, as we don't have a clear estimate of what is "large" and "take time", I am not sure how helpful it is.

While it would be useful, I agree that, since "large" and "take time" are hard to define properly, I'm not sure how useful it would be. There are plans to improve the performance of validations, though, so as part of that work we might get a better sense of what would be useful in the documentation. I'll draft the performance issue today and make a note about including more on performance in the docs too.

annakrystalli commented 2 weeks ago

Firstly, thanks so much for your review and thorough testing @LucieContamin! It's been really useful to work through. In response I've made a number of changes to the functionality / docs:

Let me know if these resolve your issues for the time being and feel free to open more issues if you think there's more that needs to be addressed.