hubverse-org / hubData

Tools for accessing and working with hubverse Hub data
https://hubverse-org.github.io/hubData/

connect_hub and connect_model_output: add option to skip data checks for model-output files #43

Open bsweger opened 2 weeks ago

bsweger commented 2 weeks ago

Background

This is a follow-up to the performance investigation for cloud-based hubs: #37

During that investigation, we found two places in connect_hub and connect_model_output that slow down hubData when it connects to larger S3-based hubs:

To ensure users of cloud-based hubs can connect in a reasonable amount of time, we decided to add an optional parameter to connect_hub and connect_model_output that specifies whether to perform the above checks. When the user does not supply it, the checks will be skipped by default for cloud-based hubs.
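As a rough sketch of how that default could be resolved (not the actual hubData implementation): the parameter name `skip_checks`, the helpers `resolve_skip_checks` and `connect_sketch`, and the `s3://` prefix test for detecting a cloud hub are all hypothetical and only here for illustration. It also assumes local hubs keep running the checks unless told otherwise.

```r
# Hypothetical helper: decide whether to skip the model-output file checks.
# `skip_checks` and `is_cloud_hub` are illustrative names, not hubData's API.
resolve_skip_checks <- function(skip_checks = NULL, is_cloud_hub = FALSE) {
  if (is.null(skip_checks)) {
    # Not supplied by the user: skip only when the hub is cloud-based.
    return(isTRUE(is_cloud_hub))
  }
  # An explicit value from the user always wins, but must be a single TRUE/FALSE.
  stopifnot(
    is.logical(skip_checks),
    length(skip_checks) == 1L,
    !is.na(skip_checks)
  )
  skip_checks
}

# Hypothetical stand-in for a connection function. It only reports whether
# the checks would run; it does not open any files.
connect_sketch <- function(hub_path, skip_checks = NULL) {
  # Assumption: a cloud hub is recognised here by an "s3://" prefix.
  is_cloud <- grepl("^s3://", hub_path)
  skip <- resolve_skip_checks(skip_checks, is_cloud)
  list(hub_path = hub_path, checks_run = !skip)
}

cloud_default <- connect_sketch("s3://example-hub")
local_default <- connect_sketch("/path/to/local/hub")
cloud_forced  <- connect_sketch("s3://example-hub", skip_checks = FALSE)
```

Using `NULL` as the "not supplied" value keeps the user's explicit choice separate from the hub-dependent default, so a cloud hub user can still opt back in to the checks.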

Definition of done

bsweger commented 2 weeks ago

I'd like to tackle this one, and get some R reps!