We could store the datasets on Zenodo (similar to figshare). @JamieJQuinn and @ageorgou used this for the larger test datasets for TROVE. It worked really well and allowed versioning of the datasets. The datasets can be downloaded through the Zenodo API (see https://github.com/Trovemaster/TROVE/blob/feature/unit-testing/test/regression/download_benchmarks.sh for an example). We didn't get round to caching them in GitHub Actions, but others have pointed to the cache action, which is fab. The only problem we found with Zenodo is that it's really hard for multiple users to "own" a dataset, so collaboration is tricky.
The size limit is 50 GB per dataset.
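
For anyone wanting to try the download step, here's a rough Python sketch of the idea. It is not the TROVE script linked above, just an illustration. It assumes the record JSON served at `https://zenodo.org/api/records/<id>` lists files with `key`, `links.self` and an `md5:`-prefixed `checksum` field, so check that against the current Zenodo API docs before relying on it. The function names (`list_files`, `md5sum`, `fetch_record`) are made up for this example.

```python
import hashlib
import json
import urllib.request
from pathlib import Path

# Assumed endpoint layout; verify against the Zenodo API documentation.
ZENODO_API = "https://zenodo.org/api/records"


def list_files(record: dict) -> list[tuple[str, str, str]]:
    """Return (filename, download_url, md5) for each file in a record's JSON.

    Assumes each file entry has "key", "links"["self"] and a checksum
    string of the form "md5:<hex>".
    """
    files = []
    for entry in record.get("files", []):
        md5 = entry["checksum"].split(":", 1)[-1]
        files.append((entry["key"], entry["links"]["self"], md5))
    return files


def md5sum(path: Path) -> str:
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fetch_record(record_id: str, dest: Path) -> list[Path]:
    """Download every file of a record into dest, skipping verified files."""
    dest.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(f"{ZENODO_API}/{record_id}") as resp:
        record = json.load(resp)

    paths = []
    for name, url, md5 in list_files(record):
        target = dest / name
        # Only download when the file is missing or fails its checksum.
        if not (target.exists() and md5sum(target) == md5):
            urllib.request.urlretrieve(url, target)
            if md5sum(target) != md5:
                raise RuntimeError(f"checksum mismatch for {name}")
        paths.append(target)
    return paths
```

Because files that already pass their checksum are skipped, pointing `dest` at a directory restored by the cache action should avoid re-downloading on every CI run, though we haven't tried that ourselves.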