hubverse-org / hubValidations

Testing framework for hubverse hub validations
https://hubverse-org.github.io/hubValidations/

Do not attempt to validate non-relevant files (like READMEs) #46

Closed (annakrystalli closed this issue 9 months ago)

annakrystalli commented 9 months ago

The following PR, which adds to the README in model-output, triggered an unwanted (and unsuccessful) validation attempt on the README file: https://github.com/cdcepi/FluSight-forecast-hub/actions/runs/6391216357/job/17346023767?pr=16

We obviously want to avoid this behaviour, but I'm not 100% sure which is the better way to do it:

1) Only validate files with specific valid extensions (e.g. in model-output: csv, parquet or arrow; in model-metadata: yaml or yml). PROs: specific. CONs: it is effectively an upfront extension validation/filtering step, and it would allow the directories to be polluted with invalid files that are silently ignored.

2) Ignore md, Rmd and perhaps txt files when validating a PR. PROs: only specific file types are ignored, so any other stray invalid files are still caught. CONs: would hubs likely want to legitimately store other types of files in these folders (which would then trigger a validation)? Is this something we want to discourage in any case?
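
To make the difference between the two options concrete, here is a minimal sketch in Python. It is illustrative only and is not how hubValidations implements anything. The function names and the hard-coded extension sets are assumptions, with the extensions taken from the examples in the options above.

```python
from pathlib import PurePosixPath

# Assumed extension sets, copied from the examples given in options 1 and 2.
VALID_EXTENSIONS = {
    "model-output": {".csv", ".parquet", ".arrow"},
    "model-metadata": {".yaml", ".yml"},
}
IGNORED_EXTENSIONS = {".md", ".rmd", ".txt"}


def _top_dir_and_ext(path):
    """Return the top-level directory and the lower-cased file extension."""
    p = PurePosixPath(path)
    top = p.parts[0] if p.parts else ""
    return top, p.suffix.lower()


def select_allowlist(changed_files):
    """Option 1: validate only files whose extension is valid for their directory."""
    selected = []
    for f in changed_files:
        top, ext = _top_dir_and_ext(f)
        if ext in VALID_EXTENSIONS.get(top, set()):
            selected.append(f)
    return selected


def select_ignore_list(changed_files):
    """Option 2: validate every file in a hub directory except ignored types."""
    selected = []
    for f in changed_files:
        top, ext = _top_dir_and_ext(f)
        if top in VALID_EXTENSIONS and ext not in IGNORED_EXTENSIONS:
            selected.append(f)
    return selected
```

With a stray file such as `model-output/teamA-model/notes.docx` in a PR, option 1 silently skips it, whereas option 2 still sends it to validation, where it would presumably fail. That is the trade-off described in the PROs and CONs above.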

Is this something you think hub admins should have some control over, i.e. we need to provide some options?

Thoughts @elray1 @nickreich or anyone else?

nickreich commented 9 months ago

Another more specific option could be to just ignore any README file of any type (e.g. md, Rmd, txt, ...). I agree that we don't really want to be in the business of allowing lots of junky files in the folders, but having a README exception seems like a reasonable way to go.
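
A README-only exception along these lines could look roughly like the Python sketch below. This is a hypothetical illustration, not the package's implementation. `is_readme` and `files_to_validate` are made-up names, and matching on the file name without its extension, ignoring case, is an assumption about what "any README file of any type" should mean.

```python
from pathlib import PurePosixPath


def is_readme(path):
    """True for README files of any type, e.g. README, README.md, readme.Rmd."""
    # Assumption: compare the file name without its final extension, ignoring case.
    return PurePosixPath(path).stem.lower() == "readme"


def files_to_validate(changed_files):
    """Drop README files; every other changed file still goes to validation."""
    return [f for f in changed_files if not is_readme(f)]
```

Because only README files are skipped, any other unexpected file in the folder would still be picked up for validation, which keeps the exception very limited.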

I wouldn't protest implementing either of the options above, but I think I would prefer some very limited exception.

annakrystalli commented 9 months ago

I like your recommendation @nickreich! I'll go with that.