nih-cfde / update-content-registry

Code and workflows for adding content to the content registry.
https://app-staging.nih-cfde.org/
BSD 3-Clause "New" or "Revised" License

options for testing on say 10 resources max #53

Open raynamharris opened 2 years ago

raynamharris commented 2 years ago

the content registry is growing in size, with annotations for compounds, diseases, anatomies, proteins, and genes. as a result, the commands make clean, make, and make update take quite a while to run for thousands of resources.

i've been saving short lists with only 10 IDs to use as input during development, but i delete these short lists once i'm happy with the results. the challenge is that when i develop new rules, all the old rules still have really large inputs. during testing, i have been manually changing the rule inputs to short test lists, and removing or commenting out all the irrelevant vocabularies from the TERM_TYPES list and their output directories from the aggregate rule. the problem with this approach is that it creates lots of git conflicts between branches.

is there a way to create a test rule or test reference set that automatically limits the number of inputs to, say, 10? that way you could keep all the inputs as they are but only update a specific set of 10 IDs for any given vocabulary term, which you can easily check to see if they are working. and you wouldn't have to worry about accidentally refreshing the content on 10000 compounds when you were only trying to update 10 genes.
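
one possible shape for this, as a rough sketch only: route every rule's ID list through a small helper that truncates it when a test limit is set. this assumes the ID lists are plain text files with one ID per line and that the rules can call Python code. the helper name load_ids and the environment variable TEST_LIMIT are made up for illustration and are not part of this repo.

```python
import os
from itertools import islice
from pathlib import Path


def load_ids(path, limit=None):
    """Return IDs from a one-ID-per-line file, keeping at most `limit` of them.

    If `limit` is not given, fall back to the TEST_LIMIT environment
    variable (a hypothetical name). If that is unset or empty, return every ID.
    """
    if limit is None:
        env_value = os.environ.get("TEST_LIMIT", "")
        limit = int(env_value) if env_value else None

    with Path(path).open() as handle:
        # Skip blank lines and strip surrounding whitespace.
        ids = (line.strip() for line in handle if line.strip())
        # islice(ids, None) yields everything, so no limit means a full run.
        return list(islice(ids, limit))
```

with something like this, running TEST_LIMIT=10 make update would restrict every vocabulary to its first 10 IDs without editing any rule inputs, and leaving the variable unset would run the full lists. because the limit lives outside the tracked files, it should not cause git conflicts between branches.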