broadinstitute / gdctools

Python and UNIX CLI utilities to simplify interaction with the NIH/NCI Genomics Data Commons

tests: we should use a vastly smaller subset of data for testing #11

Closed noblem closed 7 years ago

noblem commented 7 years ago

Right now, running "make test" incurs a substantial download. I think there is an email or RFE somewhere that asked for the ability to subset the data (to mirror/dice/etc.) by specifying individual cases. Yes, I see now that this has been added to GDCcli ... great! We should use it throughout the tests to greatly reduce the turnaround time of "make test".

Ideally, the ability to subset by case should also be available through the config file mechanism, like

[mirror]
...
CASES =

That should be sufficient to start, because if a mirror contained only a subset of cases then that subsetting effect would propagate to dicing/loadfiles/reports.
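
For illustration only, here is a minimal sketch of how a CASES entry under [mirror] might be read and applied. It is not the gdctools implementation: the comma-separated value format, the helper names read_cases and filter_files, the "case_id" record key, and the case IDs are all assumptions made up for this example.

```python
import configparser


def read_cases(config_text):
    """Return the set of case IDs under [mirror] CASES, or None if unset.

    Assumes (hypothetically) that CASES holds a comma-separated list.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # fallback="" covers both a missing [mirror] section and a missing option
    raw = parser.get("mirror", "CASES", fallback="")
    cases = {c.strip() for c in raw.split(",") if c.strip()}
    return cases or None


def filter_files(file_records, cases):
    """Keep only records whose case is in `cases`; None means no subsetting."""
    if cases is None:
        return list(file_records)
    return [rec for rec in file_records if rec["case_id"] in cases]


# Made-up config and file records, purely for demonstration
example_config = """
[mirror]
CASES = CASE-A, CASE-C
"""

records = [
    {"case_id": "CASE-A", "file": "a.txt"},
    {"case_id": "CASE-B", "file": "b.txt"},
    {"case_id": "CASE-C", "file": "c.txt"},
]

selected = filter_files(records, read_cases(example_config))
print([rec["file"] for rec in selected])
```

Because the subsetting is applied once, at the point where files are selected for mirroring, later stages that consume only the mirrored files would see the reduced set without needing their own filter.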

noblem commented 7 years ago

Fixed. Cases can also be specified via the config file, and the setting is referenced in gdc_mirror and gdc_dice.