KentonWhite / ProjectTemplate

A template utility for R projects that provides a skeletal project.
http://projecttemplate.net
GNU General Public License v3.0

data_ignore does not ignore cached files #289

Closed alsmnn closed 5 years ago

alsmnn commented 5 years ago

Report an Issue / Request a Feature

data_ignore does not ignore cached files. If you cache objects that require expensive computation before caching, and there is no file with the same name in data/, you can't use data_ignore to skip them when loading a project.

Example: ESCA_se holds one entity of cancer samples from The Cancer Genome Atlas, which has 33 such entities. The download and normalization process is quite resource-intensive and takes a lot of wall-clock time, so I cache the normalized objects. However, I don't need every cached object in every analysis, so I would like to control which cached files are loaded.

I have the files ESCA_se.RData and ESCA_se.hash in my cache/ directory, and there is no corresponding file in data/. I don't need ESCA_se for every analysis, and it is quite big. So I want to make sure that it isn't loaded by calling reload.project(list(data_ignore = "ESCA*")), but that doesn't work.
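To make the request concrete, here is a minimal base-R sketch of the selective loading I have in mind. It is not ProjectTemplate's implementation: load_cache_except is a hypothetical helper, the glob-style matching via glob2rx is my assumption about how a pattern like "ESCA*" should be interpreted, and BRCA_se is just a second illustrative object.

```r
# Hypothetical helper, NOT part of ProjectTemplate: load every cached
# .RData file except those whose variable name matches an ignore pattern.
load_cache_except <- function(cache_dir, ignore, envir = globalenv()) {
  files <- list.files(cache_dir, pattern = "\\.RData$", full.names = TRUE)
  vars <- tools::file_path_sans_ext(basename(files))

  # Assumption: patterns are shell-style globs such as "ESCA*";
  # translate them into regular expressions before matching.
  ignore_rx <- utils::glob2rx(ignore)
  skip <- vapply(
    vars,
    function(v) any(vapply(ignore_rx, grepl, logical(1), x = v)),
    logical(1)
  )

  # Load only the files that were not matched by any ignore pattern.
  for (f in files[!skip]) load(f, envir = envir)
  invisible(vars[!skip])
}

# Demonstration in a temporary directory standing in for cache/.
cache_dir <- file.path(tempdir(), "cache_demo")
dir.create(cache_dir, showWarnings = FALSE)
ESCA_se <- 1:3
BRCA_se <- 4:6
save(ESCA_se, file = file.path(cache_dir, "ESCA_se.RData"))
save(BRCA_se, file = file.path(cache_dir, "BRCA_se.RData"))

env <- new.env()
loaded <- load_cache_except(cache_dir, ignore = "ESCA*", envir = env)
```

In this sketch only BRCA_se ends up in env, which is the behavior I would expect from data_ignore for cached files that have no counterpart in data/.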

Expected Behavior

data_ignore in global.dcf should ignore cached files too

Current Behavior

data_ignore in global.dcf only ignores cached files that have a corresponding file in data/

Steps to Reproduce Behavior

Cache a file that does not have a corresponding file in data/, then call reload.project(list(data_ignore = "FILENAME")).

Screenshots

Version Information

          Package           Version 
"ProjectTemplate"           "0.8.2" 

Possible Solution

-/-

Best regards, @AljoLe