claraqin / neonMicrobe

Processing NEON soil microbe marker gene sequence data into ASV tables.
GNU Lesser General Public License v3.0

Simplify directory construction using system commands #14

Closed zoey-rw closed 4 years ago

zoey-rw commented 4 years ago

The code that creates the directory structure in the server setup script could be simplified by wrapping it in a function and using system() calls. mkdir -p creates a directory (and any missing parent directories) if it does not exist, and neither overwrites nor throws an error if the directory already exists. Below is an example of how I have used this before (not updated for the specific server setup of this pipeline).

# Load the twee() directory-tree function by jennybc
# twee vignette here: https://gist.github.com/jennybc/2bf1dbe6eb1f261dfe60
temp <- RCurl::getURL("https://gist.githubusercontent.com/jennybc/2bf1dbe6eb1f261dfe60/raw/c53fba8a861f82f90d895458e941a1056c8118f5/twee.R", ssl.verifypeer = FALSE)
# Evaluate the downloaded source so that twee() is defined in this session
eval(parse(text = temp))

# Create data directory
create_data_directory <- function(path = ".", amplicon = c("ITS", "16S"), ...) {
  # One subdirectory per processing stage and amplicon, rooted at `path`
  stages <- c("filt_seqs", "raw_seqs", "trimmed_seqs", "seq_tables", "output_files")
  dirs <- file.path(path, "data", rep(stages, each = length(amplicon)), amplicon)
  # Paths are spelled out in R rather than with {a,b} brace expansion, because
  # system() runs the command through sh, which may not support brace expansion
  cmd <- paste("mkdir -p", paste(shQuote(dirs), collapse = " "))
  system(cmd)
  # Print the resulting tree, two levels deep
  twee(file.path(path, "data"), level = 2)
}

The twee() function prints a directory tree to illustrate file structures, for example:

> twee("./data/NEON_DOB/", level=2)
-- Illumina
   |__NEON
-- sequence_metadata
   |__filesToStack10108
   |__mmg_soilMetadata_16S_2020-08-06.csv
   |__mmg_soilMetadata_ITS_2020-08-06.csv
-- soil
   |__filesToStack10078
   |__filesToStack10086
-- soilDatabase.rds
-- soilDB.db

claraqin commented 4 years ago

Thanks Zoey! If we use the system command to make directories, will there be compatibility issues between Windows and Linux-based systems? Are there downsides to using the dir.create function in R?
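
For comparison, here is a minimal sketch of the dir.create() route raised in this question, assuming the same directory layout as the example above. The function name create_data_directory_portable is hypothetical and not part of neonMicrobe. dir.create() is base R, so it does not rely on a Unix-style mkdir being available, which is the main portability concern with system("mkdir -p ...").

```r
# Hypothetical alternative to the system("mkdir -p ...") version; the function
# name and the "data" root are assumptions carried over from the example above.
create_data_directory_portable <- function(path = ".", amplicon = c("ITS", "16S")) {
  stages <- c("filt_seqs", "raw_seqs", "trimmed_seqs", "seq_tables", "output_files")
  # One directory per stage/amplicon pair, e.g. <path>/data/raw_seqs/ITS
  dirs <- file.path(path, "data", rep(stages, each = length(amplicon)), amplicon)
  for (d in dirs) {
    # recursive = TRUE creates missing parent directories, like mkdir -p;
    # showWarnings = FALSE silences the warning for directories that already exist
    dir.create(d, recursive = TRUE, showWarnings = FALSE)
  }
  invisible(dirs)
}

# Usage: build the tree in a temporary location and confirm it exists
root <- tempfile("neon_dirs_")
created <- create_data_directory_portable(root)
all(dir.exists(created))
```

Calling the function a second time on the same path should leave existing directories and their contents untouched, which is the same property that motivated mkdir -p.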

claraqin commented 4 years ago

Finished, in the most recent commit.