Sample names indexing

Hi @RCollins13 ,

I recently ran into an error as follows:

Sample ID file 'Sample01' not found, assuming single sample ID provided
Filtering & loading coverage matrix... Complete
Error in .subset(x, j) : invalid subscript type 'list'
Calls: CNView -> [ -> [.data.frame
Execution halted

I tracked the error to lines 130-134 of the current CNView.R code:

  ##Drop Columns to Specified Sample Size##
  cov <- cov[,unique(c(1:3,
                       as.vector(sapply(head(unique(c(sampleID,sample(names(cov[,-c(1:3)])))),n=subsample),
                                        function(val){grep(val,colnames(cov),ignore.case=T)}))))]

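I noticed that the current method of subsetting uses grep, which was problematic for my sample names: my coverage BED file includes sample names like Sample11, Sample111, Sample112, etc., and grep matched Sample11 to Sample11, Sample111, Sample112, etc. With several matches for a single sample, sapply appears to return a list rather than a vector of column indices, which would explain the "invalid subscript type 'list'" error.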
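Here is a minimal sketch that reproduces the behaviour on a toy data frame. The column names and values are made up for illustration and are not taken from my actual coverage matrix:

```r
# Toy coverage matrix; column names and values are hypothetical
cov <- data.frame(chr = "1", start = 0, end = 100,
                  Sample11 = 1, Sample111 = 2, Sample112 = 3, Sample2 = 4)

# grep() treats "Sample11" as a pattern, so it also hits Sample111 and Sample112
grep_hits <- grep("Sample11", colnames(cov), ignore.case = TRUE)
print(grep_hits)  # 4 5 6

# The per-sample results now differ in length, so sapply() cannot
# simplify them to a vector and returns a list instead
idx <- sapply(c("Sample11", "Sample2"),
              function(val) grep(val, colnames(cov), ignore.case = TRUE))
print(is.list(idx))  # TRUE

# Indexing the columns with that list fails; capture the message to inspect it
err <- tryCatch(cov[, unique(c(1:3, as.vector(idx)))],
                error = function(e) conditionMessage(e))
print(err)
```

When every sample name matches exactly one column, sapply() simplifies to a plain integer vector and the subsetting works, which is probably why this does not show up with most sample naming schemes.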

I changed the code in those lines to the following:

  cov <- cov[,unique(c(1:3,
                       as.vector(sapply(head(unique(c(sampleID,sample(names(cov[,-c(1:3)])))),n=subsample),function(val){which(val==colnames(cov))}))))]

Using which() with an exact comparison instead of grep() to index the columns prevents one sample name from matching several columns, and this resolved the error for me. You may want to consider changing the way samples are indexed by name in lines 130-134 of CNView.R.
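One caveat with my change: the exact comparison is case-sensitive, whereas the original grep call used ignore.case=T, and a sample ID that matches no column would again produce a ragged result. A possible alternative, sketched below under the assumption that column names are unique when case is ignored, is to use match() on lower-cased names. The helper name sample_columns is hypothetical and not part of CNView:

```r
# Hypothetical helper, not part of CNView: resolve sample names to column
# indices by exact, case-insensitive comparison
sample_columns <- function(ids, cols) {
  # match() returns one index per ID (the first match) or NA if none
  idx <- match(tolower(ids), tolower(cols))
  if (anyNA(idx)) {
    stop("Sample ID(s) not found in coverage matrix: ",
         paste(ids[is.na(idx)], collapse = ", "))
  }
  idx
}

# Toy column names for illustration only
cols <- c("chr", "start", "end", "Sample11", "Sample111", "Sample112")
keep <- unique(c(1:3, sample_columns(c("sample11", "Sample112"), cols)))
print(keep)  # 1 2 3 4 6
```

Because match() always returns exactly one value per ID, the result stays an integer vector regardless of how similar the sample names are.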

Best, Steve

RCollins13 / CNView

Sample names indexing #6