jpuntomarcos / CNVfilteR

R package to remove false positives of CNV calling tools by using SNV calls

Error in vcfs[[cnvs.df[i, "sample"]]] : subscript out of bounds #11

Closed sunjh22 closed 1 year ago

sunjh22 commented 1 year ago

Hi @jpuntomarcos ,

I came across an unexpected issue when using CNVfilteR. I loaded the CNV regions and VCF files as instructed, but the last step throws the error shown in the title of this issue. These are the commands I ran in R:

```
> library(CNVfilteR)
> cnv_file = "test/test.bed"
> vcf_file = "test/test.vcf.gz"
> cnv_gr <- loadCNVcalls(cnvs.file = cnv_file, chr.column = 'chromosome',
                         start.column = 'start', end.column = 'end',
                         cnv.column = 'cnv', sample.column = 'sample',
                         genome = 'hg38')

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, aperm, append,
    as.data.frame, basename, cbind, colnames, dirname, do.call,
    duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
    lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
    pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
    tapply, union, unique, unsplit, which.max, which.min

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    I, expand.grid, unname

Attaching package: 'Biostrings'

The following object is masked from 'package:base':

    strsplit

The masked version of 'hg38' is not installed. Using the unmasked version. This means that no automatic masking will be available.

> temp_cnv_gr <- trim(cnv_gr)
> vcfs <- loadVCFs(vcf.files = vcf_file, cnvs.gr = temp_cnv_gr,
                   min.total.depth = 5, vcf.source = "HaplotypeCaller",
                   genome = 'hg38')
Scanning file /tmp/RtmpDR8RTd/test.vcf.gz...
HaplotypeCaller was found as source in the VCF metadata, AD will be used as allele support field in a list format: ref allele, alt allele.
Warning messages:
1: In .bcfHeaderAsSimpleList(header) :
  duplicate keys in header will be forced to unique rownames
2: In .bcfHeaderAsSimpleList(header) :
  duplicate keys in header will be forced to unique rownames
> cnv_filter <- filterCNVs(temp_cnv_gr, vcfs)
Error in vcfs[[cnvs.df[i, "sample"]]] : subscript out of bounds
```

I have attached some data for your testing. Could you give some suggestions about this issue? Is it because the masked version of 'hg38' is not installed?

Thank you in advance.

Jiahong Sun

Attachment: test.zip

jpuntomarcos commented 1 year ago

Hi @sunjh22

Thanks for the clear question and for providing data to test it. I have found the problem: there is a small bug in how loadCNVcalls() handles the type of the sample column.

I will fix it and publish a new version this week.
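For illustration, here is a minimal sketch of how a sample column type mismatch can produce this kind of error in plain R. It is not the CNVfilteR source code: the toy `vcfs` list and `cnvs.df` data frame are hypothetical stand-ins named after the objects in the error message, the sample ID `1001` is invented, and the assumption is that the sample column was parsed as a number instead of a character string.

```r
# Hypothetical stand-ins for illustration only, not CNVfilteR internals.
# A list with one entry, named by a sample ID that happens to look numeric.
vcfs <- list("1001" = "variants for sample 1001")

# Assumption: the sample column was read as numeric rather than character.
cnvs.df <- data.frame(sample = 1001)

# A numeric index is treated as a position, so [[1001]] asks for the
# 1001st element of a one-element list and fails.
res_numeric <- tryCatch(
  vcfs[[cnvs.df[1, "sample"]]],
  error = function(e) conditionMessage(e)
)
print(res_numeric)

# Converting the value to character makes [[ ]] look the entry up by name.
res_char <- vcfs[[as.character(cnvs.df[1, "sample"])]]
print(res_char)
```

If this is the cause in your data, a quick check is to compare `class()` of the sample column with the names of the list of loaded VCFs. Coercing the sample IDs to character before the lookup is one possible workaround until the fixed version is available.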

sunjh22 commented 1 year ago

Thanks for your quick response and help! I will close this issue now.