BU-ISCIII / WGS-Outbreaker

Pipeline for whole genome sequencing analysis for outbreak detection and characterization of foodborne bacteria
https://github.com/BU-ISCIII/WGS-Outbreaker/wiki
GNU General Public License v3.0

BUG: generating stats may include duplicated values to fill coverage table #2

Open MiguelJulia opened 5 years ago

MiguelJulia commented 5 years ago

In graphs_coverage.R, lines 48 and 49, some samples may not yield the same number of values as the others. When the coverage table is filled, the missing values are then replaced by repeating the first values of that sample's array.

Example with wgs_course training_dataset:

# Line 47
cov_genome <- cov_graph[cov_graph$chr == "genome",]

At this point, cov_genome is a data.frame with no NA values and dimension 353x4.

# Line 48
cov_table <- by(cov_genome,cov_genome[,"sample"],function(x) x$fracAboveThreshold[x$covThreshold == 10 | x$covThreshold==20 | x$covThreshold== 30 | x$covThreshold == 50| x$covThreshold == 70| x$covThreshold == 100])

Now cov_table is a list of 10 elements, each with 3 numeric values, except element 5, which has only 2.
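A quick way to spot which samples come up short is to compare the number of values per element before binding. This is only a sketch: the list below is a made-up stand-in for the by() result, with invented sample names and values.

```r
# Toy stand-in for the by() result: a named list of numeric vectors.
# Sample names and values are invented for illustration.
cov_table <- list(
  s1 = c(0.99, 0.95, 0.90),
  s2 = c(0.80, 0.40),        # one value missing, like element 5 above
  s3 = c(0.97, 0.93, 0.85)
)

# Count the values held by each sample
n_values <- sapply(cov_table, length)

# Samples with fewer values than the longest element
short_samples <- names(n_values)[n_values < max(n_values)]
print(short_samples)
```

Any name printed here is a sample that rbind would later pad by recycling.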

# Line 49
cov_table <- do.call(rbind,cov_table)
Warning message:
In (function (..., deparse.level = 1)  :
  number of columns of result is not a multiple of vector length (arg 5)

rbind completes the new table by recycling the first value of element 5 to fill the missing position.
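One possible way to avoid the recycling, sketched here with toy data, is to look up each threshold explicitly with match(), so that a threshold absent for a sample becomes NA instead of a repeated value. The data frame contents, the `thresholds` vector and the `cov_` column names are assumptions made for this example, not taken from graphs_coverage.R.

```r
# Recycling in isolation: the shorter vector is padded with its own first value.
recycled <- suppressWarnings(rbind(s1 = c(0.99, 0.95, 0.90), s2 = c(0.80, 0.40)))
# Row s2 now reads 0.80, 0.40, 0.80

# Toy data standing in for cov_genome (values invented for illustration).
cov_genome <- data.frame(
  sample = c(rep("s1", 6), rep("s2", 2)),
  covThreshold = c(10, 20, 30, 50, 70, 100, 10, 20),
  fracAboveThreshold = c(0.99, 0.97, 0.95, 0.90, 0.80, 0.60, 0.80, 0.40)
)

# Same thresholds as the filter on line 48
thresholds <- c(10, 20, 30, 50, 70, 100)

# match() returns NA for a threshold that a sample lacks, and indexing
# with NA yields NA, so every sample gets exactly length(thresholds) values.
cov_list <- lapply(split(cov_genome, cov_genome$sample), function(x) {
  x$fracAboveThreshold[match(thresholds, x$covThreshold)]
})

cov_table <- do.call(rbind, cov_list)
colnames(cov_table) <- paste0("cov_", thresholds)
print(cov_table)
```

With this approach the table always has one column per threshold, whatever each sample contains. Whether a missing entry should stay NA or be set to 0 depends on what an absent threshold means in the coverage file, which is not stated here.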

MiguelJulia commented 5 years ago

Also, for the same dataset, cov_table has dimensions 10x3, whereas line 50 assumes 10x6. This causes an execution error in the wgs_course pipeline.