Description of the issue
When going through the code, the log2 cpm values are all 0, and only 34641 of the 165860 cpm values match the original code.
Description of the fix
With apply(counts, 2, function(x) (x/new_data$counts_per_sample)*1e6), the divisor is not the total for the sample being processed: every column is divided by new_data$counts_per_sample as a whole instead of by its own sum. Compared with the original code, this version matches on only 34641 of the 165850 values. Replacing the pre-calculated sums with sum(x) brings the two into alignment.
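A minimal sketch of the difference on a toy matrix, assuming new_data$counts_per_sample holds the per-sample column totals (here computed with colSums). The matrix, its values, and the variable names cpm_old, cpm_new and n_match are made up for illustration and are not taken from either repo.

```r
# Toy count matrix: 3 genes (rows) x 2 samples (columns); values are invented.
counts <- matrix(c(10, 20, 70,
                   100, 300, 600),
                 nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("s1", "s2")))

# Assumption: the pre-calculated totals are the per-sample column sums.
counts_per_sample <- colSums(counts)

# Old form: x has one entry per gene, the divisor has one entry per sample,
# so R pairs them element-wise and recycles the shorter vector
# (it warns here because 3 is not a multiple of 2).
cpm_old <- suppressWarnings(
  apply(counts, 2, function(x) (x / counts_per_sample) * 1e6)
)

# Fixed form: each column is divided by its own total.
cpm_new <- apply(counts, 2, function(x) (x / sum(x)) * 1e6)

# Number of entries on which the two versions agree.
n_match <- sum(abs(cpm_old - cpm_new) < 1e-8)
```

On this toy input the two versions agree on only some of the entries, while every column of cpm_new sums to 1e6, which is the property a cpm matrix is expected to have.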
For the log2 cpm calculation, log2(new_data$cpm + 1) yields all zeros because new_data$cpm contains no data. Replacing it with new_data$transformed_data$cpm points to the correct data and fixes the problem.
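The lookup problem can be sketched as follows, assuming new_data is a plain list whose cpm matrix lives under transformed_data. The list below is a hypothetical stand-in, not the real object, and how the empty result ends up as zeros depends on the surrounding code.

```r
# Hypothetical stand-in for new_data: cpm is nested inside transformed_data.
new_data <- list(
  transformed_data = list(
    cpm = matrix(c(0, 1, 3, 7), nrow = 2)
  )
)

# There is no top-level cpm element, so `$` returns NULL ...
top_level_missing <- is.null(new_data$cpm)

# ... and the log2 transform of NULL + 1 is a zero-length vector.
empty_result <- log2(new_data$cpm + 1)

# Pointing at the nested element gives the transformed values.
log2_cpm <- log2(new_data$transformed_data$cpm + 1)
```

Because `$` on a list returns NULL silently for a missing name, no error is raised at the point of the mistake, which is why the problem only shows up in the output.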
These changes were originally made on my qc branch in early Feb, where they are buried under the rest of the changes, so I am suggesting them here separately so they can be incorporated now.
Type of change
[ ] Bug fix (non-breaking change which fixes an issue)
How Has This Been Tested?
Tested locally by running the original repo's code side by side with the new code and comparing outputs