Mayurk619 commented 2 months ago

When I run the command Rscript kaiju2anvio.R gene_calls_nr.names gene_calls_nr-fixed.names, I get the following error in the terminal. I don't understand the error. Kindly help.

Loading required package: parallel
Error in cbind(as.matrix(kaiju.names[, 2]), mat) :
  number of rows of matrices must match (see arg 2)
Calls: kaiju2mat -> cbind
In addition: Warning message:
In matrix(unlist(mclapply(1:nrow(kaiju.names), FUN = function(i) { :
  data length [2193721] is not a sub-multiple or multiple of the number of rows [313389]
Execution halted

Mayurk619 commented 2 months ago

I solved it by changing the script.

#!/usr/bin/env Rscript
args = commandArgs(trailingOnly=TRUE)

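# Input control
# Assumed default: run on a single core when no third argument is given
parallel <- 1
if (length(args) == 0) {
  stop("At least one argument must be supplied (input kaiju file).\n", call.=FALSE)
} else if (length(args) == 1) {
  # default output file
  args[2] = "kaiju2Anvio-fixed.names"
} else if (length(args) == 3) {
  # Command-line arguments are strings: "TRUE" means all cores but one,
  # anything else is read as a number of cores
  parallel <- if (toupper(args[3]) == "TRUE") TRUE else as.integer(args[3])
}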

# Parallel package install control
if (!require("parallel")) install.packages("parallel")

# Function
kaiju2mat <- function(kaiju.names, parallel) {
  if (isTRUE(parallel)) {
    cores <- max(1, detectCores() - 1)
  } else {
    cores <- parallel
  }

  mat <- matrix(unlist(mclapply(1:nrow(kaiju.names), FUN = function(i) {
    if (!is.na(kaiju.names[i, 8]) && kaiju.names[i, 8] != "") {
      x.tmp <- unlist(strsplit(as.character(kaiju.names[i, 8]), split = ";"))
      length(x.tmp) <- 7  # pad with NA (or truncate) to exactly 7 ranks
    } else {
      x.tmp <- rep(NA, 7)
    }
    x.tmp  # return the vector itself; a bare assignment would return its right-hand side
  }, mc.cores = cores)), ncol = 7, byrow = TRUE)

  if (nrow(mat) != nrow(kaiju.names)) {
    stop(paste("Mismatch in the number of rows between 'mat' (", nrow(mat), ") and 'kaiju.names' (", nrow(kaiju.names), ").\n", sep = ""))
  }

  mat <- cbind(as.matrix(kaiju.names[, 2]), mat)
  colnames(mat) <- c("gene_callers_id", "t_domain", "t_phylum", "t_class", "t_order", "t_family", "t_genus", "t_species")
  mat
}

# __MAIN__
kaiju.names <- read.table(file = args[1], sep = "\t", fill = TRUE, row.names = NULL, header = FALSE, quote = "")

# Check if the expected number of columns is present
if (ncol(kaiju.names) < 8) {
  stop("Input file does not have the expected number of columns.\n", call.=FALSE)
}

kaijumat <- kaiju2mat(kaiju.names = kaiju.names, parallel = parallel)

# Write the output file with tab delimiters
write.table(kaijumat, file = args[2], quote = FALSE, col.names = TRUE, row.names = FALSE, sep = "\t")
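
For anyone hitting the same warning, the numbers in it point at the cause: 313389 rows with 7 taxonomy columns need 313389 * 7 = 2193723 values, but only 2193721 were produced. That is consistent with some rows yielding fewer than seven ranks, although I have not confirmed which rows. Below is a minimal sketch of the padding step in isolation. The taxonomy strings and the helper name pad_ranks are made up for illustration and are not part of kaiju2anvio.R.

```r
# Hypothetical taxonomy strings standing in for column 8 of the kaiju names file
taxa <- c("Bacteria;Proteobacteria;Gammaproteobacteria", "", "Archaea")

# Illustrative helper (not in the original script): always return n entries
pad_ranks <- function(s, n = 7) {
  if (is.na(s) || s == "") {
    return(rep(NA_character_, n))
  }
  x <- unlist(strsplit(s, split = ";"))
  length(x) <- n  # pads with NA, or truncates, to exactly n entries
  x
}

rows <- lapply(taxa, pad_ranks)
mat <- matrix(unlist(rows), ncol = 7, byrow = TRUE)

# One matrix row per input row, so a later cbind() with the IDs lines up
stopifnot(nrow(mat) == length(taxa))

# Without padding, the flattened vector is too short to fill a 7-column matrix
ragged <- unlist(strsplit(taxa, split = ";"))
stopifnot(length(ragged) != 7 * length(taxa))

# The arithmetic behind the warning in the first comment: two values short
stopifnot(313389 * 7 - 2193721 == 2)
```

Here lapply() stands in for mclapply() so the sketch runs the same way on any platform; the padding logic does not depend on which one is used.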