immunomind / immunarch

🧬 Immunarch: an R Package for Fast and Painless Exploration of Single-cell and Bulk T-cell/Antibody Immune Repertoires
https://immunarch.com
Apache License 2.0
312 stars 66 forks

All clonal counts are set to 1 when column specified by ".count" is not found #417

Open Jia340 opened 1 month ago

Jia340 commented 1 month ago

Hi,

It appears that when using immunarch::repLoad(filename, .count = c("count_col_name")), all clonal counts are set to 1 if "count_col_name" is not found in the table (as shown in immunarch/R/io-parsers.R, lines 682-691). This is reported through message() rather than as an error or a warning, so there is no good way to capture it (e.g. with tryCatch).

This caused a lot of confusion in a real analysis, because it is easy to miss the message and carry the fake clonal counts into downstream analysis. I am wondering if the function could raise an error or a warning instead, so that users know the column was not found and can double-check their code. Thank you!
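As a stopgap on the caller's side, message conditions can be intercepted with withCallingHandlers, though this means matching on the message text, which is brittle. A minimal sketch, assuming the wording quoted in this issue stays the same; `fake_rep_load` and `stop_on_missing_counts` are hypothetical names used only for illustration, and the stand-in replaces repLoad so the example runs without the package or an input file:

```r
# Hypothetical stand-in for immunarch::repLoad(): it only reproduces the
# message() call quoted in this issue, then returns counts set to 1.
fake_rep_load <- function() {
  message("  [!] Warning: can't find a column with clonal counts. Setting all clonal counts to 1.")
  data.frame(CDR3.aa = c("CASSL", "CASSF"), Clones = 1)
}

# Evaluate `expr` and turn the "missing count column" message into an error.
# The pattern is an assumption copied from the message text; it stops
# matching if the wording ever changes.
stop_on_missing_counts <- function(expr,
                                   pattern = "can't find a column with clonal counts") {
  withCallingHandlers(
    expr,
    message = function(m) {
      if (grepl(pattern, conditionMessage(m), fixed = TRUE)) {
        stop("Clonal count column not found; counts would be set to 1.",
             call. = FALSE)
      }
    }
  )
}

# The error can now be handled like any other.
result <- tryCatch(
  stop_on_missing_counts(fake_rep_load()),
  error = function(e) conditionMessage(e)
)
print(result)
```

In real use the wrapped expression would be the repLoad call itself. Messages that do not match the pattern pass through untouched, so ordinary progress output is unaffected.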

if (!any(.count %in% table.colnames)) {
  warn_msg <- c("  [!] Warning: can't find a column with clonal counts. Setting all clonal counts to 1.")
  warn_msg <- c(warn_msg, "\n      Did you apply repLoad to MiXCR file *_alignments.txt?")
  warn_msg <- c(warn_msg, " If so please consider moving all *.clonotypes.*.txt MiXCR files to")
  warn_msg <- c(warn_msg, " a separate folder and apply repLoad to the folder.")
  warn_msg <- c(warn_msg, "\n      Note: The *_alignments.txt file IS NOT a repertoire file suitable for any analysis.")
  message(warn_msg)

  .count <- .count[1]
  df[[.count]] <- 1
}
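For illustration, here is roughly what the requested behaviour could look like: the same fallback, but signalled as a classed warning instead of a message. This is only a sketch of the idea, not a patch against the package; `fill_missing_counts` and the condition class `missing_count_column` are hypothetical names, and the function takes a plain data frame rather than immunarch's parser internals.

```r
# Hypothetical helper, not part of immunarch: same fallback as the snippet
# above, but signalled as a classed warning that callers can catch.
fill_missing_counts <- function(df, .count) {
  if (!any(.count %in% colnames(df))) {
    cond <- warningCondition(
      paste0(
        "can't find a column with clonal counts (looked for: ",
        paste(.count, collapse = ", "),
        "). Setting all clonal counts to 1."
      ),
      class = "missing_count_column"  # hypothetical class name
    )
    warning(cond)
    # Reached only if no handler unwinds the call (e.g. via tryCatch).
    df[[.count[1]]] <- 1
  }
  df
}

df <- data.frame(CDR3.aa = c("CASSL", "CASSF"))

# Strict caller: treat the missing column as a failure.
outcome <- tryCatch(
  fill_missing_counts(df, c("count_col_name")),
  missing_count_column = function(w) "count column missing"
)
print(outcome)

# Lenient caller: keep the fallback, but record that it happened.
fallback_used <- FALSE
filled <- withCallingHandlers(
  fill_missing_counts(df, "count_col_name"),
  missing_count_column = function(w) {
    fallback_used <<- TRUE
    invokeRestart("muffleWarning")
  }
)
```

With a warning, users who want the current behaviour lose nothing, while scripts can opt into failing fast through tryCatch or options(warn = 2).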