ShixiangWang / sigminer

🌲 An easy-to-use and scalable toolkit for genomic alteration signature (a.k.a. mutational signature) analysis and visualization in R https://shixiangwang.github.io/sigminer/reference/index.html
https://shixiangwang.github.io/sigminer/

Issue with read_vcf function #464

Closed by arianlundberg 1 month ago

arianlundberg commented 2 months ago

Hi,

I attempted to load my VCF file using the read_vcf function. My file includes columns for Variant_Classification, Gene_ID, and Hugo_Symbol, but the function doesn’t seem to recognize them. After reviewing your code, I noticed that you set these fields to "Unknown" by default, and there doesn’t appear to be an option to change this behavior. Specifically, I’m referring to this part of the code:

vcfs$Variant_Classification <- "Unknown"
vcfs$Hugo_Symbol <- "Unknown"
vcfs$Gene_ID <- "Unknown"

Would it be possible to modify the code to something like the following? This way, if the information exists, it will be used; otherwise, it will default to "Unknown":

vcfs$Variant_Classification <- vcfs$Variant_Classification
vcfs$Hugo_Symbol <- vcfs$Hugo_Symbol
vcfs$Gene_ID <- vcfs$Gene_ID

Thanks

ShixiangWang commented 2 months ago

@arianlundberg Hi, thanks for using this package. The function is intended for mutational signature analysis, where gene information is mostly unavailable from the VCF file. If you want to customize the function, you can create your own version.
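
For anyone writing their own version, here is a minimal sketch of the fallback described in the request. Note that assigning a column to itself, as proposed above, would not create a column that is missing, so an explicit existence check is one way to get "use it if present, otherwise Unknown". The helper name `fill_if_missing` and the toy `vcfs` table are assumptions for illustration only; they are not part of sigminer, and the real table built inside `read_vcf` may look different.

```r
# Hypothetical helper (not part of sigminer): add each column in `cols`
# with a default value only when the table does not already contain it.
fill_if_missing <- function(df, cols, default = "Unknown") {
  for (col in cols) {
    if (!col %in% names(df)) {
      df[[col]] <- rep(default, nrow(df))
    }
  }
  df
}

# Toy stand-in for the variant table (assumed shape, not real data).
vcfs <- data.frame(
  Chromosome = c("chr1", "chr2"),
  Hugo_Symbol = c("TP53", "KRAS"),
  stringsAsFactors = FALSE
)

# Existing Hugo_Symbol values are kept; the two missing columns are added.
vcfs <- fill_if_missing(
  vcfs,
  c("Variant_Classification", "Hugo_Symbol", "Gene_ID")
)
```

In this sketch the check is per column, so a file that carries only some of the annotation fields keeps those and gets "Unknown" for the rest.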

arianlundberg commented 2 months ago

@ShixiangWang Thank you for your response. I tried to rewrite the code; however, it calls a function, "file_name", that R could not find. Is it from a package (the "mark" package also has a function with this name), or did you write it yourself? Thanks.

ShixiangWang commented 2 months ago

@arianlundberg You can search the code with: https://github.com/search?q=repo%3AShixiangWang%2Fsigminer+file_name&type=code

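# Return the base name of `filepath` with its last extension removed.
file_name <- function(filepath, must_chop = NULL) {
  y <- sub(pattern = "(.*)\\..*$", replacement = "\\1", basename(filepath))
  # Optionally remove the first match of the regex `must_chop` from the result.
  if (!is.null(must_chop)) {
    y <- sub(must_chop, "", y)
  }
  y
}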
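
A short usage sketch may help show what the helper returns. The definition is copied from the snippet above so the example runs on its own; the file paths are made up for illustration. Because the first `sub()` call strips only the last extension, a double extension such as `.vcf.gz` needs `must_chop` to remove the remaining part.

```r
# Definition copied from the thread so this example is self-contained.
file_name <- function(filepath, must_chop = NULL) {
  y <- sub(pattern = "(.*)\\..*$", replacement = "\\1", basename(filepath))
  if (!is.null(must_chop)) {
    y <- sub(must_chop, "", y)
  }
  y
}

# Hypothetical paths, used only to illustrate the behaviour.
a <- file_name("/data/run1/sample1.vcf")                   # "sample1"
b <- file_name("sample1.vcf.gz")                           # "sample1.vcf"
d <- file_name("sample1.vcf.gz", must_chop = "\\.vcf$")    # "sample1"
print(c(a, b, d))
```

Since `must_chop` is passed to `sub()` as a regular expression, a literal dot has to be escaped, as in `"\\.vcf$"`.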