diskin-lab-chop / AutoGVP

Bug: duplicated columns in autogvp output #146

Closed rjcorb closed 1 year ago

rjcorb commented 1 year ago

Provide the command used or report the bug here

The output of 01-annotate_variants_*_input.R currently contains duplicated columns, which show up with .x and .y suffixes (for example Func.refGene.x and Func.refGene.y). They are introduced when the InterVar and multianno data frames are merged:

> names(clinvar_anno_intervar_vcf_df)
 [1] "Ref.Gene"                        "Func.refGene.x"                  "ExonicFunc.refGene.x"            "Gene.ensGene.x"                  "avsnp147.x"                     
 [6] "AAChange.ensGene.x"              "AAChange.refGene.x"              "InterVar: InterVar and Evidence" "Interpro_domain.x"               "AAChange.knownGene.x"           
[11] "Otherinfo"                       "var_id"                          "Start"                           "Func.refGene.y"                  "Gene.refGene"                   
[16] "GeneDetail.refGene"              "ExonicFunc.refGene.y"            "AAChange.refGene.y"              "esp6500siv2_all"                 "1000g2015aug_all"               
[21] "avsnp147.y"                      "Aloft_Confidence"                "integrated_confidence_value"     "LINSIGHT"                        "GERP++_NR"                      
[26] "GERP++_RS"                       "SiPhy_29way_logOdds"             "Interpro_domain.y"               "rmsk"                            "Func.ensGene"                   
[31] "Gene.ensGene.y"                  "GeneDetail.ensGene"              "ExonicFunc.ensGene"              "AAChange.ensGene.y"              "Func.knownGene"                 
[36] "Gene.knownGene"                  "GeneDetail.knownGene"            "ExonicFunc.knownGene"            "AAChange.knownGene.y"            "vcf_id"                         
[41] "evidencePVS1"                    "evidenceBA1"                     "evidencePS"                      "evidencePM"                      "evidencePP"                     
[46] "evidenceBS"                      "evidenceBP"                      "CHROM"                           "START"                           "ID"                             
[51] "REF"                             "ALT"                             "QUAL"                            "FILTER"                          "INFO"                           
[56] "FORMAT"                          "Sample"                          "Stars"                           "final_call"   

This was likely overlooked when we changed how the two data frames are merged.
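For reference, here is a minimal sketch of how this kind of duplication arises and one possible way to avoid it. The data frames `intervar_df` and `multianno_df`, their contents, and the helper `has_suffixed()` are hypothetical stand-ins, not the objects used in the AutoGVP script. The sketch also assumes the shared columns hold the same values in both tables, so the right-hand copies can be dropped safely.

```r
library(dplyr)

# Hypothetical stand-ins for the InterVar and multianno tables.
intervar_df <- tibble(
  var_id = c("chr1:100:A:T", "chr2:200:G:C"),
  Func.refGene = c("exonic", "intronic"),
  avsnp147 = c("rs1", "rs2"),
  Otherinfo = c("info1", "info2")
)

multianno_df <- tibble(
  var_id = c("chr1:100:A:T", "chr2:200:G:C"),
  Func.refGene = c("exonic", "intronic"),
  avsnp147 = c("rs1", "rs2"),
  Gene.refGene = c("GENE1", "GENE2")
)

# Joining on the key alone keeps the shared columns from both tables,
# so dplyr disambiguates them with .x / .y suffixes.
merged_dup <- left_join(intervar_df, multianno_df, by = "var_id")
names(merged_dup)

# One option: drop the columns the left table already has (other than
# the key) from the right table before joining.
shared_cols <- setdiff(
  intersect(names(intervar_df), names(multianno_df)),
  "var_id"
)
merged_clean <- left_join(
  intervar_df,
  select(multianno_df, -all_of(shared_cols)),
  by = "var_id"
)
names(merged_clean)

# Hypothetical guard that could run before the output is written:
# TRUE if any column name still carries a join suffix.
has_suffixed <- function(df) any(grepl("\\.[xy]$", names(df)))
```

If the two sources can disagree on a shared column, dropping one copy would hide that. In that case it may be better to add the shared columns to the join key, or to compare the .x and .y values explicitly before discarding either.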

What version are you using?

Add error message here (if applicable)

Add Session info

Run sessionInfo() and post the output below

> sessionInfo()
R version 4.2.3 (2023-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.2 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] vroom_1.6.0     optparse_1.7.3  lubridate_1.9.2 forcats_1.0.0   stringr_1.5.0   dplyr_1.1.1     purrr_1.0.1     readr_2.1.4     tidyr_1.3.0     tibble_3.2.1    ggplot2_3.4.2   tidyverse_2.0.0

loaded via a namespace (and not attached):
 [1] pillar_1.9.0     compiler_4.2.3   tools_4.2.3      bit_4.0.5        lifecycle_1.0.3  gtable_0.3.3     timechange_0.2.0 pkgconfig_2.0.3  rlang_1.1.0      cli_3.6.1        rstudioapi_0.14  parallel_4.2.3  
[13] withr_2.5.0      generics_0.1.3   vctrs_0.6.2      hms_1.1.3        getopt_1.20.3    rprojroot_2.0.3  bit64_4.0.5      grid_4.2.3       tidyselect_1.2.0 glue_1.6.2       R6_2.5.1         fansi_1.0.4     
[25] tzdb_0.3.0       magrittr_2.0.3   scales_1.2.1     colorspace_2.1-0 utf8_1.2.3       stringi_1.7.12   munsell_0.5.0    crayon_1.5.2