dwinter / pafr

Read, manipulate and visualize 'Pairwise mApping Format' data in R

plot_synteny() breaks for query/target with same name #25

Open philippbayer opened 3 years ago

Great package! It looks like plot_synteny() does not work correctly when the query and target sequences of an alignment have identical names. In my case I am comparing the same chromosome of one cultivar in two different assemblies, so the chromosome names have to be identical. The same thing happens when you self-align an assembly.

Minimal example:

wget https://www.arabidopsis.org/download_files/Genes/TAIR10_genome_release/TAIR10_chromosome_files/TAIR10_chr_all.fas
samtools faidx TAIR10_chr_all.fas
samtools faidx TAIR10_chr_all.fas 1 > chr1.fa
minimap2 -x asm5 -t 28 chr1.fa chr1.fa > self.paf

Then in R:

library(pafr)
library(ggplot2)  # for theme_bw()
af <- read_paf("./self.paf")
plot_synteny(af, q_chrom = "1", t_chrom = "1", centre = FALSE) +
    theme_bw()

The resulting plot looks like this: image

Chromosome 1 should appear twice, once as the query and once as the target, with one large connection between the two copies plus a number of smaller connections from repeats.

A workaround is to rename the query and target chromosomes so that their names differ.

# Give query and target distinct names so plot_synteny() can tell them apart
af$qname <- rep("Query: 1", length(af$qname))
af$tname <- rep("Target: 1", length(af$tname))
plot_synteny(af, q_chrom = "Query: 1", t_chrom = "Target: 1", centre = FALSE) + theme_bw()

image
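The rename above overwrites every row with the same label, which is fine for a single-chromosome alignment but would clobber the names in a multi-chromosome PAF. A minimal sketch of a more general version follows, assuming the object returned by read_paf() behaves like a data frame with qname and tname columns, as it does in the workaround. The helper disambiguate_names() and its prefix arguments are hypothetical and not part of pafr; the toy data frame only stands in for real alignment data.

```r
# Hypothetical helper (not part of pafr): prefix query and target names so
# that a sequence name can never occur in both columns at once.
disambiguate_names <- function(paf, q_prefix = "Query: ", t_prefix = "Target: ") {
  paf$qname <- paste0(q_prefix, paf$qname)
  paf$tname <- paste0(t_prefix, paf$tname)
  paf
}

# Toy stand-in for the table returned by read_paf(); only the name columns
# are included here, real PAF data has many more.
toy <- data.frame(
  qname = c("1", "1", "2"),
  tname = c("1", "2", "2"),
  stringsAsFactors = FALSE
)
renamed <- disambiguate_names(toy)
```

With real data this would presumably be used as plot_synteny(disambiguate_names(af), q_chrom = "Query: 1", t_chrom = "Target: 1", centre = FALSE), which should give the same plot as the manual rename.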

sessionInfo():

R version 4.0.3 (2020-10-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17134)

Matrix products: default

locale:
[1] LC_COLLATE=English_Australia.1252  LC_CTYPE=English_Australia.1252    LC_MONETARY=English_Australia.1252
[4] LC_NUMERIC=C                       LC_TIME=English_Australia.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] pafr_0.0.2    ggplot2_3.3.3

loaded via a namespace (and not attached):
 [1] RColorBrewer_1.1-2 pillar_1.6.0       compiler_4.0.3     highr_0.9          tools_4.0.3        digest_0.6.27     
 [7] evaluate_0.14      lifecycle_1.0.0    tibble_3.1.1       gtable_0.3.0       pkgconfig_2.0.3    rlang_0.4.10      
[13] DBI_1.1.1          yaml_2.2.1         xfun_0.22          withr_2.4.2        dplyr_1.0.5        stringr_1.4.0     
[19] knitr_1.32         generics_0.1.0     vctrs_0.3.7        cowplot_1.1.1      grid_4.0.3         tidyselect_1.1.0  
[25] glue_1.4.2         R6_2.5.0           fansi_0.4.2        rmarkdown_2.7      purrr_0.3.4        farver_2.1.0      
[31] magrittr_2.0.1     scales_1.1.1       ellipsis_0.3.1     htmltools_0.5.1.1  assertthat_0.2.1   colorspace_2.0-0  
[37] labeling_0.4.2     utf8_1.2.1         stringi_1.5.3      munsell_0.5.0      crayon_1.4.1