Bioconductor / Biostrings

Efficient manipulation of biological strings
https://bioconductor.org/packages/Biostrings
57 stars 16 forks source link

Unexpected behaviour in pairwiseAlignment() #12

Closed jergosh closed 5 years ago

jergosh commented 6 years ago

It appears that pairwiseAligment behaves as if it was always performing local alignment, no matter what value I set in the type parameter. Just taking the example from https://www.bioconductor.org/packages/3.7/bioc/vignettes/Biostrings/inst/doc/PairwiseAlignments.pdf

pairwiseAlignment(pattern = c("succeed", "precede"), subject = "supersede", gapOpening = 0, gapExtension = 1)

what I'm getting:

Global PairwiseAlignmentsSingleSubject (1 of 2)
pattern: [1] su-cce--ed 
subject: [1] sup--ersed 
score: 7.945507 

whereas the expected result (taken from the pdf I linked) would be:

Global PairwiseAlignmentsSingleSubject (1 of 2)
pattern: su-cce--ed-
subject: sup--ersede
score: 7.945507

My session info:

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS High Sierra 10.13.3

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] BiocInstaller_1.28.0 seqinr_3.4-5         reshape2_1.4.3       bio3d_2.3-4          plyr_1.8.4           ggplot2_2.2.1        stringr_1.2.0        Biostrings_2.46.0   
 [9] XVector_0.18.0       IRanges_2.12.0       S4Vectors_0.16.0     BiocGenerics_0.24.0  biomaRt_2.34.2      

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.15         pillar_1.1.0         compiler_3.4.3       prettyunits_1.0.2    bitops_1.0-6         tools_3.4.3          progress_1.1.2       zlibbioc_1.24.0     
 [9] digest_0.6.14        bit_1.1-12           RSQLite_2.0          memoise_1.1.0        tibble_1.4.1         gtable_0.2.0         rlang_0.1.6          DBI_0.7             
[17] curl_3.1             yaml_2.1.16          httr_1.3.1           ade4_1.7-11          bit64_0.9-7          grid_3.4.3           Biobase_2.38.0       R6_2.2.2            
[25] AnnotationDbi_1.40.0 XML_3.98-1.9         blob_1.1.0           magrittr_1.5         MASS_7.3-48          scales_0.5.0         assertthat_0.2.0     colorspace_1.3-2    
[33] labeling_0.3         stringi_1.1.6        RCurl_1.95-4.10      lazyeval_0.2.1       munsell_0.4.3        

Many thanks, Greg

hpages commented 5 years ago

Hi @jergosh,

This was only the show() method not displaying the entire thing to save screen space (admittedly a questionable feature). But calling writePairwiseAlignments() on the object returned by pairwiseAlignment() would have showed you that the alignment is correct.

Anyway the show() method was fixed a while ago in Biostrings:

> pairwiseAlignment(pattern="succeed", subject="supersede", gapOpening=0, gapExtension=1)
Global PairwiseAlignmentsSingleSubject (1 of 1)
pattern: su-cce--ed-
subject: sup--ersede
score: 7.945507 

Sorry for the slow answer.

Best, H.

> sessionInfo()
R version 3.6.0 Patched (2019-05-02 r76454)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS:   /home/hpages/R/R-3.6.r76454/lib/libRblas.so
LAPACK: /home/hpages/R/R-3.6.r76454/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
[1] Biostrings_2.53.2   XVector_0.25.0      IRanges_2.19.17    
[4] S4Vectors_0.23.25   BiocGenerics_0.31.6

loaded via a namespace (and not attached):
[1] zlibbioc_1.31.0 compiler_3.6.0  tools_3.6.0