roblanf / sangeranalyseR

functions to analyse sanger sequencing reads in R
MIT License

Error loading ab1 data #47

Closed DeepakVeerappan closed 4 years ago

DeepakVeerappan commented 4 years ago

Hi, I got this error message. Could you please help?

Error in validObject(.Object) :
  invalid class “SangerAlignment” object: 1: invalid object for slot "contigsConsensus" in class "SangerAlignment": got class "function", should be or extend class "DNAStringORNULL"
invalid class “SangerAlignment” object: 2: invalid object for slot "contigsTree" in class "SangerAlignment": got class "NULL", should be or extend class "phylo"

This is how I have labelled the ab1 files: "Nat_MT_22_F.ab1"

Thanks, Deepak

Kuanhao-Chao commented 4 years ago

Hi @Saradasuperba

What’s your computer’s operating system? Could you paste your R session information here (run sessionInfo())? And could you send me your data by email so that we can test it with sangeranalyseR? Thanks!

DeepakVeerappan commented 4 years ago

Thanks, here it is. Please share your email ID.

R version 4.0.0 (2020-04-24)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS High Sierra 10.13.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

Kuanhao-Chao commented 4 years ago

This is my email: kuanhao.chao@gmail.com

Kuanhao-Chao commented 4 years ago

Hi @Saradasuperba

  1. If you want to build two contigs and then align them:
    Contig 1:
       Nat_MT_22_F.ab1
       Nat_MT_22_R.ab1
    Contig 2:
       Nat_MT_23_F.ab1
       Nat_MT_23_R.ab1

    Then you can run the analysis at the SangerAlignment level:

    parentDir <- "/path/to/sangerrdata"
    suffixForwardRegExp <- "_F.ab1"
    suffixReverseRegExp <- "_R.ab1"
    sangerAlignment <- new("SangerAlignment",
                       inputSource           = "ABIF",
                       parentDirectory       = parentDir,
                       suffixForwardRegExp   = suffixForwardRegExp,
                       suffixReverseRegExp   = suffixReverseRegExp,
                       TrimmingMethod        = "M1",
                       M1TrimmingCutoff      = 0.0001,
                       M2CutoffQualityScore  = NULL,
                       M2SlidingWindowSize   = NULL,
                       baseNumPerRow         = 100,
                       heightPerRow          = 200,
                       signalRatioCutoff     = 0.33,
                       showTrimmed           = TRUE,
                       processorsNum         = 2)
    launchApp(sangerAlignment)

  2. If you want to build one contig with four reads:
    Contig 1:
    Nat_MT_22_F.ab1
    Nat_MT_22_R.ab1
    Nat_MT_23_F.ab1
    Nat_MT_23_R.ab1

    Then you can run the analysis at the SangerContig level:

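    parentDir <- "/path/to/sangerrdata"
    # Assumed value: the part of the filenames left once the suffix regex is removed
    contigName <- "Nat_MT"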
    suffixForwardRegExp <- "_[0-9]*_F.ab1"
    suffixReverseRegExp <- "_[0-9]*_R.ab1"
    sangerContig <- new("SangerContig",
                       inputSource           = "ABIF",
                       parentDirectory       = parentDir,
                       contigName            = contigName,
                       suffixForwardRegExp   = suffixForwardRegExp,
                       suffixReverseRegExp   = suffixReverseRegExp,
                       TrimmingMethod        = "M1",
                       M1TrimmingCutoff      = 0.0001,
                       M2CutoffQualityScore  = NULL,
                       M2SlidingWindowSize   = NULL,
                       baseNumPerRow         = 100,
                       heightPerRow          = 200,
                       signalRatioCutoff     = 0.33,
                       showTrimmed           = TRUE,
                       processorsNum         = 2)
    launchApp(sangerContig)
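To see why the two sets of suffix regexes give different groupings, here is a minimal base-R sketch. It only illustrates the naming logic and is not the code sangeranalyseR runs internally; `contig_key` is a hypothetical helper that strips whichever suffix matches and treats what is left as the contig name.

```r
# Illustration only: base-R string handling, not sangeranalyseR internals.
files <- c("Nat_MT_22_F.ab1", "Nat_MT_22_R.ab1",
           "Nat_MT_23_F.ab1", "Nat_MT_23_R.ab1")

# Hypothetical helper: remove whichever suffix matches at the end of the
# filename and return the remainder as the grouping key.
contig_key <- function(files, fwd, rev) {
  sub(paste0("(", fwd, "|", rev, ")$"), "", files)
}

# Option 1 (SangerAlignment): the read number stays in the key,
# so the four files fall into two contigs.
alignment_keys <- contig_key(files, "_F.ab1", "_R.ab1")
print(split(files, alignment_keys))

# Option 2 (SangerContig): the read number is part of the suffix,
# so all four files share one key.
contig_keys <- contig_key(files, "_[0-9]*_F.ab1", "_[0-9]*_R.ab1")
print(split(files, contig_keys))
```

Running the same check on your own filenames before building the object is a quick way to confirm that the reads pair up the way you expect.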

If you have any problems, please feel free to contact me. Thanks!

Cheers,

Howard

DeepakVeerappan commented 4 years ago

Thanks a lot Howard! It works now.

Cheers, Deepak

kevin-wamae commented 4 years ago

Dear @Kuanhao-Chao, I'm running into the same problem as above and while I followed your suggestion, I still cannot get the package to work.

#Platform

R version 4.0.0 (2020-04-24)
Platform: x86_64-apple-darwin17.0 (64-bit)

#SessionInfo

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
 [1] tools     stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] sangeranalyseR_0.99.22 logger_0.1             BiocStyle_2.17.0       seqinr_3.6-1           kableExtra_1.1.0       rmarkdown_2.2          openxlsx_4.1.5         shinyWidgets_0.5.3     ggdendro_0.1-20       
[10] shinycssloaders_0.3    excelR_0.4.0           zeallot_0.1.0          DT_0.13                plotly_4.9.2.1         ggplot2_3.3.1          data.table_1.12.8      shinyjs_1.1            shinydashboard_0.7.1  
[19] shiny_1.4.0.2          gridExtra_2.3          sangerseqR_1.25.0      phangorn_2.5.5         reshape2_1.4.4         DECIPHER_2.17.1        RSQLite_2.2.0          Biostrings_2.57.1      XVector_0.29.1        
[28] IRanges_2.23.8         S4Vectors_0.27.11      BiocGenerics_0.35.4    ape_5.4                stringr_1.4.0         

loaded via a namespace (and not attached):
 [1] nlme_3.1-147        bit64_0.9-7         webshot_0.5.2       httr_1.4.1          R6_2.4.1            DBI_1.1.0           lazyeval_0.2.2      colorspace_1.4-1    ade4_1.7-15         withr_2.2.0        
[11] tidyselect_1.1.0    bit_1.1-15.2        compiler_4.0.0      rvest_0.3.5         xml2_1.3.2          scales_1.1.1        readr_1.3.1         quadprog_1.5-8      digest_0.6.25       pkgconfig_2.0.3    
[21] htmltools_0.4.0     fastmap_1.0.1       htmlwidgets_1.5.1   rlang_0.4.6         rstudioapi_0.11     generics_0.0.2      jsonlite_1.6.1      dplyr_1.0.0         zip_2.0.4           magrittr_1.5       
[31] Matrix_1.2-18       Rcpp_1.0.4.6        munsell_0.5.0       lifecycle_0.2.0     yaml_2.2.1          stringi_1.4.6       MASS_7.3-51.5       zlibbioc_1.35.0     plyr_1.8.6          grid_4.0.0         
[41] blob_1.2.1          promises_1.1.0      crayon_1.3.4        lattice_0.20-41     hms_0.5.3           knitr_1.28          pillar_1.4.4        igraph_1.2.5        fastmatch_1.1-0     glue_1.4.1         
[51] evaluate_0.14       BiocManager_1.30.10 vctrs_0.3.1         httpuv_1.5.3.1      gtable_0.3.0        purrr_0.3.4         tidyr_1.1.0         xfun_0.14           mime_0.9            xtable_1.8-4       
[61] later_1.1.0.1       viridisLite_0.3.0   tibble_3.0.1        tinytex_0.23        memoise_1.1.0       ellipsis_0.3.1     

#Code

parentDir <- "readsInput/"
suffixForwardRegExp <- "_F"
suffixReverseRegExp <- "_R"
sangerAlignment <- new("SangerAlignment",
                       inputSource           = "ABIF",
                       parentDirectory       = parentDir,
                       suffixForwardRegExp   = suffixForwardRegExp,
                       suffixReverseRegExp   = suffixReverseRegExp,
                       TrimmingMethod        = "M1",
                       M1TrimmingCutoff      = 0.0001,
                       M2CutoffQualityScore  = NULL,
                       M2SlidingWindowSize   = NULL,
                       baseNumPerRow         = 100,
                       heightPerRow          = 200,
                       signalRatioCutoff     = 0.33,
                       showTrimmed           = TRUE,
                       processorsNum         = 2
                       )

#Input files

geneEBA_F1.ab1
geneEBA_F2.ab1
geneEBA_F3.ab1
geneEBA_F4.ab1
geneEBA_F5.ab1
geneEBA_F6.ab1
geneEBA_R1.ab1
geneEBA_R2.ab1
geneEBA_R3.ab1
geneEBA_R4.ab1
geneEBA_R5.ab1
geneEBA_R6.ab1

#Error

Error in validObject(.Object) : 
  invalid class “SangerAlignment” object: 1: invalid object for slot "contigsConsensus" in class "SangerAlignment": got class "function", should be or extend class "DNAStringORNULL"
invalid class “SangerAlignment” object: 2: invalid object for slot "contigsTree" in class "SangerAlignment": got class "NULL", should be or extend class "phylo"

roblanf commented 4 years ago

Hi @kevin-wamae,

I think the issue here is your regex pattern matching.

Assuming that you want to pair your files as follows:

geneEBA_F1.ab1    contig1
geneEBA_F2.ab1    contig2
geneEBA_F3.ab1    contig3
geneEBA_F4.ab1    contig4
geneEBA_F5.ab1    contig5
geneEBA_F6.ab1    contig6
geneEBA_R1.ab1    contig1
geneEBA_R2.ab1    contig2
geneEBA_R3.ab1    contig3
geneEBA_R4.ab1    contig4
geneEBA_R5.ab1    contig5
geneEBA_R6.ab1    contig6

Currently your regex won't do this, because your filenames don't match the format required (where the first part of the filename defines the contig, and the last part the F or R designation). So, you have two options here.

First, you could rename your files to match the required format, e.g. rename geneEBA_F1.ab1 to geneEBA_1_F.ab1. Your F regex will then determine that the contig for this file is geneEBA_1, and will match it correctly with the corresponding reverse read (assuming you also rename that).
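A rough sketch of that renaming in base R, assuming every file follows the geneEBA_F1.ab1 pattern shown above. It previews the new names first and leaves the actual rename commented out; "readsInput" is the directory from the code posted earlier in this thread.

```r
# Illustration only: preview the new names before touching any files.
old_names <- c("geneEBA_F1.ab1", "geneEBA_R1.ab1",
               "geneEBA_F2.ab1", "geneEBA_R2.ab1")

# Move the read number in front of the direction letter:
# geneEBA_F1.ab1 -> geneEBA_1_F.ab1
new_names <- sub("_([FR])([0-9]+)\\.ab1$", "_\\2_\\1.ab1", old_names)
print(data.frame(old = old_names, new = new_names))

# Once the preview looks right, rename inside the input directory:
# file.rename(file.path("readsInput", old_names),
#             file.path("readsInput", new_names))
```

After renaming, suffixes like "_F.ab1" and "_R.ab1" (as in the earlier reply) should leave geneEBA_1, geneEBA_2, and so on as the contig names.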

If you don't want to rename your files, you can use the CSV input option instead. This is just a three-column CSV file that describes which reads go into which contigs, and what the read directions are.
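As a sketch of what such a file could look like for the twelve reads listed above, the snippet below builds the table in base R and writes it to a temporary location. The column names (reads, direction, contig) and the file name names_conversion.csv are assumptions here, so please check them against the documentation linked in this thread before using the file.

```r
# Column names and file name are assumptions; confirm them in the docs.
reads <- sprintf("geneEBA_%s%d.ab1",
                 rep(c("F", "R"), each = 6),
                 rep(1:6, times = 2))

names_table <- data.frame(
  reads     = reads,
  # Direction letter taken from the filename (F or R)
  direction = sub("^.*_([FR])[0-9]+\\.ab1$", "\\1", reads),
  # Contig label taken from the trailing read number
  contig    = sub("^.*_[FR]([0-9]+)\\.ab1$", "contig\\1", reads)
)

# Written to a temporary directory so the sketch has no side effects
csv_path <- file.path(tempdir(), "names_conversion.csv")
write.csv(names_table, csv_path, row.names = FALSE)
print(names_table)
```

Each forward read ends up in the same contig as the reverse read with the same number, which is the pairing shown in the table above.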

Details of both methods are here: https://sangeranalyser.readthedocs.io/en/latest/content/beginner.html#step-1-preparing-your-input-files

Of course, if I have totally missed the point here, I'm sorry! Please explain more and we'll help fix the issue. If you can send us your files off-list, that would also help.

Rob

roblanf commented 4 years ago

Related: I think it would be useful for everyone if we do this: https://github.com/roblanf/sangeranalyseR/issues/50

kevin-wamae commented 4 years ago

Thanks @roblanf, it works now.

I think going forward I'll stick to this https://github.com/roblanf/sangeranalyseR/issues/50