Shians / NanoMethViz

Apache License 2.0
Megalodon per_read_text modified_bases output not being recognized #5

Open dipannita-g opened 3 years ago

dipannita-g commented 3 years ago

Hi!

I wanted to plot Megalodon output. In another issue (#2), you suggested using the per_read_text modified_bases output file as an input for NanoMethViz. But when I try to make the tabix file, I still get the same error as reported previously:

Error in guess_methy_source(element$file) : Format not recognised.

How do I resolve this?

Shians commented 3 years ago

Could you copy the first few lines of the TSV produced by Megalodon? They may have changed the format since I implemented the import; things tend to move quickly at ONT.

dipannita-g commented 3 years ago

Hi Shians!

Output is a text file which is too large to open in text editors. Viewing the file in R Studio, I get this:

image
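When the file is too large for a text editor, the header and first rows can still be inspected by reading only a few lines. The following is a minimal sketch in base R, not part of NanoMethViz; `peek_lines` and `header_columns` are hypothetical helper names, and the demo file's column names are an assumption about the Megalodon header rather than output from a real run.

```r
# Return the first `n` lines of a text file without reading the rest.
peek_lines <- function(path, n = 5) {
  readLines(path, n = n)
}

# Split a tab-separated header line into its column names.
header_columns <- function(header_line) {
  strsplit(header_line, "\t", fixed = TRUE)[[1]]
}

# Demo on a tiny stand-in file; the column names are an assumption about
# the Megalodon header, not copied from a real run.
demo_file <- tempfile(fileext = ".txt")
writeLines(c(
  "read_id\tchrm\tstrand\tpos\tmod_log_prob\tcan_log_prob\tmod_base",
  "read_1\tchr1\t+\t100\t-0.1\t-2.3\tm"
), demo_file)

first_lines <- peek_lines(demo_file)
cols <- header_columns(first_lines[1])
print(cols)
print(length(cols))  # number of columns found in the header
```

Pointing `peek_lines()` at the real per-read output instead of `demo_file` gives lines that can be pasted into the issue as text, which is easier to compare against the expected format than a screenshot.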

Shians commented 3 years ago

Could you try installing the development version of NanoMethViz using BiocManager::install("shians/NanoMethViz") and trying again?

dipannita-g commented 3 years ago

Yes, this has worked. It recognizes that the data is from Megalodon. I have the tabix file. Thank you for the help.

May I also ask how we can use our own gff or gbk annotation files for the analysis? My data is from a bacterial genome. Do I have to manually convert my annotation file to the format you suggested in #3, and then use that converted file in NanoMethViz::exons("file")?

ChristianJP commented 3 years ago

@dipannita-g Please may I see your R code for importing Megalodon data? Thanks

cstill3928 commented 2 years ago

Hi all,

I seem to be having the same issue as dipannita-g with importing my Megalodon data: Error in guess_methy_source(element$file) : Format not recognised. Any help will be greatly appreciated!

I've also tried installing the development version as Shians recommended. This still did not fix the issue. Below is an example of how my data frame looks; it's identical in format to dipannita-g's.

Screen Shot 2021-09-10 at 11 29 50 AM

Here are my RStudio session details:

R version 4.1.1 (2021-08-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] NanoMethViz_1.99.4 ggplot2_3.3.5

loaded via a namespace (and not attached):
[1] bitops_1.0-7 matrixStats_0.60.1 fs_1.5.0
[4] bit64_4.0.5 bsseq_1.28.0 GenomeInfoDb_1.28.4
[7] tools_4.1.1 utf8_1.2.2 R6_2.5.1
[10] irlba_2.3.3 HDF5Array_1.20.0 DBI_1.1.1
[13] BiocGenerics_0.38.0 colorspace_2.0-2 permute_0.9-5
[16] rhdf5filters_1.4.0 withr_2.4.2 tidyselect_1.1.1
[19] bit_4.0.4 compiler_4.1.1 Biobase_2.52.0
[22] DelayedArray_0.18.0 rtracklayer_1.52.1 scales_1.1.1
[25] readr_2.0.1 stringr_1.4.0 Rsamtools_2.8.0
[28] R.utils_2.10.1 XVector_0.32.0 pkgconfig_2.0.3
[31] scico_1.2.0 sparseMatrixStats_1.4.2 MatrixGenerics_1.4.3
[34] fastmap_1.1.0 limma_3.48.3 BSgenome_1.60.0
[37] rlang_0.4.11 rstudioapi_0.13 RSQLite_2.2.8
[40] DelayedMatrixStats_1.14.3 BiocIO_1.2.0 generics_0.1.0
[43] vroom_1.5.4 BiocParallel_1.26.2 gtools_3.9.2
[46] dplyr_1.0.7 R.oo_1.24.0 RCurl_1.98-1.4
[49] magrittr_2.0.1 BiocSingular_1.8.1 GenomeInfoDbData_1.2.6
[52] patchwork_1.1.1 Matrix_1.3-4 Rcpp_1.0.7
[55] munsell_0.5.0 S4Vectors_0.30.0 Rhdf5lib_1.14.2
[58] fansi_0.5.0 lifecycle_1.0.0 R.methodsS3_1.8.1
[61] stringi_1.7.4 yaml_2.2.1 SummarizedExperiment_1.22.0
[64] zlibbioc_1.38.0 rhdf5_2.36.0 grid_4.1.1
[67] blob_1.2.2 parallel_4.1.1 forcats_0.5.1
[70] crayon_1.4.1 lattice_0.20-44 Biostrings_2.60.2
[73] beachmat_2.8.1 hms_1.1.0 locfit_1.5-9.4
[76] pillar_1.6.2 GenomicRanges_1.44.0 rjson_0.2.20
[79] ScaledMatrix_1.0.0 stats4_4.1.1 XML_3.99-0.7
[82] glue_1.4.2 data.table_1.14.0 vctrs_0.3.8
[85] tzdb_0.1.2 tidyr_1.1.3 gtable_0.3.0
[88] purrr_0.3.4 assertthat_0.2.1 cachem_1.0.6
[91] cpp11_0.3.1 rsvd_1.0.5 restfulr_0.0.13
[94] tibble_3.1.4 GenomicAlignments_1.28.0 memoise_2.0.0
[97] IRanges_2.26.0 ellipsis_0.3.2

cstill3928 commented 2 years ago

Hi guys,

I was able to fix this issue. The main problem is that the guess_methy_source function expects Megalodon to write an 8-column table to per_read_modified_base_calls.txt when you pass the --write-mods-text flag. However, I found that my file was missing the motif column. After I added that column, the create_tabix_file function worked!

Best, Chris
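The workaround described above, adding the missing motif column before building the tabix file, could be sketched in base R as follows. This is an illustration under stated assumptions, not the method used in the thread: `add_motif_column` is a hypothetical helper, the placement of `motif` as the last column and the placeholder value `"CG:0"` are assumptions, and both should be checked against a file that NanoMethViz is known to accept.

```r
# Append a constant `motif` column to a tab-separated per-read table.
# ASSUMPTIONS: `motif` belongs in the last column, and "CG:0" is only a
# placeholder value; verify both against a file NanoMethViz accepts.
add_motif_column <- function(in_file, out_file, motif = "CG:0") {
  # Read every column as text so values are written back unchanged.
  calls <- read.delim(
    in_file,
    header = TRUE,
    sep = "\t",
    quote = "",
    colClasses = "character",
    check.names = FALSE
  )
  if (!"motif" %in% names(calls)) {
    calls$motif <- rep(motif, nrow(calls))
  }
  write.table(calls, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
  invisible(names(calls))
}

# Demo on a tiny stand-in file with seven assumed column names.
in_file <- tempfile(fileext = ".txt")
out_file <- tempfile(fileext = ".txt")
writeLines(c(
  "read_id\tchrm\tstrand\tpos\tmod_log_prob\tcan_log_prob\tmod_base",
  "read_1\tchr1\t+\t100\t-0.1\t-2.3\tm"
), in_file)

add_motif_column(in_file, out_file)
out_lines <- readLines(out_file)
out_header <- strsplit(out_lines[1], "\t", fixed = TRUE)[[1]]
print(out_header)
```

`read.delim` loads the whole table into memory, so for very large Megalodon outputs a chunked approach or a faster reader may be preferable. The patched file would then be passed to create_tabix_file in place of the original.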