zwdzwd / sesame

🍪 SEnsible Step-wise Analysis of DNA MEthylation BeadChips

Infinium Mouse Methylation, 'attachManifest()' does not exist #100

Open mininicovd opened 1 year ago

mininicovd commented 1 year ago

Hello! I am working with mouse methylation BeadChips (MM285) and am having trouble annotating my probes with genomic locations. I read that I needed to use the attachManifest() function, but it doesn't seem to exist; it must have been in an older version. Has it been renamed? Is there another function that I could use?

When reading the IDAT files at the very beginning, I imported a manifest using:

mft = sesameDataGet("MM285.address")$ordering

I then passed the mft object when reading the IDAT files:

readIDATpair("206975990058_R01C01", platform = "MM285", manifest = mft, controls = NULL, verbose = TRUE)

But this manifest doesn't seem to contain mapping information: its column names are "Probe_ID", "M", "U", "col", "mask". I have now downloaded the latest mouse manifest with mapping information from http://zwdzwd.github.io/InfiniumAnnotation#mouse. Could I pass this one to the readIDATpair() function to get annotated data?

Any help would be appreciated!

Below is my sessionInfo():

R version 4.2.3 (2023-03-15 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8
[2] LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8

attached base packages:
[1] stats4 parallel stats graphics grDevices utils
[7] datasets methods base

other attached packages:
[1] BiocParallel_1.32.6 minfi_1.44.0
[3] bumphunter_1.40.0 locfit_1.5-9.7
[5] iterators_1.0.14 foreach_1.5.2
[7] Biostrings_2.66.0 XVector_0.38.0
[9] SummarizedExperiment_1.28.0 Biobase_2.58.0
[11] GenomicRanges_1.50.2 GenomeInfoDb_1.34.9
[13] IRanges_2.32.0 S4Vectors_0.36.2
[15] MatrixGenerics_1.10.0 matrixStats_0.63.0
[17] pals_1.7 RPMM_1.25
[19] cluster_2.1.4 dplyr_1.1.1
[21] ggplot2_3.4.2 sesame_1.16.1
[23] sesameData_1.16.0 ExperimentHub_2.6.0
[25] AnnotationHub_3.6.0 BiocFileCache_2.6.1
[27] dbplyr_2.3.2 BiocGenerics_0.44.0
[29] DMRcate_2.12.0

loaded via a namespace (and not attached):
[1] utf8_1.2.3
[2] R.utils_2.12.2
[3] tidyselect_1.2.0
[4] RSQLite_2.3.1
[5] AnnotationDbi_1.60.2
[6] htmlwidgets_1.6.2
[7] grid_4.2.3
[8] munsell_0.5.0
[9] codetools_0.2-19
[10] preprocessCore_1.60.2
[11] statmod_1.5.0
[12] interp_1.1-4
[13] withr_2.5.0
[14] colorspace_2.1-0
[15] filelock_1.0.2
[16] knitr_1.42
[17] rstudioapi_0.14
[18] GenomeInfoDbData_1.2.9
[19] bit64_4.0.5
[20] rhdf5_2.42.1
[21] vctrs_0.6.1
[22] generics_0.1.3
[23] xfun_0.38
[24] biovizBase_1.46.0
[25] R6_2.5.1
[26] illuminaio_0.40.0
[27] AnnotationFilter_1.22.0
[28] bitops_1.0-7
[29] rhdf5filters_1.10.1
[30] cachem_1.0.7
[31] reshape_0.8.9
[32] DelayedArray_0.23.2
[33] IlluminaHumanMethylation450kanno.ilmn12.hg19_0.6.1
[34] promises_1.2.0.1
[35] BiocIO_1.8.0
[36] scales_1.2.1
[37] bsseq_1.34.0
[38] nnet_7.3-18
[39] gtable_0.3.3
[40] wheatmap_0.2.0
[41] ensembldb_2.22.0
[42] rlang_1.1.0
[43] genefilter_1.80.3
[44] splines_4.2.3
[45] rtracklayer_1.58.0
[46] lazyeval_0.2.2
[47] DSS_2.46.0
[48] GEOquery_2.66.0
[49] dichromat_2.0-0.1
[50] checkmate_2.1.0
[51] reshape2_1.4.4
[52] BiocManager_1.30.20
[53] yaml_2.3.7
[54] GenomicFeatures_1.50.4
[55] backports_1.4.1
[56] httpuv_1.6.9
[57] Hmisc_5.0-1
[58] tools_4.2.3
[59] nor1mix_1.3-0
[60] ellipsis_0.3.2
[61] RColorBrewer_1.1-3
[62] siggenes_1.72.0
[63] Rcpp_1.0.10
[64] plyr_1.8.8
[65] base64enc_0.1-3
[66] sparseMatrixStats_1.10.0
[67] progress_1.2.2
[68] zlibbioc_1.44.0
[69] purrr_1.0.1
[70] RCurl_1.98-1.12
[71] prettyunits_1.1.1
[72] rpart_4.1.19
[73] openssl_2.0.6
[74] deldir_1.0-6
[75] magrittr_2.0.3
[76] data.table_1.14.8
[77] IlluminaHumanMethylationEPICanno.ilm10b4.hg19_0.6.0
[78] ProtGenerics_1.30.0
[79] missMethyl_1.32.1
[80] hms_1.1.3
[81] mime_0.12
[82] evaluate_0.20
[83] xtable_1.8-4
[84] XML_3.99-0.14
[85] jpeg_0.1-10
[86] mclust_6.0.0
[87] gridExtra_2.3
[88] compiler_4.2.3
[89] biomaRt_2.54.1
[90] maps_3.4.1
[91] tibble_3.2.1
[92] crayon_1.5.2
[93] R.oo_1.25.0
[94] htmltools_0.5.5
[95] later_1.3.0
[96] tzdb_0.3.0
[97] Formula_1.2-5
[98] tidyr_1.3.0
[99] DBI_1.1.3
[100] MASS_7.3-58.2
[101] rappdirs_0.3.3
[102] Matrix_1.5-3
[103] readr_2.1.4
[104] permute_0.9-7
[105] cli_3.6.1
[106] quadprog_1.5-8
[107] R.methodsS3_1.8.2
[108] Gviz_1.42.1
[109] pkgconfig_2.0.3
[110] GenomicAlignments_1.34.1
[111] foreign_0.8-84
[112] xml2_1.3.3
[113] annotate_1.76.0
[114] rngtools_1.5.2
[115] multtest_2.54.0
[116] beanplot_1.3.1
[117] doRNG_1.8.6
[118] scrime_1.3.5
[119] stringr_1.5.0
[120] VariantAnnotation_1.44.1
[121] digest_0.6.31
[122] rmarkdown_2.21
[123] base64_2.0.1
[124] htmlTable_2.4.1
[125] edgeR_3.40.2
[126] DelayedMatrixStats_1.20.0
[127] restfulr_0.0.15
[128] curl_5.0.0
[129] shiny_1.7.4
[130] Rsamtools_2.14.0
[131] gtools_3.9.4
[132] rjson_0.2.21
[133] lifecycle_1.0.3
[134] nlme_3.1-162
[135] Rhdf5lib_1.20.0
[136] mapproj_1.2.11
[137] askpass_1.1
[138] limma_3.54.2
[139] BSgenome_1.66.3
[140] fansi_1.0.4
[141] pillar_1.9.0
[142] lattice_0.20-45
[143] KEGGREST_1.38.0
[144] fastmap_1.1.1
[145] httr_1.4.5
[146] survival_3.5-3
[147] interactiveDisplayBase_1.36.0
[148] glue_1.6.2
[149] png_0.1-8
[150] BiocVersion_3.16.0
[151] bit_4.0.5
[152] stringi_1.7.12
[153] HDF5Array_1.26.0
[154] blob_1.2.4
[155] org.Hs.eg.db_3.16.0
[156] latticeExtra_0.6-30
[157] memoise_2.0.1

zwdzwd commented 1 year ago

The function attachManifest was removed because BioC discourages data retrieval from external hosts for obvious reasons. The "sesameAnno_get"-based functions may still work, but they will all become obsolete in the future.

Please use the following code for your purposes:

tsv_path <- "~/Downloads/MM285.mm10.manifest.tsv.gz"
addr <- sesameAnno_buildAddressFile(tsv_path)
openSesame(..., mft = addr)

manifest <- sesameAnno_buildManifestGRanges(tsv_path) # for mapping location annotation

Hope this helps

mininicovd commented 1 year ago

Thank you very much for your quick response. It allowed me to make progress in my analysis, but I am stuck again.

I created a RangedSummarizedExperiment object from my betas matrix, my sample metadata, and the GRanges extracted from the manifest. I am not sure the GRanges were necessary; a plain SummarizedExperiment might have sufficed. I then applied the DML() function to obtain a DMLSummary object with the contrasts "Sample_Group", "Sex" and "Generation", and applied summaryExtractTest(smry) to get the results as a "tbl_df" object with all the statistical tests. I was able to investigate all questions related to these contrasts, but I am having trouble using the genomic ranges.

For instance, I tried to inspect which chromosomes carry the most sex-associated CpGs. Instead of the attachManifest() function, I used the following code, where lookup_table is the GRanges object converted to a data frame:

res %>% filter(Est_SexMale > 0.1, Pval_SexMale > 0.01) %>% rownames_to_column %>% left_join(lookup_table, by = "Probe_ID") %>% with(table(seqnames))

It didn't raise an error, but the results don't seem correct: only two probes are located on the X chromosome.
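
For reference, here is a minimal sketch of the lookup-table join described above, written in base R on toy data so it runs without sesame or dplyr. Everything in it is an assumption made for illustration: the probe IDs, estimates, p-values and chromosome assignments are invented, and `hits`, `annotated` and `chrom_counts` are hypothetical names. The p-value filter keeps small p-values (`< 0.01`), which is a guess at the intent and differs from the `> 0.01` in the snippet above.

```r
# Toy test results: one row per probe (all values are invented for illustration).
res <- data.frame(
  Probe_ID     = c("cg_a", "cg_b", "cg_c", "cg_d"),
  Est_SexMale  = c(0.25, 0.30, 0.02, 0.40),
  Pval_SexMale = c(0.001, 0.0005, 0.5, 0.2),
  stringsAsFactors = FALSE
)

# Toy lookup table, standing in for the manifest GRanges converted to a data frame.
lookup_table <- data.frame(
  Probe_ID = c("cg_a", "cg_b", "cg_c", "cg_d"),
  seqnames = c("chrX", "chrX", "chr1", "chr2"),
  stringsAsFactors = FALSE
)

# Keep probes with a sizeable effect AND a small p-value (assumed intent).
hits <- res[res$Est_SexMale > 0.1 & res$Pval_SexMale < 0.01, ]

# Left join on Probe_ID, then count the selected probes per chromosome.
annotated <- merge(hits, lookup_table, by = "Probe_ID", all.x = TRUE)
chrom_counts <- table(annotated$seqnames)
print(chrom_counts)
```

Joining on an explicit `Probe_ID` column in both tables avoids depending on row names, which a "tbl_df" does not carry.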