Olink-Proteomics / OlinkRPackage

Olink R package: A collection of functions to facilitate analysis of proteomic data from Olink. The goal of this package is to help users extract biological insights from proteomic data run on the Olink platform.
GNU Affero General Public License v3.0

Warning while carrying out the bridge normalization between 2 datasets #389

Closed ashishjain1988 closed 3 months ago

ashishjain1988 commented 3 months ago

Describe the bug: I am trying to normalize two datasets through bridge normalization using the olink_normalization function. While running the normalization I get this warning:

Warning message: In olink_normalization(df1 = data_Batch1, df2 = data_Batch2, overlapping_samples_df1 = bridge_samples, : There are 92 assays not normalized with the same approach. Consider renormalizing.

The data I received from Olink were normalized using the IPC Normalized and Intensity Normalized (v. 2) methods. Can you please help me integrate these datasets? We have only the normalized data in the Excel files provided by Olink. Also, is there any way to renormalize them together?

To Reproduce Steps to reproduce the behavior:

bridge_normalized_data <- olink_normalization(df1 = data_Batch1, df2 = data_Batch2, overlapping_samples_df1 = bridge_samples, df1_project_nr = "Batch-1", df2_project_nr = "Batch-2", reference_project = "Batch-1")

Expected behavior The function should run without any warning

System Information:

R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.5

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] grid stats4 stats graphics grDevices utils datasets methods base

other attached packages:
[1] limma_3.56.2 openxlsx_4.2.5.2 ggvenn_0.1.10 dplyr_1.1.4
[5] stringr_1.5.1 devtools_2.4.5 usethis_2.2.3 EnhancedVolcano_1.18.0
[9] ggrepel_0.9.5 ggplot2_3.5.1 org.Hs.eg.db_3.17.0 AnnotationDbi_1.62.2
[13] magrittr_2.0.3 data.table_1.15.4 pheatmap_1.0.12 DESeq2_1.40.2
[17] SummarizedExperiment_1.30.2 Biobase_2.60.0 MatrixGenerics_1.12.3 matrixStats_1.3.0
[21] GenomicRanges_1.52.1 GenomeInfoDb_1.36.4 IRanges_2.34.1 S4Vectors_0.38.2
[25] BiocGenerics_0.46.0 OlinkAnalyze_3.7.0

loaded via a namespace (and not attached):
[1] R.methodsS3_1.8.2 SpatialExperiment_1.10.0 coin_1.4-3 urlchecker_1.0.1
[5] goftest_1.2-3 HDF5Array_1.28.1 Biostrings_2.68.1 TH.data_1.1-2
[9] vctrs_0.6.5 spatstat.random_3.2-3 digest_0.6.35 png_0.1-8
[13] shape_1.4.6.1 plyranges_1.19.0 deldir_2.0-4 parallelly_1.37.1
[17] magick_2.8.3 MASS_7.3-60 reshape2_1.4.4 rematch_2.0.0
[21] httpuv_1.6.15 foreach_1.5.2 qvalue_2.32.0 withr_3.0.0
[25] xfun_0.43 ggfun_0.1.4 ellipsis_0.3.2 ggpubr_0.6.0
[29] survival_3.6-4 memoise_2.0.1 emmeans_1.10.2 clusterProfiler_4.8.2
[33] gson_0.1.0 profvis_0.3.8 systemfonts_1.0.6 ragg_1.3.1
[37] tidytree_0.4.6 zoo_1.8-12 GlobalOptions_0.1.2 gtools_3.9.5
[41] pbapply_1.7-2 argparse_2.2.3 R.oo_1.26.0 KEGGREST_1.40.1
[45] promises_1.3.0 httr_1.4.7 downloader_0.4 restfulr_0.0.15
[49] rstatix_0.7.2 rhdf5filters_1.13.2 globals_0.16.3 fitdistrplus_1.1-11
[53] rhdf5_2.46.0 rstudioapi_0.16.0 miniUI_0.1.1.1 generics_0.1.3
[57] DOSE_3.26.2 zlibbioc_1.46.0 ggraph_2.2.1 polyclip_1.10-6
[61] GenomeInfoDbData_1.2.10 SparseArray_1.2.2 xtable_1.8-4 doParallel_1.0.17
[65] evaluate_0.23 S4Arrays_1.2.0 hms_1.1.3 irlba_2.3.5.1
[69] colorspace_2.1-0 ROCR_1.0-11 readxl_1.4.3 reticulate_1.36.1
[73] treemap_2.4-4 spatstat.data_3.0-4 lmtest_0.9-40 readr_2.1.5
[77] later_1.3.2 viridis_0.6.5 modeltools_0.2-23 ggtree_3.8.2
[81] lattice_0.22-6 spatstat.geom_3.2-9 future.apply_1.11.2 scuttle_1.10.3
[85] XML_3.99-0.16.1 scattermore_1.2 shadowtext_0.1.3 cowplot_1.1.3
[89] RcppAnnoy_0.0.22 pillar_1.9.0 nlme_3.1-164 iterators_1.0.14
[93] beachmat_2.16.0 gridBase_0.4-7 caTools_1.18.2 compiler_4.3.1
[97] RSpectra_0.16-1 stringi_1.8.4 tensor_1.5 minqa_1.2.6
[101] GenomicAlignments_1.38.0 plyr_1.8.9 BiocIO_1.12.0 crayon_1.5.2
[105] abind_1.4-5 gridGraphics_0.5-1 locfit_1.5-9.9 sp_2.1-4
[109] graphlayouts_1.1.1 bit_4.0.5 sandwich_3.1-0 libcoin_1.0-10
[113] fastmatch_1.1-4 textshaping_0.3.7 fastcluster_1.2.6 codetools_0.2-20
[117] multcomp_1.4-25 GetoptLong_1.0.5 plotly_4.10.4 mime_0.12
[121] splines_4.3.1 circlize_0.4.16 Rcpp_1.0.12 fastDummies_1.7.3
[125] sparseMatrixStats_1.12.2 HDO.db_0.99.1 cellranger_1.1.0 knitr_1.45
[129] blob_1.2.4 utf8_1.2.4 here_1.0.1 clue_0.3-65
[133] lme4_1.1-35.3 fs_1.6.4 listenv_0.9.1 DelayedMatrixStats_1.22.6
[137] pkgbuild_1.4.4 estimability_1.5.1 ggsignif_0.6.4 ggplotify_0.1.2
[141] tibble_3.2.1 Matrix_1.6-5 tzdb_0.4.0 tweenr_2.0.3
[145] phyclust_0.1-34 pkgconfig_2.0.3 tools_4.3.1 cachem_1.0.8
[149] RSQLite_2.3.6 numDeriv_2016.8-1.1 viridisLite_0.4.2 DBI_1.2.2
[153] fastmap_1.1.1 rmarkdown_2.26 scales_1.3.0 ica_1.0-3
[157] Seurat_5.0.3 Rsamtools_2.16.0 broom_1.0.5 patchwork_1.2.0
[161] coda_0.19-4.1 dotCall64_1.1-1 carData_3.0-5 RANN_2.6.1
[165] farver_2.1.1 tidygraph_1.3.1 scatterpie_0.2.2 yaml_2.3.8
[169] rtracklayer_1.62.0 cli_3.6.2 purrr_1.0.2 leiden_0.4.3.1
[173] lifecycle_1.0.4 uwot_0.1.16 mvtnorm_1.2-4 sessioninfo_1.2.2
[177] lambda.r_1.2.4 backports_1.4.1 DropletUtils_1.20.0 BiocParallel_1.34.2
[181] gtable_0.3.5 rjson_0.2.21 ggridges_0.5.6 progressr_0.14.0
[185] parallel_4.3.1 ape_5.8 jsonlite_1.8.8 edgeR_3.42.4
[189] RcppHNSW_0.6.0 bitops_1.0-7 bit64_4.0.5 Rtsne_0.17
[193] yulab.utils_0.1.4 spatstat.utils_3.0-4 zip_2.3.1 SeuratObject_5.0.1
[197] RcppParallel_5.1.7 futile.options_1.0.1 dqrng_0.3.2 GOSemSim_2.26.1
[201] R.utils_2.12.3 lazyeval_0.2.2 shiny_1.8.1.1 htmltools_0.5.8.1
[205] enrichplot_1.20.0 GO.db_3.17.0 sctransform_0.4.1 formatR_1.14
[209] glue_1.7.0 spam_2.10-0 XVector_0.40.0 RCurl_1.98-1.14
[213] rprojroot_2.0.4 treeio_1.24.3 futile.logger_1.4.3 gridExtra_2.3
[217] boot_1.3-30 igraph_2.0.3 R6_2.5.1 tidyr_1.3.1
[221] SingleCellExperiment_1.22.0 gplots_3.1.3.1 labeling_0.4.3 forcats_1.0.0
[225] cluster_2.1.6 pkgload_1.3.4 Rhdf5lib_1.24.0 aplot_0.2.2
[229] nloptr_2.0.3 DelayedArray_0.26.7 tidyselect_1.2.1 ggforce_0.4.2
[233] car_3.1-2 future_1.33.2 munsell_0.5.1 KernSmooth_2.23-22
[237] htmlwidgets_1.6.4 fgsea_1.26.0 ComplexHeatmap_2.16.0 RColorBrewer_1.1-3
[241] rlang_1.1.3 spatstat.sparse_3.0-3 spatstat.explore_3.2-7 remotes_2.5.0
[245] lmerTest_3.1-3 fansi_1.0.6 parallelDist_0.2.6

kathy-nevola commented 3 months ago

Hi @ashishjain1988,

This warning occurs because the two datasets you are using have different normalization types (one is IPC Normalized and the other is Intensity Normalized), which can make the adjustment factors smaller or larger than they would otherwise be. If possible, you can intensity normalize the IPC normalized dataset. Alternatively, you can use the function as is, keeping in mind that the adjustment factors could be larger or smaller than expected and cannot be used to judge the quality of the bridging. If you would like to discuss your project in particular, please reach out to Support@olink.com.
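To see which assays differ in normalization type between two datasets, one option is to compare the per-assay normalization labels directly. The snippet below is a minimal base R sketch, not the check that olink_normalization itself performs. It assumes each dataset is in long format with `OlinkID` and `Normalization` columns; the toy data frames `batch1` and `batch2` are hypothetical stand-ins for `data_Batch1` and `data_Batch2`.

```r
# Toy stand-ins for the two datasets (hypothetical values).
# Assumption: real NPX tables carry OlinkID and Normalization columns.
batch1 <- data.frame(
  OlinkID = c("OID1", "OID2", "OID3"),
  Normalization = "IPC Normalized",
  stringsAsFactors = FALSE
)
batch2 <- data.frame(
  OlinkID = c("OID1", "OID2", "OID3"),
  Normalization = c("Intensity Normalized (v. 2)",
                    "Intensity Normalized (v. 2)",
                    "IPC Normalized"),
  stringsAsFactors = FALSE
)

# One row per assay with the normalization label from each dataset.
norm_types <- merge(
  unique(batch1[, c("OlinkID", "Normalization")]),
  unique(batch2[, c("OlinkID", "Normalization")]),
  by = "OlinkID",
  suffixes = c("_df1", "_df2")
)

# Assays whose normalization label differs between the two datasets.
mismatched <- norm_types[
  norm_types$Normalization_df1 != norm_types$Normalization_df2, ]
print(mismatched)
```

If the mismatch table is empty, the two datasets share a normalization type for every common assay.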

ashishjain1988 commented 3 months ago

Hi @kathy-nevola,

Thank you for the quick response and your help. Is there any function in the package that I can use to intensity normalize the IPC normalized data?

kathy-nevola commented 3 months ago

Hi @ashishjain1988,

We don't have a function for applying normalization within a single dataset. I recommend checking out the special case of subset normalization detailed in the olink_normalization documentation and applying it across plates: https://rdrr.io/cran/OlinkAnalyze/man/olink_normalization.html Alternatively, the method for calculating intensity normalization is detailed in this white paper: https://olink.com/application/data-normalization-and-standardization/

ashishjain1988 commented 3 months ago

Thank you for your help!

gipatelli commented 1 month ago

Dear @kathy-nevola, I am dealing with the exact same issue. My first batch was a pilot (1 plate) and was normalized using plate control. My new batch, a 2-plate project, underwent intensity normalization instead. Now I get the warning. Of course I would like to renormalize, but I am not sure how to use the olink_normalization R function to do so. Could you be a little more specific?

Also, @ashishjain1988, I would deeply appreciate it if you could share whether you worked it out.

Thank you so much both of you, G

ashishjain1988 commented 1 month ago

Hi @gipatelli,

I emailed the vendor that ran our Olink plate and asked them to rerun the normalization on the earlier dataset.

gipatelli commented 4 weeks ago

Thank you @ashishjain1988. I guess waiting for Olink support is the only way. Bests.

kathy-nevola commented 4 weeks ago

@gipatelli, at this point intensity normalization is not possible for only one project using Olink normalizer. You can either get the data from the first package re-exported as intensity normalized, or intensity normalize the data yourself using the method detailed in our normalization white paper (available at https://olink.com/knowledge/documents, use the search term normalization). In brief, the white paper describes intensity normalization as:

  1. For each assay, calculate the overall median value for all samples and plates.
  2. For each plate and assay, calculate the plate-specific median value.
  3. For each assay, subtract the plate-specific median from every value on that plate (equals centralizing to median 0).

Depending on the product and when your data was generated, there may be an additional step:

  1. For each assay, add the overall median value (equals centralizing to the overall median).
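
The steps above can be sketched in base R as follows. This is an illustrative sketch under stated assumptions, not Olink's implementation: it assumes a long-format table with columns `OlinkID` (assay), `PlateID` (plate) and `NPX` (value), and the function name `intensity_normalize` and its `add_overall_median` argument are hypothetical. It also does not exclude control samples or samples with QC warnings, which a real analysis may need to handle.

```r
# Hypothetical helper implementing the white-paper steps in base R.
# Assumed columns: OlinkID (assay), PlateID (plate), NPX (value).
intensity_normalize <- function(npx, add_overall_median = FALSE) {
  med <- function(x) median(x, na.rm = TRUE)

  # Step 1: overall median per assay across all samples and plates.
  overall_median <- ave(npx$NPX, npx$OlinkID, FUN = med)

  # Step 2: plate-specific median per assay.
  plate_median <- ave(npx$NPX, npx$OlinkID, npx$PlateID, FUN = med)

  # Step 3: subtract the plate-specific median (centers each plate at 0).
  npx$NPX_norm <- npx$NPX - plate_median

  # Optional additional step: add the overall median back.
  if (add_overall_median) {
    npx$NPX_norm <- npx$NPX_norm + overall_median
  }
  npx
}

# Toy data (made-up values): 2 assays, 2 plates, 3 samples per plate.
toy <- data.frame(
  OlinkID = rep(c("OID1", "OID2"), each = 6),
  PlateID = rep(rep(c("Plate1", "Plate2"), each = 3), times = 2),
  NPX     = c(1, 2, 3, 4, 5, 6, 10, 11, 12, 10, 11, 12),
  stringsAsFactors = FALSE
)

centered <- intensity_normalize(toy)
shifted  <- intensity_normalize(toy, add_overall_median = TRUE)

# Per-assay, per-plate medians after each variant.
print(tapply(centered$NPX_norm, list(centered$OlinkID, centered$PlateID), median))
print(tapply(shifted$NPX_norm, list(shifted$OlinkID, shifted$PlateID), median))
```

After steps 1 to 3, every plate has median 0 for each assay; with the additional step, every plate has the assay's overall median instead.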

gipatelli commented 4 weeks ago

Dear @kathy-nevola, thank you for your quick response.

I've reviewed the white paper, but I'm still unclear on how to apply intensity normalization using the method described to my case. Given that I have only one plate in my first batch, the overall median assay values are identical to the plate-specific median values. Thus, subtracting the plate-specific median from every value and then adding the overall median results in no change, as you subtract what you add.
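
The single-plate arithmetic can be checked on made-up numbers. This is only a small base R illustration with hypothetical values for one assay on one plate; it shows that subtracting the plate median and then adding the identical overall median returns the original values, while the subtraction on its own centers the values at 0.

```r
# Made-up NPX values for one assay measured on a single plate.
x <- c(3.1, 4.7, 5.2, 6.0, 2.9)

# With one plate, the plate-specific and overall medians coincide.
plate_median   <- median(x)
overall_median <- median(x)

# Subtracting the plate median centers the values at 0.
centered <- x - plate_median

# Adding the overall median back undoes the shift.
restored <- centered + overall_median

print(median(centered))
print(all.equal(restored, x))
```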

I apologize if I've overlooked something, and I hope you can provide further clarification.

Additionally, I'm interested in having the data from the first package re-exported as intensity normalized, and I'm currently awaiting a response from Olink support. Is there any way I can move on?

Bests, thank you again