EMBL-Hentze-group / DEWSeq

R/Bioconductor package for e/iCLIP data analysis

function not found #4

Closed duanjunling closed 5 months ago

duanjunling commented 2 years ago

Hello! After I run the code in your R package, I get the following error:

Error in DESeqDataSetFromSlidingWindows(countData = count_matrix, colData = col_data, : There is no "DESeqDataSetFromSlidingWindows" function.

The source code is as follows:

ddw <- DESeqDataSetFromSlidingWindows(countData=count_matrix, colData=col_data, annotObj=annotation_file, design=~-type)

How can this problem be solved? I look forward to your answer. Thank you!

duanjunling commented 2 years ago

I tried using help(DESeqDataSetFromSlidingWindows) to find the package where this function is located, but didn't find it.

tschwarzl commented 2 years ago

Dear Duanjunling,

Can you please confirm that you have successfully loaded the DEWSeq library with require(DEWSeq) before calling DESeqDataSetFromSlidingWindows?

Thank you very much.

duanjunling commented 2 years ago

Thank you very much for your reply! The problem is solved. I did not install the DEWSeq package successfully.
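For readers who hit the same error: require() only returns FALSE with a warning when a package is missing, so a failed installation is easy to overlook. The following is a minimal sketch (not from the package authors) of one way to check that the package is installed and that the function is visible before calling it. The BiocManager hint in the message assumes a standard Bioconductor setup.

```r
# Check whether DEWSeq is installed before trying to attach it.
pkg <- "DEWSeq"
installed <- requireNamespace(pkg, quietly = TRUE)

if (!installed) {
  # Assumption: the package is installed from Bioconductor via BiocManager.
  message(pkg, " is not installed; try BiocManager::install(\"", pkg, "\")")
} else {
  library(pkg, character.only = TRUE)
  # TRUE once the function is visible on the search path
  print(exists("DESeqDataSetFromSlidingWindows", mode = "function"))
}
```

If the last line prints TRUE, the "function not found" error should no longer appear.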

tschwarzl commented 2 years ago

You are very welcome. Good luck with the analysis.

duanjunling commented 2 years ago

Hello! I'm trying to run the data from your document and the following problem arises during processing. The running code and warning message are as follows:

resultRegions <- extractRegions(windowRes=resultWindows,padjCol="p_adj_IHW", padjThresh=0.05, log2FoldChangeThresh=1) %>% as_tibble

Warning messages:
1: In extractRegions(windowRes = resultWindows, padjCol = "p_adj_IHW", :
  windowRes is a data.table or tibble object, converting it to data.frame
2: In extractRegions(windowRes = resultWindows, padjCol = "p_adj_IHW", :
  There are no significant windows/regions under the current threshold! Please lower your significance cut-off thresholds and manually check if there are any significant windows under the threshold.

I sincerely look forward to your answer! Thanks again! A beginner

sudeepsahadevan commented 2 years ago

Hi @duanjunling,

You don't need to worry about the first warning, it's just a format conversion warning.

The second warning says that under the current thresholds, i.e. padjThresh=0.05 and log2FoldChangeThresh=1, there are no enriched windows in the dataset. The simplest answer would be to relax the thresholds, for example padjThresh=0.1 and log2FoldChangeThresh=1, but this is just a quick fix, and the warning might also be hinting at the noise levels in the dataset you are working with. So that we can give you a clearer explanation, please explain briefly whether you are using an existing dataset or an in-house dataset of your own, and list all the steps that you ran before you reached this point:

resultRegions <- extractRegions(windowRes=resultWindows,padjCol="p_adj_IHW", padjThresh=0.05, log2FoldChangeThresh=1)
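The manual check that the warning asks for can be sketched as below. This is an illustration only: toy_windows is a made-up stand-in for resultWindows, count_passing is a hypothetical helper, and the use of `<=` and `>=` is an assumption, since the thread does not say which comparisons extractRegions applies internally.

```r
# Toy stand-in for resultWindows; all values are made up for illustration.
toy_windows <- data.frame(
  unique_id      = paste0("w", 1:5),
  log2FoldChange = c(2.1, 0.4, 1.5, -1.2, 3.0),
  p_adj_IHW      = c(0.01, 0.03, 0.08, 0.02, NA)
)

# Hypothetical helper: count windows passing both cut-offs.
# Windows with an NA adjusted p-value are treated as not significant.
count_passing <- function(res, padj_col, padj_thresh, lfc_thresh) {
  keep <- !is.na(res[[padj_col]]) &
    res[[padj_col]] <= padj_thresh &
    res[["log2FoldChange"]] >= lfc_thresh
  sum(keep)
}

strict  <- count_passing(toy_windows, "p_adj_IHW", 0.05, 1)
relaxed <- count_passing(toy_windows, "p_adj_IHW", 0.1, 1)
print(c(strict = strict, relaxed = relaxed))
```

Running the same count on the real resultWindows shows directly whether any window survives a given pair of thresholds, independent of the region-merging step.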

duanjunling commented 2 years ago

Hello! Thanks a lot for your answer! After modifying the threshold, the "warning message" is still not resolved. The dataset I am currently dealing with is the dataset in your literature! Desperately hoping that this "warning message" can be resolved. Once again, sincerely hope to hear from you!

duanjunling commented 2 years ago

Addendum: the result of this step is empty:

resultRegions <- extractRegions(windowRes=resultWindows, padjCol="p_adj_IHW", padjThresh=0.1, log2FoldChangeThresh=1) %>% as_tibble

So this warning worries me!

tschwarzl commented 2 years ago

Can you please provide us with a

sessionInfo()

we would like to check the versions of the packages installed to reproduce the error.

Thank you very much

duanjunling commented 2 years ago

The session information is as follows:

R version 4.1.3 (2022-03-10)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS/LAPACK: /home/li/anwser/jl/miniconda3/envs/htseq-clip/lib/libopenblasp-r0.3.20.so

locale:
[1] LC_CTYPE=zh_CN.UTF-8 LC_NUMERIC=C
[3] LC_TIME=zh_CN.UTF-8 LC_COLLATE=zh_CN.UTF-8
[5] LC_MONETARY=zh_CN.UTF-8 LC_MESSAGES=zh_CN.UTF-8
[7] LC_PAPER=zh_CN.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=zh_CN.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats4 stats graphics grDevices utils datasets methods
[8] base

other attached packages:
[1] DEWSeq_1.8.0 BiocParallel_1.28.3
[3] DESeq2_1.34.0 SummarizedExperiment_1.24.0
[5] Biobase_2.54.0 MatrixGenerics_1.6.0
[7] matrixStats_0.62.0 GenomicRanges_1.46.1
[9] GenomeInfoDb_1.30.1 IRanges_2.28.0
[11] S4Vectors_0.32.4 BiocGenerics_0.40.0
[13] R.utils_2.11.0 R.oo_1.24.0
[15] R.methodsS3_1.8.1 IHW_1.22.0
[17] ggrepel_0.9.1 forcats_0.5.1
[19] stringr_1.4.0 dplyr_1.0.8
[21] purrr_0.3.4 readr_2.1.2
[23] tidyr_1.2.0 tibble_3.1.6
[25] ggplot2_3.3.5 tidyverse_1.3.1
[27] data.table_1.14.2

loaded via a namespace (and not attached):
[1] bitops_1.0-7 fs_1.5.2 lubridate_1.8.0
[4] bit64_4.0.5 RColorBrewer_1.1-3 httr_1.4.2
[7] tools_4.1.3 backports_1.4.1 utf8_1.2.2
[10] R6_2.5.1 DBI_1.1.2 colorspace_2.0-3
[13] withr_2.5.0 tidyselect_1.1.2 bit_4.0.4
[16] compiler_4.1.3 fdrtool_1.2.17 cli_3.2.0
[19] rvest_1.0.2 xml2_1.3.3 DelayedArray_0.20.0
[22] slam_0.1-50 scales_1.2.0 genefilter_1.76.0
[25] XVector_0.34.0 pkgconfig_2.0.3 lpsymphony_1.22.0
[28] fastmap_1.1.0 dbplyr_2.1.1 rlang_1.0.2
[31] readxl_1.4.0 rstudioapi_0.13 RSQLite_2.2.12
[34] generics_0.1.2 jsonlite_1.8.0 RCurl_1.98-1.6
[37] magrittr_2.0.3 GenomeInfoDbData_1.2.7 Matrix_1.4-1
[40] Rcpp_1.0.8.3 munsell_0.5.0 fansi_1.0.3
[43] lifecycle_1.0.1 stringi_1.7.6 zlibbioc_1.40.0
[46] grid_4.1.3 blob_1.2.3 parallel_4.1.3
[49] crayon_1.5.1 lattice_0.20-45 splines_4.1.3
[52] Biostrings_2.62.0 haven_2.5.0 annotate_1.72.0
[55] KEGGREST_1.34.0 hms_1.1.1 locfit_1.5-9.5
[58] pillar_1.7.0 geneplotter_1.72.0 reprex_2.0.1
[61] XML_3.99-0.9 glue_1.6.2 modelr_0.1.8
[64] png_0.1-7 vctrs_0.4.1 tzdb_0.3.0
[67] cellranger_1.1.0 gtable_0.3.0 assertthat_0.2.1
[70] cachem_1.0.6 xtable_1.8-4 broom_0.8.0
[73] survival_3.3-1 AnnotationDbi_1.56.2 memoise_2.0.1
[76] ellipsis_0.3.2

Thank you again!

tschwarzl commented 2 years ago

Thank you very much, we are trying to reproduce the error and will get back to you.

sudeepsahadevan commented 2 years ago

Hi @duanjunling could you please also post the first few lines (say 1-10) from the following files:

  1. from step 2.4.7 create mapping and
  2. step 2.5.3 create count matrix ?

thank you

Distue commented 2 years ago

Thank you very much. And finally, if you could share the code lines up to the point where the problem occurs, that would be fantastic. Thanks

duanjunling @.***> wrote on Thu, 28 Apr 2022, 08:37:

Hi! Step 2.4.7 create mapping The first 10 lines of the input result are as follows:

unique_id ENCFF218ZEI ENCFF511HSJ ENCFF879UID
ENSG00000227232.5:intron0005W00156 0 0 7
ENSG00000227232.5:intron0005W00157 0 0 7
ENSG00000227232.5:intron0005W00158 0 0 6
ENSG00000238009.6:exon0007W00040 0 0 1
ENSG00000238009.6:exon0007W00041 0 0 1
ENSG00000238009.6:exon0007W00042 0 0 1
ENSG00000238009.6:exon0007W00049 0 0 1
ENSG00000238009.6:exon0007W00050 0 0 1
ENSG00000238009.6:exon0007W00051 0 0 1
ENSG00000238009.6:exon0007W00054 0 0 1

thank you


duanjunling commented 2 years ago

Hi!

Step 2.4.7 create mapping first 5 lines of the input result are as follows:

unique_id chromosome begin end strand gene_id gene_name gene_type gene_region Nr_of_region Total_nr_of_region window_number
ENSG00000223972.5:exon0001W00001 chr1 11868 11918 + ENSG00000223972.5 DDX11L1 transcribed_unprocessed_pseudogene exon 1 4 1
ENSG00000223972.5:exon0001W00002 chr1 11888 11938 + ENSG00000223972.5 DDX11L1 transcribed_unprocessed_pseudogene exon 1 4 2
ENSG00000223972.5:exon0001W00003 chr1 11908 11958 + ENSG00000223972.5 DDX11L1 transcribed_unprocessed_pseudogene exon 1 4 3
ENSG00000223972.5:exon0001W00004 chr1 11928 11978 + ENSG00000223972.5 DDX11L1 transcribed_unprocessed_pseudogene exon 1 4 4
ENSG00000223972.5:exon0001W00005 chr1 11948 11998 + ENSG00000223972.5 DDX11L1 transcribed_unprocessed_pseudogene exon 1 4 5
ENSG00000223972.5:exon0001W00006 chr1 11968 12018 + ENSG00000223972.5 DDX11L1

Step 2.5.3 create count matrix: the first 10 lines of the input result are as follows:

unique_id ENCFF218ZEI ENCFF511HSJ ENCFF879UID
ENSG00000227232.5:intron0005W00156 0 0 7
ENSG00000227232.5:intron0005W00157 0 0 7
ENSG00000227232.5:intron0005W00158 0 0 6
ENSG00000238009.6:exon0007W00040 0 0 1
ENSG00000238009.6:exon0007W00041 0 0 1
ENSG00000238009.6:exon0007W00042 0 0 1
ENSG00000238009.6:exon0007W00049 0 0 1
ENSG00000238009.6:exon0007W00050 0 0 1
ENSG00000238009.6:exon0007W00051 0 0 1
ENSG00000238009.6:exon0007W00054 0 0 1

thank you

sudeepsahadevan commented 2 years ago

Thank you @duanjunling, the files look alright. To look at the next few steps, could you please run the steps up to 2.7.4:

ddw <- DESeqDataSetFromSlidingWindows(countData=count_matrix, colData=col_data, annotObj=annotation_file, design=~type)

Then print this variable by typing ddw and post the output?

duanjunling commented 2 years ago

Hi!

ddw
class: DESeqDataSet
dim: 35601 3
metadata(1): version
assays(4): counts mu H cooks
rownames(35601): ENSG00000225630.1:exon0001W00014 ENSG00000225630.1:exon0001W00015 ... ENSG00000210196.2:exon0001W00001 ENSG00000210196.2:exon0001W00002
rowData names(29): unique_id gene_id ... deviance maxCooks
colnames(3): ENCFF218ZEI ENCFF511HSJ ENCFF879UID
colData names(2): type sizeFactor

Thank you very much!

sudeepsahadevan commented 2 years ago

Thanks! Now please repeat the following steps and report the outputs that you get:

ddw <- estimateSizeFactors(ddw)
ddw

ddw <- estimateDispersions(ddw, fitType='local', quiet=TRUE)

ddw <- nbinomLRT(ddw, full = ~type, reduced = ~1)

resultWindows <- resultsDEWSeq(ddw, contrast = c("type", "IP", "SMI"),tidy = TRUE)

resultWindows[,'p_adj'] <- p.adjust(resultWindows$pvalue, method="BH")

resultRegions <- extractRegions(windowRes=resultWindows, padjCol="p_adj", padjThresh=0.05, log2FoldChangeThresh=1) %>% as_tibble

please also post the output of dim(resultRegions)
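For context, the p.adjust step above replaces the IHW-adjusted column with a plain Benjamini-Hochberg (BH) correction of the raw p-values. The sketch below (not from the package authors) uses made-up p-values to show what BH does and checks base R's p.adjust against a manual computation. Note that p.adjust ignores NA p-values when determining the number of tests.

```r
# Made-up raw p-values for illustration.
p_raw <- c(0.001, 0.01, 0.04, 0.5)

# Benjamini-Hochberg adjustment as done by base R.
p_bh <- p.adjust(p_raw, method = "BH")

# Manual BH: scale each p-value by n / rank, then enforce monotonicity
# by taking the running minimum from the largest p-value downwards.
n <- length(p_raw)
ord <- order(p_raw, decreasing = TRUE)
manual <- numeric(n)
manual[ord] <- pmin(1, cummin(p_raw[ord] * n / (n:1)))

print(data.frame(p_raw = p_raw, p_bh = p_bh, manual = manual))
```

Adjusted values are never smaller than the raw ones, so a dataset with few small raw p-values can end up with no window below padjThresh.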

duanjunling commented 2 years ago

Hello! Thanks a lot for your answer! I tried the script you gave and got the following error:

resultRegions <- extractRegions(windowRes=resultWindows, padjCol="p_adj", padjThresh=0.05, log2FoldChangeThresh=1) %>% as_tibble

Warning message:
In extractRegions(windowRes = resultWindows, padjCol = "p_adj", :
  There are no significant windows/regions under the current threshold! Please lower your significance cut-off thresholds and manually check if there are any significant windows under the threshold

sudeepsahadevan commented 2 years ago

Hi, right now I'm not entirely sure where the error in your analysis is coming from. You can try using the parameterized Rmarkdown available here and check whether it reproduces the error. Thank you