Open mattgalbraith opened 5 years ago
Thank you for the feedback. Could you show the first 10 lines of your $ANNOTATION_BED? I suspect this might be a Mac-specific issue. Have you tried running it on a Linux system?
Best,
Tinyi
On Wed, May 22, 2019 at 12:24 AM mattgalbraith notifications@github.com wrote:
When running tfTarget via run_tfTarget.bsh with the following command:
bash run_tfTarget.bsh \
  -query $TREATMENT_SAMPLES \
  -control $CONTROL_SAMPLES \
  -bigWig.path $BIGWIG_PATH \
  -prefix gencode_test \
  -TRE.path $TRE_MERGED_BED \
  -gene.path $ANNOTATION_BED \
  -2bit.path $HG19_2BIT \
  -pval.up 0.1 \
  -pval.down 0.1 \
  -ncores 3 \
  -dist 50000 \
  -closest.N 2 \
  -pval.gene 0.1
I am getting the following error:
[1] "associating TFs to TREs and genes"
awk: syntax error at source line 1
 context is
	BEGIN{OFS=" "} {print >>> $1,$6== <<<
awk: illegal statement at source line 1
awk: illegal statement at source line 1
Error in $<-.data.frame(tmp, "closest.N", value = c(1L, 2L, 1L, :
  replacement has 36 rows, data has 37
Calls: mapTF -> get.proximal.genes -> $<- -> $<-.data.frame
Execution halted
This appears to be related to the awk command at lines 18-20 or 43-45 of mapTF.R
R session info with tfTarget loaded:
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.14.4

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] tfTarget_1.0

loaded via a namespace (and not attached):
[1] bitops_1.0-6 matrixStats_0.54.0 rtfbsdb_0.4.5
[4] bit64_0.9-7 RColorBrewer_1.1-2 GenomeInfoDb_1.18.1
[7] tools_3.5.1 backports_1.1.3 R6_2.3.0
[10] KernSmooth_2.23-15 rpart_4.1-13 sm_2.2-5.4
[13] Hmisc_4.1-1 DBI_1.0.0 lazyeval_0.2.1
[16] BiocGenerics_0.28.0 colorspace_1.3-2 nnet_7.3-12
[19] tidyselect_0.2.5 gridExtra_2.3 DESeq2_1.22.1
[22] bit_1.1-14 compiler_3.5.1 Biobase_2.42.0
[25] htmlTable_1.12 DelayedArray_0.8.0 rphast_1.6.9
[28] caTools_1.17.1.1 scales_1.0.0 checkmate_1.8.5
[31] genefilter_1.64.0 stringr_1.3.1 apcluster_1.4.7
[34] digest_0.6.18 foreign_0.8-71 XVector_0.22.0
[37] vioplot_0.3.0 base64enc_0.1-3 pkgconfig_2.0.2
[40] htmltools_0.3.6 htmlwidgets_1.3 rlang_0.3.0.1
[43] rstudioapi_0.8 RSQLite_2.1.1 bindr_0.1.1
[46] zoo_1.8-5 BiocParallel_1.16.5 bigWig_0.2-9
[49] gtools_3.8.1 acepack_1.4.1 dplyr_0.7.8
[52] RCurl_1.95-4.11 magrittr_1.5 GenomeInfoDbData_1.2.0
[55] Formula_1.2-3 Matrix_1.2-15 Rcpp_1.0.0
[58] munsell_0.5.0 S4Vectors_0.20.1 stringi_1.2.4
[61] yaml_2.2.0 rtfbs_0.3.9 SummarizedExperiment_1.12.0
[64] zlibbioc_1.28.0 gplots_3.0.1 plyr_1.8.4
[67] grid_3.5.1 blob_1.1.1 gdata_2.18.0
[70] parallel_3.5.1 crayon_1.3.4 lattice_0.20-38
[73] splines_3.5.1 annotate_1.60.0 locfit_1.5-9.1
[76] knitr_1.21 pillar_1.3.0 GenomicRanges_1.34.0
[79] geneplotter_1.60.0 stats4_3.5.1 XML_3.98-1.16
[82] glue_1.3.0 latticeExtra_0.6-28 data.table_1.11.8
[85] gtable_0.2.0 purrr_0.2.5 assertthat_0.2.0
[88] ggplot2_3.1.0 xfun_0.4 xtable_1.8-3
[91] survival_2.43-3 tibble_1.4.2 AnnotationDbi_1.44.0
[94] memoise_1.1.0 IRanges_2.16.0 bindrcpp_0.2.2
[97] cluster_2.0.7-1
head ~/Refs/hg19/gencode.v19.annotation.bed
chr1 11868 14412 ENSG00000223972.4 DDX11L1 +
chr1 14362 29806 ENSG00000227232.4 WASH7P -
chr1 29553 31109 ENSG00000243485.2 MIR1302-11 +
chr1 34553 36081 ENSG00000237613.2 FAM138A -
chr1 52472 54936 ENSG00000268020.2 OR4G4P +
chr1 62947 63887 ENSG00000240361.1 OR4G11P +
chr1 69090 70008 ENSG00000186092.4 OR4F5 +
chr1 89294 133566 ENSG00000238009.2 RP11-34P13.7 -
chr1 89550 91105 ENSG00000239945.1 RP11-34P13.8 -
chr1 131024 134836 ENSG00000233750.3 CICP27 +
I was unable to successfully get all the R dependencies installed on our linux system, hence using the Mac.
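A note for anyone else hitting the awk error on macOS: the `>>> $1,$6== <<<` marker in the message suggests that the stock BSD awk is rejecting an unparenthesized comparison inside a print list, which gawk on Linux typically accepts. The exact statement in mapTF.R is not shown in this thread, so the sketch below is only a hypothetical stand-in (the function name `tss_from_bed` and the choice of output columns are assumptions). It illustrates the parenthesized form, which should parse under both implementations, using the 6-column BED layout shown above.

```shell
# Hypothetical stand-in for the failing awk step (NOT the actual code in mapTF.R).
# Input: 6-column BED (chrom, start, end, gene id, gene name, strand).
# Output: chrom, strand-aware TSS coordinate, gene name, strand (tab-separated).
tss_from_bed() {
  # The comparison and the ternary are fully parenthesized so that awk
  # implementations with a strict print grammar (e.g. BSD awk) accept them.
  awk 'BEGIN{OFS="\t"} {print $1, (($6 == "+") ? $2 : $3), $5, $6}' "$1"
}
```

If the statement in mapTF.R has a similar shape, wrapping the comparison in parentheses there, or running with gawk first on the PATH, may avoid the syntax error on macOS. This is a guess based on the error text only.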
I have now managed to get tfTarget and all dependencies running on linux and no longer get the awk error. However, I am now getting a new error:
[1] "associating TFs to TREs and genes"
Error in names(x) <- value :
  'names' attribute [27] must be the same length as the vector [16]
Calls: mapTF -> colnames<-
Execution halted
From looking into the mapTF function, it appears that
TF.TRE.gene.tab.short <- TF.TRE.gene.tab[, -c(1, 13:15)]
is generating a data frame with only 16 columns rather than the 27 suggested by
header.vec <- c("tre.chrom", "tre.chromStart", "tre.chromEnd",
                "tf.chrom", "tf.chromStart", "tf.chromEnd",
                "score", "strand", "motif.name", "motif.id", "motif.idx",
                "TRE.baseMean", "TRE.log2FoldChange", "TRE.pvalue", "TRE.padj",
                "gene.TSS.chr", "gene.TSS.start", "gene.TSS.end",
                "transcript.id", "gene.name", "gene.strand",
                "gene.baseMean", "gene.log2FoldChange", "gene.pvalue", "gene.padj",
                "distance")
if (!is.null(closest.N)) header.vec <- c(header.vec, "closest.N")
colnames(TF.TRE.gene.tab.short) <- header.vec
I will try running the R commands manually to see if I can track this down any further...
For reference: The last error was caused by an empty TF.TRE.gene.tab object due to the stringency of settings used.
Thank you for your feedback. I suspect this is caused by a lack of statistical power (as determined by DESeq2) when fewer than 2 replicates are used for each condition.
I am having the same issue. Did running the R commands manually resolve it?
Best regards