YeoLab / skipper

Skip the peaks and expose RNA-binding in CLIP data

Execution halted in fit_clip_betabinom.R #13

Closed byee4 closed 1 year ago

byee4 commented 1 year ago
Rscript --vanilla skipper/bb63a25/bin/skipper/tools/fit_clip_betabinom.R output/counts/genome/tables/HNRNPA1_HepG2_ENCSR769UEW.tsv.gz HNRNPA1_HepG2_ENCSR769UEW HNRNPA1_HepG2_ENCSR769UEW_IP_1
── Attaching core tidyverse packages ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.2     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.2     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.1
── Conflicts ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Warning messages:
1: package ‘tidyverse’ was built under R version 4.2.3
2: package ‘ggplot2’ was built under R version 4.2.3
3: package ‘tibble’ was built under R version 4.2.3
4: package ‘tidyr’ was built under R version 4.2.3
5: package ‘readr’ was built under R version 4.2.3
6: package ‘purrr’ was built under R version 4.2.3
7: package ‘dplyr’ was built under R version 4.2.3
8: package ‘stringr’ was built under R version 4.2.3
9: package ‘forcats’ was built under R version 4.2.3
10: package ‘lubridate’ was built under R version 4.2.3
Rows: 8156422 Columns: 10
── Column specification ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Delimiter: "\t"
chr (2): chr, strand
dbl (8): start, end, name, score, gc, HNRNPA1_HepG2_ENCSR769UEW_IN_1, HNRNPA...

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
Joining with `by = join_by(gc_bin)`
VGLM    linear loop  1 :  loglikelihood = -3610964.73
VGLM    linear loop  2 :  loglikelihood = -3595416.895
Applying Greenstadt modification to 2483989 matrices
VGLM    linear loop  3 :  loglikelihood = -3595416.895
Error in vglm.fitter(x = x, y = y, w = w, offset = offset, Xm2 = Xm2,  : 
  vglm() only handles full-rank models (currently)
Calls: %>% ... list2 -> lapply -> FUN -> <Anonymous> -> vglm.fitter
In addition: Warning message:
In checkwz(wz, M = M, trace = trace, wzepsilon = control$wzepsilon) :
  4967978 diagonal elements of the working weights variable 'wz' have been replaced by 1.819e-12
Execution halted

Is this a problem with the software versions installed, or the dataset?

> library(tidyverse)
── Attaching core tidyverse packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.2     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.0
✔ ggplot2   3.4.2     ✔ tibble    3.2.1
✔ lubridate 1.9.2     ✔ tidyr     1.3.0
✔ purrr     1.0.1     
── Conflicts ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
Warning messages:
1: package ‘tidyverse’ was built under R version 4.2.3 
2: package ‘ggplot2’ was built under R version 4.2.3 
3: package ‘tibble’ was built under R version 4.2.3 
4: package ‘tidyr’ was built under R version 4.2.3 
5: package ‘readr’ was built under R version 4.2.3 
6: package ‘purrr’ was built under R version 4.2.3 
7: package ‘dplyr’ was built under R version 4.2.3 
8: package ‘stringr’ was built under R version 4.2.3 
9: package ‘forcats’ was built under R version 4.2.3 
10: package ‘lubridate’ was built under R version 4.2.3 
> sessionInfo()
R version 4.2.2 (2022-10-31)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: envs/skipper-bb63a25/lib/libopenblasp-r0.3.23.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] lubridate_1.9.2 forcats_1.0.0   stringr_1.5.0   dplyr_1.1.2    
 [5] purrr_1.0.1     readr_2.1.4     tidyr_1.3.0     tibble_3.2.1   
 [9] ggplot2_3.4.2   tidyverse_2.0.0

loaded via a namespace (and not attached):
 [1] magrittr_2.0.3   hms_1.1.3        tidyselect_1.2.0 munsell_0.5.0   
 [5] timechange_0.2.0 colorspace_2.1-0 R6_2.5.1         rlang_1.1.1     
 [9] fansi_1.0.4      tools_4.2.2      grid_4.2.2       gtable_0.3.3    
[13] utf8_1.2.3       cli_3.6.1        withr_2.5.0      lifecycle_1.0.3 
[17] tzdb_0.4.0       vctrs_0.6.3      glue_1.6.2       stringi_1.7.12  
[21] compiler_4.2.2   pillar_1.9.0     generics_0.1.3   scales_1.2.1    
[25] pkgconfig_2.0.3 
> 
byee4 commented 1 year ago

In case this is relevant, I'm using a modified Snakefile that tries to process ENCODE3 data starting from BAM files: Skipper_snakefile_bam.txt

augustboyle commented 1 year ago

Something is wrong with the data or settings. The matrix is not full rank, which means there are exact duplicates of counts (e.g., all zeros, or the same FASTQ used twice).
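One way to look for the two conditions named in that reply (all-zero counts, or two samples with exactly the same counts) is to scan the counts table before fitting. The following is a minimal sketch, not part of Skipper: it assumes the table is a gzipped TSV whose annotation columns are named as in the log above (chr, start, end, name, score, strand, gc) and that every other column holds per-sample counts. The function names are hypothetical.

```python
import csv
import gzip
from itertools import combinations

# Assumption: these are the annotation columns; every other column is a count column.
ANNOTATION_COLUMNS = {"chr", "start", "end", "name", "score", "strand", "gc"}


def find_degenerate_columns(handle, annotation_columns=ANNOTATION_COLUMNS):
    """Stream a tab-separated table; return (all-zero columns, identical column pairs)."""
    reader = csv.DictReader(handle, delimiter="\t")
    count_cols = [c for c in (reader.fieldnames or []) if c not in annotation_columns]
    all_zero = set(count_cols)                       # candidates, pruned row by row
    identical = set(combinations(count_cols, 2))     # candidate pairs, pruned row by row
    for row in reader:
        values = {c: float(row[c]) for c in count_cols}
        all_zero = {c for c in all_zero if values[c] == 0}
        identical = {(a, b) for a, b in identical if values[a] == values[b]}
    return sorted(all_zero), sorted(identical)


def check_table(path):
    """Convenience wrapper for a gzipped table on disk (hypothetical helper)."""
    with gzip.open(path, "rt", newline="") as handle:
        return find_degenerate_columns(handle)
```

Calling `check_table` on the `.tsv.gz` file passed to `fit_clip_betabinom.R` would return two lists. A non-empty second list means two samples have identical counts in every row, which is what happens when the same file is supplied twice.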

byee4 commented 1 year ago

You're right. I had an error in my script: one BAM file was defined in two rows. Thanks!
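A mistake like this can be caught before running the pipeline by checking the sample sheet for repeated file paths. The sketch below is only an illustration: the manifest layout is not shown in this thread, so the comma-separated format and the column name `ip_bam` are assumptions, and `find_repeated_paths` is a hypothetical helper.

```python
import csv
from collections import defaultdict


def find_repeated_paths(handle, path_columns):
    """Return {path: [line numbers]} for paths that appear more than once in a CSV manifest."""
    seen = defaultdict(list)
    reader = csv.DictReader(handle)
    # Line 1 is the header, so the first data row is line 2.
    for line_no, row in enumerate(reader, start=2):
        for col in path_columns:
            path = (row.get(col) or "").strip()
            if path:
                seen[path].append(line_no)
    return {path: lines for path, lines in seen.items() if len(lines) > 1}


def check_manifest(path, path_columns=("ip_bam",)):
    """Convenience wrapper for a manifest on disk; the column name is an assumption."""
    with open(path, newline="") as handle:
        return find_repeated_paths(handle, path_columns)
```

Only columns where each file should appear once belong in `path_columns`. A control sample that is intentionally shared between replicates would otherwise be reported as a duplicate.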