roblanf / minion_qc

Quality control for MinION sequencing data
MIT License

Length vs. Q and channels epb plots empty #8

Open ghost opened 7 years ago

ghost commented 7 years ago

I ran the script on a recent dataset and it mostly worked fine, except that the two plots length_vs_q.png and flowcell_channels_epb.png don't have anything plotted on them.

[image: length_vs_q] [image: flowcell_channels_epb]

What could be the issue?

Here's also the Albacore summary file if you want to give it a try: https://owncloud.tuebingen.mpg.de/index.php/s/VvbpGIYuCqHMfv8

Thx!

roblanf commented 7 years ago

Sorry for the radio silence - I've been away. Taking a look now.

roblanf commented 7 years ago

Not sure what the problem was, but I downloaded your file and it seems to work with the latest update. Note that you need a couple more dependencies - details at the top of the readme.

ghost commented 7 years ago

I reran it, but these two plots still are empty for me. In addition, the new corresponding plots *_per_hour.png also don't look how they should (I suppose).

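[image: length_by_hour] https://user-images.githubusercontent.com/8330725/28713443-60915f5e-738f-11e7-9f9d-753680818c32.png [image: q_by_hour] https://user-images.githubusercontent.com/8330725/28713444-609521f2-738f-11e7-9128-b67c3cb7f01d.png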

When I was running the script now, this also popped up:

Loading required package: viridisLite
Attaching package: ‘data.table’
The following objects are masked from ‘package:reshape2’:
    dcast, melt

Which version of each package are you running, and which R version? I have: R 3.3.1, ggplot2 2.2.1, viridis 0.4.0, reshape2 1.4.1, plyr 1.8.4, ggjoy 0.2.0, purrr 0.2.2.2.

roblanf commented 7 years ago

From the issues you're getting it really looks like a difference in the R versions. I built and tested this (on Linux and OSX) with R 3.4.0.

My best suggestion is to update R and reinstall the relevant packages, then try again.


-- Rob Lanfear Division of Ecology and Evolution, Research School of Biology, The Australian National University, Canberra

www.robertlanfear.com

ghost commented 7 years ago

I've just done a clean local Linux installation of R 3.4.1 from source and all needed packages, but the aforementioned plots still look the same or, in this case, empty.

Interestingly, I also have R 3.3.1 installed (from a .deb package) on this machine, and when I used it the script worked fine. I just tried it on a system-wide installation of R 3.3.1 on a cluster and the problematic plots are empty again.

Either the script is using a recent version of a package, which I just happen to have on my local installation, or the installation of R from source and the base package is missing something which the other R distributions have...

roblanf commented 7 years ago

hmmm. Sorry it's a pain, and that I don't have any good solutions. If you come across anything that might help out, let me know.


wdecoster commented 7 years ago

I'm not sure how to do this in R, but perhaps your script can log the packages used, their paths and versions? Could help to track which package causes the difference.
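A minimal sketch of what such logging could look like in base R. The helper name `log_packages` is hypothetical and not part of the minion_qc script; it only relies on `loadedNamespaces()`, `packageVersion()`, `find.package()` and `.libPaths()`, and would need to be called after the script has loaded its libraries.

```r
# Hypothetical helper (not part of minion_qc): report the name, version and
# install path of every package namespace loaded in the current session.
log_packages <- function() {
  pkgs <- sort(loadedNamespaces())
  data.frame(
    package = pkgs,
    version = vapply(pkgs, function(p) as.character(packageVersion(p)),
                     character(1)),
    path    = vapply(pkgs, function(p) find.package(p), character(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Print the R version, the library search paths and the package table.
print(R.version.string)
print(.libPaths())
print(log_packages())
```

The install path is worth logging because two installations of R on the same machine can pick up different copies of the same package from different library directories.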

ghost commented 7 years ago

@roblanf Can you list the versions of your packages I mentioned above?

@wdecoster I added sessionInfo() to the script (which some argue might not be the most accurate way to do this, but whatever).

"Good"

R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.2 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=de_DE.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=de_DE.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=de_DE.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=de_DE.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  base     

other attached packages:
[1] **data.table_1.10.4** scales_0.4.1      **ggjoy_0.3.0**       yaml_2.1.14      
[5] **reshape2_1.4**      plyr_1.8.4        viridis_0.4.0     viridisLite_0.2.0
[9] ggplot2_2.2.1    

loaded via a namespace (and not attached):
 [1] **Rcpp_0.12.10**     **digest_0.6.9**     grid_3.3.1       **gtable_0.1.2**    
 [5] magrittr_1.5     **stringi_1.0-1**    lazyeval_0.2.0   labeling_0.3    
 [9] tools_3.3.1      stringr_1.0.0    purrr_0.2.2.2    **munsell_0.4.2**   
[13] **colorspace_1.2-4** methods_3.3.1    gridExtra_2.2.1  **tibble_1.3.0**    

"Bad"

R version 3.3.1 (2016-06-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.2 LTS

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  base     

other attached packages:
[1] **data.table_1.9.6**  scales_0.4.1      **ggjoy_0.2.0**       yaml_2.1.14      
[5] **reshape2_1.4.1**    plyr_1.8.4        viridis_0.4.0     viridisLite_0.2.0
[9] ggplot2_2.2.1    

loaded via a namespace (and not attached):
 [1] **Rcpp_0.12.6**      **digest_0.6.10**    **assertthat_0.1**   **chron_2.3-47**    
 [5] grid_3.3.1       **gtable_0.2.0**     magrittr_1.5     **stringi_1.1.1**   
 [9] lazyeval_0.2.0   labeling_0.3     tools_3.3.1      stringr_1.0.0   
[13] purrr_0.2.2.2    **munsell_0.4.3**    **colorspace_1.2-6** methods_3.3.1   
[17] gridExtra_2.2.1  **tibble_1.1** 

Quite a few packages differ in version, and the "bad" R even loads a couple of extra ones. Some of the "bad" R packages are actually more recent than those in the "good" R, so I guess it's a matter of pinning down the right one.
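One way to make the comparison less error-prone is to diff the two listings programmatically. The sketch below is only an illustration: the tokens are hand-copied from a subset of the differing entries above, `parse_pkgs` and `compare_pkgs` are hypothetical helper names, and the version comparison uses base R's `compareVersion()`.

```r
# Hypothetical helper: split "name_version" tokens as printed by sessionInfo()
# into a named character vector (names = packages, values = versions).
parse_pkgs <- function(tokens) {
  tokens <- gsub("\\*", "", tokens)        # drop any ** highlighting
  setNames(sub("^.*_", "", tokens),        # text after the last underscore
           sub("_[^_]*$", "", tokens))     # text before the last underscore
}

# Subset of the entries that differ between the two sessions above.
good <- parse_pkgs(c("data.table_1.10.4", "ggjoy_0.3.0", "reshape2_1.4",
                     "Rcpp_0.12.10", "gtable_0.1.2", "tibble_1.3.0"))
bad  <- parse_pkgs(c("data.table_1.9.6", "ggjoy_0.2.0", "reshape2_1.4.1",
                     "Rcpp_0.12.6", "gtable_0.2.0", "tibble_1.1",
                     "assertthat_0.1", "chron_2.3-47"))

# Hypothetical helper: classify every package as older, newer, same or
# missing in the "bad" session relative to the "good" one.
compare_pkgs <- function(good, bad) {
  pkgs <- sort(union(names(good), names(bad)))
  status <- vapply(pkgs, function(p) {
    if (!p %in% names(good)) return("only in bad")
    if (!p %in% names(bad))  return("only in good")
    switch(as.character(compareVersion(bad[[p]], good[[p]])),
           "-1" = "bad is older", "0" = "same", "1" = "bad is newer")
  }, character(1))
  data.frame(package = pkgs,
             good = unname(good[pkgs]),
             bad = unname(bad[pkgs]),
             status = unname(status),
             stringsAsFactors = FALSE)
}

print(compare_pkgs(good, bad))
```

Packages flagged as older in the "bad" session would be the first candidates to update, and the ones flagged "only in bad" are the extra namespaces worth a closer look.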

roblanf commented 7 years ago

Thanks all for the tips and ideas. Here's my session info from my desktop mac (works fine):

R version 3.3.2 (2016-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X El Capitan 10.11.6

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  base     

other attached packages:
[1] data.table_1.10.4 scales_0.4.1      ggjoy_0.2.0       yaml_2.1.14      
[5] reshape2_1.4.2    plyr_1.8.4        viridis_0.4.0     viridisLite_0.2.0
[9] ggplot2_2.2.1    

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12     grid_3.3.2       gtable_0.2.0     magrittr_1.5    
 [5] rlang_0.1.1      stringi_1.1.5    lazyeval_0.2.0   tools_3.3.2     
 [9] stringr_1.2.0    munsell_0.4.3    colorspace_1.3-2 methods_3.3.2   
[13] gridExtra_2.2.1  tibble_1.3.3    

roblanf commented 7 years ago

My bet so far - data.table. @DNAsaurus, you have 1.9.6 in your 'bad' version, whereas your 'good' version and my working version both have 1.10.4.

roblanf commented 7 years ago

And on my server, the plots work fine. Note though that I just reinstalled the latest R and all the latest packages.

At least to me (obviously not done until @DNAsaurus can confirm empirically on their own system), this starts to suggest that getting all the latest packages might address this issue. Of course, it would be nice to know which package is the issue.

@DNAsaurus, maybe you could update your packages one-by-one on the 'bad' system, and see which update fixes the issue (if indeed this approach fixes it at all).

R version 3.4.1 (2017-06-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8    
 [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8   
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  base     

other attached packages:
[1] data.table_1.10.4 scales_0.4.1      ggjoy_0.3.0       yaml_2.1.14      
[5] reshape2_1.4.2    plyr_1.8.4        viridis_0.4.0     viridisLite_0.2.0
[9] ggplot2_2.2.1    

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.12     grid_3.4.1       gtable_0.2.0     magrittr_1.5    
 [5] rlang_0.1.1      stringi_1.1.5    lazyeval_0.2.0   tools_3.4.1     
 [9] stringr_1.2.0    purrr_0.2.2.2    munsell_0.4.3    compiler_3.4.1  
[13] colorspace_1.3-2 methods_3.4.1    gridExtra_2.2.1  tibble_1.3.3    

ghost commented 7 years ago

I updated the packages one by one, each to at least the version that works in the "good" R (where the plots are made correctly), but it makes no difference. The plots are still wrong with the "bad" R...

The only other reason I can think of is that it might have something to do with the extra packages that get loaded in the "bad" R for whatever reason: assertthat_0.1 and chron_2.3-47. I don't know exactly what these do or how they could be interfering.

If it's not this, then there's something missing in the base packages, because it doesn't make sense otherwise: every package in the "bad" R is now at the same version as yours or higher. This leads me to believe that it's not a package issue.

roblanf commented 6 years ago

Hey @DNAsaurus, did you ever figure out more about what was going on here? If you have time, I'd be interested to know how you go with the latest version.