mtandon09 / CCBR_GATK4_Exome_Seq_Pipeline

An easy-to-use, flexible variant calling pipeline for use on the Biowulf cluster at NIH
https://mtandon09.github.io/CCBR_GATK4_Exome_Seq_Pipeline/
MIT License

Update Dockerfile #27

Closed by dnousome 2 years ago

dnousome commented 2 years ago

Freec script requires GenomicRanges R package to run

skchronicles commented 2 years ago

Hey @dnousome,

I think rtracklayer installs GenomicRanges as a dependency. I would need to double check this tomorrow, but are you getting an error about the package being missing?

skchronicles commented 2 years ago

Hey @dnousome,

I just had some time to check if GenomicRanges is installed in the image.

It is already installed:

$ docker run -ti nciccbr/ccbr_wes_base:v0.1.0 /bin/bash
root@35c7ab1b88ce:/data2# R

R version 4.1.2 (2021-11-01) -- "Bird Hippie"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(GenomicRanges)
Loading required package: stats4
Loading required package: BiocGenerics

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, basename, cbind, colnames, dirname, do.call,
    duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
    lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
    pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
    tapply, union, unique, unsplit, which.max, which.min

Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    I, expand.grid, unname

Loading required package: IRanges
Loading required package: GenomeInfoDb

dnousome commented 2 years ago

I was having issues with Freec loading the package in the assess_significance.R script, which didn't run because of the missing dependency. It may be due to the environment module rather than the Docker image. I'll take a look again.

skchronicles commented 2 years ago

Okay, sounds good @dnousome.

Yeah, right now the entire pipeline uses Docker images for every step. If you want, I can take a look at what you are doing. I just checked, and it also looks like rtracklayer is loading GenomicRanges correctly under the hood:

$ docker run -ti nciccbr/ccbr_wes_base:v0.1.0 /bin/bash
root@ba3b3e90ac70:/data2# R

R version 4.1.2 (2021-11-01) -- "Bird Hippie"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(rtracklayer)
Loading required package: GenomicRanges
Loading required package: stats4
Loading required package: BiocGenerics

Attaching package: 'BiocGenerics'

The following objects are masked from 'package:stats':

    IQR, mad, sd, var, xtabs

The following objects are masked from 'package:base':

    Filter, Find, Map, Position, Reduce, anyDuplicated, append,
    as.data.frame, basename, cbind, colnames, dirname, do.call,
    duplicated, eval, evalq, get, grep, grepl, intersect, is.unsorted,
    lapply, mapply, match, mget, order, paste, pmax, pmax.int, pmin,
    pmin.int, rank, rbind, rownames, sapply, setdiff, sort, table,
    tapply, union, unique, unsplit, which.max, which.min

Loading required package: S4Vectors

Attaching package: 'S4Vectors'

The following objects are masked from 'package:base':

    I, expand.grid, unname

Loading required package: IRanges
Loading required package: GenomeInfoDb
> help(GRanges)
> GRanges 
function (seqnames = NULL, ranges = NULL, strand = NULL, ..., 
    seqinfo = NULL, seqlengths = NULL) 
{
    mcols <- DataFrame(..., check.names = FALSE)
    if (!is.null(ranges)) {
        ranges <- as(ranges, "IRanges")
    }
    else if (is.null(seqnames)) {
        ranges <- IRanges()
    }
    else {
        x <- as(seqnames, "GRanges")
        seqnames <- x@seqnames
        ranges <- x@ranges
        if (is.null(strand)) 
            strand <- x@strand
        if (length(mcols) == 0L) 
            mcols <- mcols(x, use.names = FALSE)
        if (is.null(seqinfo)) 
            seqinfo <- seqinfo(x)
    }
    seqinfo <- normarg_seqinfo2(seqinfo, seqlengths)
    ans <- new_GRanges("GRanges", seqnames = seqnames, ranges = ranges, 
        strand = strand, mcols = mcols, seqinfo = seqinfo)
    validObject(ans)
    ans
}
<bytecode: 0x5558dca86b68>
<environment: namespace:GenomicRanges>
>

dnousome commented 2 years ago

@skchronicles, sorry for the delay. It looks like everything is installed correctly when the Docker image is pulled.

I think an R module must still have been loaded, which interfered with the .libPaths() where all the packages were installed.

Thanks for taking a look! I'll close the issue now
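
One way to check for this kind of conflict before launching R is to look at the environment the job inherits. The sketch below is only an illustration and is not part of the pipeline: `check_r_env` is a hypothetical helper name, and treating any loaded module called `R` or `R/<version>` as suspicious is an assumption about how the cluster names its modules. It relies on `R_LIBS`, `R_LIBS_USER` and `R_LIBS_SITE`, the environment variables R consults when building `.libPaths()`, and on `LOADEDMODULES`, the colon-separated list maintained by Environment Modules and Lmod.

```shell
#!/bin/sh
# Hypothetical helper (not part of the pipeline): report settings in the
# current shell that can alter the library search path R reports via .libPaths().
check_r_env() {
    # Exported variables that R reads when it builds its library search path.
    for var in R_LIBS R_LIBS_USER R_LIBS_SITE; do
        val=$(printenv "$var" || true)
        if [ -n "$val" ]; then
            printf '%s=%s\n' "$var" "$val"
        fi
    done

    # LOADEDMODULES is a colon-separated list of currently loaded modules.
    # Assumption: an R module is named "R" or "R/<version>".
    old_ifs=$IFS
    IFS=:
    for mod in ${LOADEDMODULES:-}; do
        case "$mod" in
            R|R/*) printf 'loaded module: %s\n' "$mod" ;;
        esac
    done
    IFS=$old_ifs
}

check_r_env
```

If the function prints anything, running `module purge` (or unsetting the listed variables) and then comparing the output of `Rscript -e '.libPaths()'` before and after may show whether R was picking up libraries from outside the image.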