HajkD / LTRpred

De novo annotation of young retrotransposons
https://hajkd.github.io/LTRpred/
GNU General Public License v2.0

Installation of dependencies fail - Docker wrapper needed #14

Closed by mdozmorov 4 years ago

mdozmorov commented 4 years ago

The LTRpred package requires six command-line tools. Each of them has its own dependencies and/or requires sudo privileges. Testing with Docker containers using different OS versions failed: installation of the tools breaks with various errors, and different tools break under different conditions. After hours of attempts, it seems impossible to install all the dependencies to test the package.

A Docker file containing all the dependencies and the LTRpred package is needed.

HajkD commented 4 years ago

Dear Mikhail,

I am very sorry that you had difficulty installing all the command-line tools.

I very much like your suggestion to install all command line tools and their various dependencies within a Docker container (including LTRpred itself).

Do you think a conda environment would also be useful?

I will start working on this!

mdozmorov commented 4 years ago

Not sure it'll be possible to create a single conda recipe for so many tools. Docker seems best as it provides the relevant OS environment. I've tried https://hub.docker.com/r/rocker/rstudio to have RStudio, but again ran into multiple issues with dependencies. A single Docker file that installs the dependencies, R, and LTRpred with all its dependencies seems the only option.

HajkD commented 4 years ago

Thank you so much for this insight. In this case I will focus on the Docker file that installs everything and will come back to you shortly.

Once more, please accept my apologies for the inconvenience caused, but I hope this Docker file will make life easier for many future users.

Many thanks!

HajkD commented 4 years ago

Dear Mikhail,

Please find enclosed a Dockerfile for LTRpred that will run the example LTRpred::LTRpred(...) function.

https://github.com/HajkD/LTRpred/blob/master/Dockerfile

This Dockerfile can be used to create an ltrpred container as follows:

# assuming the Dockerfile is in the current working directory
docker build -t ltrpred .
# run LTRpred::LTRpred(...) example via docker 
docker run ltrpred

In the coming days, I will also add an RStudio interface via rocker and design a way for users to interact with the LTRpred Docker container (passing either scripts or arguments to the LTRpred run).

You were absolutely correct: during the past months, new dependency conflicts emerged between different tool versions. I have now resolved them in the Dockerfile and will also update them in the vignette.

I hope this Dockerfile allows you to run LTRpred seamlessly and once more thank you so much for motivating this approach.

Best wishes, Hajk

mdozmorov commented 4 years ago

Thanks, Hajk, this is a great start! The Docker image builds, though it takes some time. Looking forward to the full version, when one can play with LTRpred via the RStudio interface.

mdozmorov commented 4 years ago

Following the instructions:

# assuming the Dockerfile is in the current working directory
docker build -t ltrpred .
# running ltrpred container
docker run --rm -ti ltrpred
# open R
:/app# R

Now in the R prompt:

# all tools are installed and the example LTRpred::LTRpred(...) run etc can be used
LTRpred::LTRpred(genome.file = system.file("Hsapiens_ChrY.fa", package = "LTRpred"))

I'm getting

LTRpred::LTRpred(genome.file = system.file("Hsapiens_ChrY.fa", package = "LTRpred"))
Error in loadNamespace(name) : there is no package called ‘LTRpred’

I see it is being installed in the Docker file, but something is not working. Installing within the container results in

ERROR: dependency ‘ggbio’ is not available for package ‘LTRpred’
* removing ‘/Users/mdozmorov/Library/R/3.6/library/LTRpred’
Error: Failed to install 'LTRpred' from GitHub:
  (converted from warning) installation of package ‘/var/folders/jq/hvjs8pl55sx0mtxlptkms_cc6y3f75/T//RtmpWeE9C6/fileed90376dfe7b/LTRpred_1.1.0.tar.gz’ had non-zero exit status

Some more polishing is needed.
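
One way to check whether the package actually made it into the image, without opening an interactive session, is sketched below. It assumes that the image's default command can be overridden and that `Rscript` is on the `PATH` inside the container. The function name `r_pkg_installed` and the `DOCKER` variable are hypothetical and introduced here only for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of LTRpred.
# DOCKER can be overridden (e.g. DOCKER=echo) to preview the command
# instead of running it.
DOCKER="${DOCKER:-docker}"

# r_pkg_installed IMAGE PACKAGE
# Asks R inside IMAGE whether PACKAGE can be loaded.
# R exits with status 0 if the package is available, 1 otherwise.
r_pkg_installed() {
    image="$1"
    pkg="$2"
    "$DOCKER" run --rm "$image" \
        Rscript -e "quit(status = if (requireNamespace('$pkg', quietly = TRUE)) 0 else 1)"
}
```

For example, `r_pkg_installed ltrpred LTRpred && echo "LTRpred is installed"` would print the message only if the package loads inside the freshly built image.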

HajkD commented 4 years ago

Dear Mikhail,

Many thanks for letting me know. I am still troubleshooting. It seems like these packages are not yet compatible with the most recent R version. I have to find a way around this.

The reason it worked on my side was that parts of the previous build were stored in my Docker cache, so my test example still ran.

Now when running:

docker build --no-cache -t ltrpred .

I could reproduce the issue and found out that there are issues with the following packages:

Skipping 6 packages not available: GenomicRanges, GenomeInfoDb, Biostrings, IRanges, ggbio, biomaRt
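
All six skipped packages are distributed through Bioconductor rather than CRAN, and Bioconductor releases are tied to specific R versions. One possible way to handle them, sketched here under the assumption that `BiocManager` is available in the image, is to pull the skipped names out of a saved build log and turn them into a `BiocManager::install()` call. The helper names `skipped_pkgs` and `bioc_install_cmd` are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for reading a saved `docker build` log.

# skipped_pkgs: read a build log on stdin and print, one per line, the
# package names from any "Skipping N packages not available: ..." line.
skipped_pkgs() {
    sed -n 's/^.*Skipping [0-9]* packages\{0,1\} not available: //p' |
        tr ',' '\n' |
        sed 's/^ *//; s/ *$//' |
        sed '/^$/d'
}

# bioc_install_cmd: read package names on stdin (one per line) and print
# an R expression that installs them with BiocManager::install().
bioc_install_cmd() {
    quoted=$(sed "s/.*/'&'/" | paste -s -d, -)
    printf 'BiocManager::install(c(%s))\n' "$quoted"
}
```

A possible workflow would be to save the log with `docker build --no-cache -t ltrpred . 2>&1 | tee build.log` and then run `skipped_pkgs < build.log | bioc_install_cmd`; the printed expression could be passed to `Rscript -e` in a Dockerfile `RUN` step. Whether this resolves the conflict depends on the R and Bioconductor versions in the base image.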

I am very sorry for the delay, but I hope to be able to create a stable Dockerfile version so that I can then push it to docker hub.

Many thanks and best wishes, Hajk

mdozmorov commented 4 years ago

I guess it could be an R/Bioconductor version conflict. I'll be waiting.

HajkD commented 4 years ago

Dear Mikhail,

I finally managed to develop two types of ltrpred containers:

1) Running the LTRpred pipeline through the R prompt command line: https://hub.docker.com/repository/docker/drostlab/ltrpred/
2) Running the LTRpred pipeline through RStudio Server: https://hub.docker.com/repository/docker/drostlab/ltrpred_rstudio

I have now documented in detail how to install and run both types of Docker images here: https://hajkd.github.io/LTRpred/articles/Introduction.html#ltrpred-docker-container

A minimal example to get you started right away is the following:

# retrieve docker image from dockerhub
docker pull drostlab/ltrpred
# run ltrpred container
docker run --rm -ti drostlab/ltrpred
# start R prompt within ltrpred container
~:/app# R

Within the ltrpred container R prompt run the ltrpred example:

LTRpred::LTRpred(genome.file = system.file("Hsapiens_ChrY.fa", package = "LTRpred"))

The RStudio variant can be found here: https://hajkd.github.io/LTRpred/articles/Introduction.html#download-ltrpred_rstudio-container-for-use-with-rstudio-server

Please accept my apologies for taking a while to implement and deploy these containers, but I wanted to do it properly. I tried many different ways for users to interact with the containers (for example, how to use local genome files or local versions of the Dfam database) and settled on the most parsimonious one, which is documented here:
https://hajkd.github.io/LTRpred/articles/Introduction.html#ltrpred-docker-container
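
As a rough illustration of the general idea (the linked vignette remains the authoritative description), a local directory can be made visible inside a container with a Docker bind mount. The container-side path `/app/data`, the helper name `run_with_genomes`, and the `DOCKER` variable are assumptions made for this sketch and are not specified anywhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper, not part of LTRpred.
# DOCKER can be overridden (e.g. DOCKER=echo) to preview the command.
DOCKER="${DOCKER:-docker}"

# run_with_genomes HOST_DIR [IMAGE]
# Starts an interactive container with HOST_DIR bind-mounted at /app/data
# (an assumed mount point). IMAGE defaults to drostlab/ltrpred.
run_with_genomes() {
    host_dir="$1"
    image="${2:-drostlab/ltrpred}"
    # `docker run -v` expects an absolute host path for a bind mount;
    # a relative name would be interpreted as a named volume.
    case "$host_dir" in
        /*) ;;                              # already absolute
        *) host_dir="$PWD/$host_dir" ;;     # resolve against current dir
    esac
    "$DOCKER" run --rm -ti -v "$host_dir:/app/data" "$image"
}
```

Under these assumptions, after `run_with_genomes ./genomes` a file such as the hypothetical `my_genome.fa` in that directory would be reachable from the container's R prompt as `genome.file = "/app/data/my_genome.fa"`.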

I hope you find these new versions useful and I very much look forward to hearing your feedback.

Thank you so much for all your excellent help with this. I am sure that users will very much appreciate this new style of working with LTRpred.

With very best wishes, Hajk