federicacitarrella / FusionFlow

FusionFlow is a bioinformatic pipeline that enables the detection of gene fusions from RNA and DNA data.

Become more confident with NextFlow, Docker, Github #6

Closed by Bontempogianpaolo1 3 years ago

Bontempogianpaolo1 commented 3 years ago

A good practice is to set up a local Ubuntu virtual environment and test the pipelines on small datasets.
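
One way to get a small dataset is to keep only the first few reads of a larger file. The sketch below assumes the input is an uncompressed FASTQ file with exactly four lines per read; the function name `make_small_fastq` and the file names are hypothetical and not part of this repository.

```shell
#!/usr/bin/env bash
# Sketch: build a small test dataset from a larger FASTQ file.
# Assumption: uncompressed FASTQ with exactly 4 lines per read.
# make_small_fastq and all file names here are hypothetical.
make_small_fastq() {
    local input="$1" output="$2" reads="${3:-1000}"
    # Keep the first N reads, i.e. the first 4*N lines
    head -n "$((reads * 4))" "$input" > "$output"
}

# Example: make_small_fastq sample_R1.fastq small_R1.fastq 500
```

For paired-end data, using the same read count on both files should keep the mates in sync, provided the two files list the reads in the same order.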

Bontempogianpaolo1 commented 3 years ago

To Dos:

federicacitarrella commented 3 years ago

I built a Nextflow pipeline (fiveProcesses.nf) composed of five simple processes written in Python and Perl (I also tried Java, but ran into some problems). It takes a simple text file (prova.txt) as input.

I pushed the project to GitHub (federicacitarrella/pipelineGeneFusions) using the following commands:

git init
git remote add origin https://github.com/federicacitarrella/pipelineGeneFusions.git
git config --global user.email "federica.citarrella14@gmail.com"
git add <filename>            # start tracking or stage a file
git rm --cached <filename>    # or remove it from the index, keeping the local copy
git commit -m '...'
git push [--set-upstream origin master]

Then I integrated the project with Docker. I created two simple Dockerfiles:

Dockerfile (1):

FROM ubuntu

# Update the package index and install Python 3 in a single layer
RUN apt-get -y update && apt-get -y install python3

Dockerfile (2):

FROM ubuntu

# Update the package index and install Perl in a single layer
RUN apt-get -y update && apt-get -y install perl

I built two Docker images from these Dockerfiles: sudo docker build -t '<image_name>' <path_to_the_directory_of_dockerfile>

I created a public Docker Hub repository to share the images: federicacitarrella/dockertest

Then I pushed the images using the following instructions:

sudo docker login -u <username>
sudo docker tag <image_name> federicacitarrella/dockertest:<image_name>
sudo docker push federicacitarrella/dockertest:<image_name>

To pull the images use the following command:

sudo docker pull federicacitarrella/dockertest:<image_name>

Then I modified fiveProcesses.nf to specify the image for each process (the multiple-container approach), using the following format:

process name {
  container 'image_name'

  '''
  do this
  '''
}
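
The same per-process mapping could also be kept in a nextflow.config file next to the script instead of inside each process. The sketch below is only an illustration: the process names `pythonStep` and `perlStep` are hypothetical, and which of the two image tags holds Python and which holds Perl is an assumption.

```shell
# Sketch: write a nextflow.config that assigns one image per process.
# pythonStep / perlStep are hypothetical process names; the image-to-process
# mapping is an assumption.
cat > nextflow.config <<'EOF'
// Run every process in Docker, so -with-docker is not needed on the command line
docker.enabled = true

process {
    withName: 'pythonStep' { container = 'federicacitarrella/dockertest:image1' }
    withName: 'perlStep'   { container = 'federicacitarrella/dockertest:image2' }
}
EOF

# The pipeline would then be launched as usual:
#   ./nextflow run fiveProcesses.nf
```

Keeping the images in the config file leaves the pipeline script unchanged when an image tag changes.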

Finally, to run the pipeline with Docker, I ran the following command: sudo ./nextflow run fiveProcesses.nf -with-docker federicacitarrella/dockertest:image1 federicacitarrella/dockertest:image2

Bontempogianpaolo1 commented 3 years ago

Very well done! I have just a few comments:

federicacitarrella commented 3 years ago

Sorry, I did some research, but I couldn't work out what I should put in these two bash scripts in this case.