dnaPipeTE.sif - GitHub issues

lablancoberdugo commented 2 years ago

Hello Clément, I was wondering: if I run singularity pull docker://clemgoub/dnapipete, I will get a .sif file. Can I just use that container to run dnaPipeTE, or do I need to run it as an image with a .sh file, as you explain on Docker Hub?

clemgoub commented 2 years ago

Hello Laura,

Yes, the pull will create a .sif (I rename it .img, but this is the same thing). And yes, you can use it directly, especially in interactive mode on a server. I use an intermediate script for batch (or long) analyses because the first step needs to be a cd to /opt/dnaPipeTE/ in the container, followed by the actual dnaPipeTE command. There is possibly a more elegant way to do it (perhaps in one single singularity command), but it worked for me so I kept this approach ;).

Cheers,

Clément

lablancoberdugo commented 2 years ago

thank you for the quick reply. I guess I have a hard time understanding some of the instructions, for example:

cd /opt/dnaPipeTE
python3 dnaPipeTE.py

Once I obtain dnapipete_latest.sif, I am not 100% sure where I would find dnaPipeTE.py.

clemgoub commented 2 years ago

No problem!

So your dnapipete_latest.sif is the singularity image that contains dnaPipeTE.py and all the dependencies. To access it, you have two ways: interactively with singularity shell ... <image.img/sif>, or using a script that includes all the commands to execute in the image, with singularity exec ... <image.img/sif> <script>. Basically, singularity will start a container (not a full virtual machine) based on the .sif file. Once it has started, you have access to a shell operating within the image/container.

In both cases, the main Python script of dnaPipeTE is located in /opt/dnaPipeTE (this path is only seen within the image); this is why you first need to move to this directory once the image has started.

The second part consists of running dnaPipeTE. Once you have moved to /opt/dnaPipeTE, you can use any standard command for the program, launched with python3 ./dnaPipeTE.py ....

However, at this step, you need to access files (input and output) that are located outside the image/container, on your local machine. To do so, we need to mount a local directory within the image. Let's say you have a folder called ~/Project/ on your local machine; the goal is to create a link in the image/container that mirrors this folder. We do so with --bind ~/Project:/mnt: the local folder ~/Project will be seen as /mnt in the image, with all the files and directories present on your local machine. It works the same way for written files: all the outputs produced in the image/container need to be stored in /mnt to be visible later in ~/Project on your local computer.
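To make the mirroring concrete, here is a small sketch of how a path on the local machine maps to the path seen inside the container when the folder is bound to /mnt. The helper name to_container_path is hypothetical and is not part of dnaPipeTE or singularity; it only illustrates the mapping.

```shell
#!/bin/bash
# Hypothetical helper (not part of dnaPipeTE or singularity):
# given a file on the local machine and the folder bound to /mnt,
# print the path under which the container will see that file.
to_container_path() {
  local host_path="$1"
  local project_dir="${2%/}"   # drop one trailing slash, if any

  case "$host_path" in
    "$project_dir"/*)
      # Replace the local folder prefix with the mount point /mnt
      printf '/mnt/%s\n' "${host_path#"$project_dir"/}"
      ;;
    *)
      # Files outside the bound folder are not visible under /mnt
      echo "error: $host_path is outside $project_dir" >&2
      return 1
      ;;
  esac
}
```

For instance, to_container_path "$HOME/Project/reads_input.fastq" "$HOME/Project" prints the /mnt path to give to -input, while a file outside the bound folder is rejected.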

In interactive mode, you can launch the image/container with
```
singularity shell --bind ~/Project:/mnt dnapipete_latest.sif
```
You will see a new shell appear (within the image). In this new shell, you can run ls /mnt and you should see the content of ~/Project from your local machine. The first step is to cd to the folder with the program executables:
```
cd /opt/dnaPipeTE
```
and then you can type your command, for example
```
python3 dnaPipeTE.py -input /mnt/reads_input.fastq -output /mnt/output -RM_lib ../RepeatMasker/Libraries/RepeatMasker.lib -genome_size 170000000 -genome_coverage 0.1 -sample_number 2 -RM_t 0.2 -cpu 2
```
In this case, we assume that your input data (and future outputs) will be in ~/Project (but this is seen as /mnt in the image).
In case you want to run multiple experiments in parallel, or without having to keep the shell open during the analysis, I recommend using a script that includes the commands to execute within the image (1/ cd to /opt/dnaPipeTE and 2/ python3 dnaPipeTE.py ...)
```
singularity exec --bind ~/Project:/mnt ~/dnaPipeTE/dnapipete_latest.sif /mnt/dnaPipeTE_cmd.sh
```
You thus need to create the file dnaPipeTE_cmd.sh in ~/Project so that it will be found in /mnt in the image. The dnaPipeTE_cmd.sh file can look like this:
```
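#!/bin/bash
# Runs inside the container; make this file executable first (chmod +x ~/Project/dnaPipeTE_cmd.sh).
# dnaPipeTE has to be launched from its own directory in the image.
cd /opt/dnaPipeTE || exit 1
python3 dnaPipeTE.py -input /mnt/reads_input.fastq -output /mnt/output -RM_lib ../RepeatMasker/Libraries/RepeatMasker.lib -genome_size 170000000 -genome_coverage 0.1 -sample_number 2 -RM_t 0.2 -cpu 2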
```
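
The two steps in dnaPipeTE_cmd.sh can probably also be folded into a single singularity command by handing them to bash -c, which avoids creating the script file. This is only a sketch and was not tested in this thread: run_dnapipete and the SINGULARITY override variable are hypothetical names, and it assumes bash is available in the image (the script above already relies on it).

```shell
#!/bin/bash
# Hypothetical wrapper: run dnaPipeTE in one singularity command.
#   $1 = path to the .sif/.img image
#   $2 = local folder to bind to /mnt in the container
#   remaining arguments are passed unchanged to dnaPipeTE.py
run_dnapipete() {
  local image="$1"
  local project_dir="$2"
  shift 2

  # SINGULARITY is an assumed override for the executable name,
  # handy for a dry run (e.g. SINGULARITY=echo prints the command).
  "${SINGULARITY:-singularity}" exec --bind "$project_dir:/mnt" "$image" \
    bash -c 'cd /opt/dnaPipeTE && python3 dnaPipeTE.py "$@"' dnaPipeTE "$@"
}
```

It would be called as run_dnapipete ~/dnaPipeTE/dnapipete_latest.sif ~/Project -input /mnt/reads_input.fastq -output /mnt/output, followed by the remaining options from the command above. The word dnaPipeTE after the quoted string only fills $0 for bash -c, so that "$@" inside it receives the actual options.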

I hope this clarifies things on your side, but don't hesitate to ask more questions!

Cheers,

Clément

lablancoberdugo commented 2 years ago

Thank you very much Clément. For some reason, my script doesn't seem to be working.

Singularity dnapipete.img:/projects/dumont-lab/Laura/eQTL/dnaPipeTE> python3 dnaPipeTE.py
python3: can't open file 'dnaPipeTE.py': [Errno 2] No such file or directory

I will keep trying to see if there is something on my end. Thanks for helping!

clemgoub commented 2 years ago

Hello Laura,

Can you send me or share here the scripts you used? Thank you!

Clém

clemgoub commented 1 year ago

Please re-open if you need further assistance with this issue! Best, Clément

clemgoub / dnaPipeTE

dnaPipeTE.sif #67