clemgoub / dnaPipeTE

dnaPipeTE (for de-novo assembly & annotation Pipeline for Transposable Elements) is a pipeline designed to find, annotate and quantify Transposable Elements in small samples of NGS datasets. It is very useful to quantify the proportion of TEs in newly sequenced genomes since it does not require genome assembly and works on small datasets (< 1X).

dnaPipeTE.sif #67

Closed lablancoberdugo closed 1 year ago

lablancoberdugo commented 2 years ago

Hello Clément, I was wondering: if I run singularity pull docker://clemgoub/dnapipete, I will get a .sif file. Can I just use that container to run dnaPipeTE, or do I need to run it as an image with a .sh file, as you explain on Docker Hub?

clemgoub commented 2 years ago

Hello Laura,

Yes, the pull will create a .sif (I rename it .img, but this is the same thing). And yes, you can use it directly, especially in interactive mode on a server. I use an intermediate script for batch analyses (or long ones) because the first step needs to be a cd to /opt/dnaPipeTE/ in the container, followed by the actual dnaPipeTE command. There is possibly a more elegant way to do it (perhaps in one single singularity command), but it worked for me so I kept this approach ;).

Cheers,

Clément

lablancoberdugo commented 2 years ago

thank you for the quick reply. I guess I have a hard time understanding some of the instructions, for example:

cd /opt/dnaPipeTE
python3 dnaPipeTE.py

Once I obtain the dnapipete_latest.sif I am not 100% sure where I would find the dnaPipeTE.py

clemgoub commented 2 years ago

No problem!

So your dnapipete_latest.sif is the singularity image that contains dnaPipeTE.py and all the dependencies. To access it, you have two ways: interactively with singularity shell ... <image.img/sif>, or using a script that includes all the commands to execute in the image, with singularity exec ... <image.img/sif> <script>. Basically, singularity will start a container based on the .sif file (not a full virtual machine, but an isolated environment with its own filesystem). Once initiated, you have access to a shell operating within the image/container.

In both cases, the main python script of dnaPipeTE is located in /opt/dnaPipeTE (this path is only seen within the image); this is the reason why it is first required to move to this directory once you are starting the image.

The second part consists of running dnaPipeTE. For this, once you have moved to /opt/dnaPipeTE, you can use any standard command for the program, launched with python3 ./dnaPipeTE.py ....
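To make the script approach concrete, here is a minimal sketch of what such an intermediate script could look like. It is an illustration under assumptions, not my exact script: the file name run_dnapipete.sh is hypothetical, the image name is the one from your pull, and it assumes the folder you launch from is visible inside the container (singularity usually mounts your home directory by default). The final line only prints the exec command instead of running it.

```shell
#!/bin/sh
# Sketch only: write the intermediate script, then show how it would be run.
set -e

IMAGE="${IMAGE:-dnapipete_latest.sif}"    # image file from `singularity pull`
SCRIPT="${SCRIPT:-./run_dnapipete.sh}"    # hypothetical script name

# The script holds the two steps described above; "$@" forwards whatever
# dnaPipeTE options are given after the script name on the exec line.
cat > "$SCRIPT" <<'EOF'
#!/bin/sh
cd /opt/dnaPipeTE || exit 1
python3 ./dnaPipeTE.py "$@"
EOF
chmod +x "$SCRIPT"

# Printed rather than executed, so the sketch is safe to try anywhere.
echo "singularity exec $IMAGE $SCRIPT <dnaPipeTE options>"
```

Replace <dnaPipeTE options> with the options you would normally pass to the program and run the printed command yourself.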

However, at this step, you need to access files (input and output) that are located outside the image/container, on your local machine. To do so, we need to mount a local directory within the image. Let's say you have a folder called ~/Project/ on your local machine; the goal is to create a link in the image/container that mirrors this folder. We do so with --bind ~/Project:/mnt, and the local folder ~/Project will be seen as /mnt in the image, with all the files and directories present on your local machine. It works the same for the files written: all the outputs produced in the image/container need to be stored in /mnt to be visible later in ~/Project on your local computer.
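As a rough sketch of the mount described above, the bind and the working directory can be combined in one call. Treat this as an assumption-laden example rather than a tested recipe: it relies on the --pwd option of singularity exec to start in /opt/dnaPipeTE instead of an explicit cd, HOST_DIR and the run helper are names made up for the illustration, and "$@" stands for your own dnaPipeTE options. By default it only prints the command (DRY_RUN=1).

```shell
#!/bin/sh
# Sketch only: bind a host folder to /mnt and run dnaPipeTE in one call.
set -e

IMAGE="${IMAGE:-dnapipete_latest.sif}"   # image file from `singularity pull`
HOST_DIR="${HOST_DIR:-$HOME/Project}"    # host folder with inputs and outputs

# Print the command when DRY_RUN=1 (the default here); run it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# --bind HOST:CONTAINER exposes $HOST_DIR as /mnt inside the container.
# --pwd starts the process in /opt/dnaPipeTE (assumed to replace the `cd`).
# "$@" stands for your dnaPipeTE options; file paths must point under /mnt.
dnapipete_bound() {
  run singularity exec --bind "$HOST_DIR:/mnt" --pwd /opt/dnaPipeTE \
    "$IMAGE" python3 dnaPipeTE.py "$@"
}

dnapipete_bound "$@"
```

Once the printed command looks right, setting DRY_RUN=0 executes it. Input and output paths in the options should be written as /mnt/..., since that is how ~/Project appears inside the container.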

I hope this clarifies things on your side, but don't hesitate to ask more questions!!

Cheers,

Clément

lablancoberdugo commented 2 years ago

Thank you very much Clément. For some reason, my script doesn't seem to be working.

Singularity dnapipete.img:/projects/dumont-lab/Laura/eQTL/dnaPipeTE> python3 dnaPipeTE.py
python3: can't open file 'dnaPipeTE.py': [Errno 2] No such file or directory

I will keep trying to see if there is something on my end. Thanks for helping!

clemgoub commented 2 years ago

Hello Laura,

Can you send me or share here the scripts you used? Thank you!

Clém

clemgoub commented 1 year ago

Please re-open if you need further assistance with this issue! Best, Clément