Error: Issue with producing output graphs

DR-genomics commented 2 years ago

Hello @clemgoub,

I ran dnaPipeTE on my samples. At the end of each run, I have fasta sequences in Trinity.fasta, along with a summary of repeat families listed in ".tbl" output files. However, none of the runs produced output graphs. At the end of each run, I receive the errors below:

/bin/sh: /gpfs20/mypath/dnaPipeTE/bin/parallel: No such file or directory
/bin/sh: /gpfs20/mypath/dnaPipeTE/bin/parallel: No such file or directory
/bin/sh: /gpfs20/mypath/dnaPipeTE/bin/parallel: No such file or directory
Error in read.table(paste(folder, file1, sep = "/")) : no lines available in input
Execution halted

I don't have a parallel directory listed within the bin folder. Is that something that comes along with the software? Any help will be much appreciated!
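
The "No such file or directory" lines mean the shell could not find anything to execute at bin/parallel under the install directory. A minimal sketch for checking this before a run is shown here; the function name check_helper and the DNAPIPETE_DIR variable are hypothetical, and the default path is only the placeholder copied from the error message.

```shell
#!/bin/sh
# Hypothetical helper: report whether a path is an executable file.
# Prints "ok <path>" and returns 0, or prints "missing <path>" and returns 1.
check_helper() {
    if [ -x "$1" ] && [ ! -d "$1" ]; then
        echo "ok $1"
        return 0
    fi
    echo "missing $1"
    return 1
}

# Assumption: DNAPIPETE_DIR points at your dnaPipeTE install directory.
DNAPIPETE_DIR="${DNAPIPETE_DIR:-/gpfs20/mypath/dnaPipeTE}"
check_helper "$DNAPIPETE_DIR/bin/parallel" || true
```

If the check prints "missing", the helper was never installed at that location, which is the dependency problem the containerized version is meant to avoid.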

clemgoub commented 2 years ago

Dear @DR-genomics,

I am currently in the process of updating the dnaPipeTE repository, in particular with instructions to use the docker/singularity version of dnaPipeTE 1.3 which will solve the dependency problem you describe.

I suggest that you follow the instructions here: https://hub.docker.com/r/clemgoub/dnapipete to run the containerized version of the program. You can do so either with Docker (root privileges) or Singularity (non-root user). I will be happy to assist you if you have trouble installing it.

Regarding the graphs, this should also resolve the issue. In addition, I just created a toolkit with several scripts to process dnaPipeTE outputs and re-create the original graphs in a more customizable fashion. The repo was published just yesterday and the documentation is there; however, please let me know if you encounter any issues.

Best,

Clément

DR-genomics commented 2 years ago

Thanks for the prompt response Clément! Good to know about the software update! I would like to try the singularity version of the software, as docker is not available in the cluster which I am using. Can you provide a link for the same?

Also, along with the output graphs, the reads_per_component_and_annotation file is missing as well. Are the graphs and the reads_per_component_and_annotation file linked to each other?

Thanks

clemgoub commented 2 years ago

Hi!

The instructions to use dnaPipeTE with Singularity are as follows:

1- First create a Singularity image from the Docker container

mkdir ~/dnaPipeTE
cd ~/dnaPipeTE
singularity pull --name dnapipete.img docker://clemgoub/dnapipete:latest

This step requires approximately 20 minutes to complete. However, it is only required once for installation.

2- Assuming you have a project folder with your data in ~/data, create a file that will contain the commands for the run. For example:

cd ~/data
touch dnaPipeTE_cmd.sh

With the text editor of your choice, edit dnaPipeTE_cmd.sh with the commands for dnaPipeTE. For example:

cd /opt/dnaPipeTE
python3 dnaPipeTE.py -input /mnt/SRR14470610.mt.clean.R1.fastq -output /mnt/dnaPipeTE_0.15_1_t20 -genome_size 180000000 -genome_coverage 0.15 -sample_number 2 -RM_lib ../RepeatMasker/Libraries/RepeatMasker.lib -RM_t 0.2 -cpu 8

The first line is required to execute the scripts in the right directory of the container.

The second line is a standard dnaPipeTE command.

/mnt is the default directory in the Singularity container where a user directory can be mounted to access and write data outside the container. In this example, /mnt in the container will point to ~/data on your machine. This is specified in the next command, which actually starts the container, mounts the user data and runs the program.
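
As an alternative to editing the file by hand, the two-line command file can be written with a here-document. This is only a sketch: the function name write_cmd_file is hypothetical, the dnaPipeTE options are copied unchanged from the example above, and the demo writes to a scratch directory rather than ~/data.

```shell
#!/bin/sh
# Hypothetical helper: write the two-line command file into a data directory.
# $1 = data directory on the host (mounted as /mnt in the container)
# $2 = input FASTQ file name, relative to that directory
write_cmd_file() {
    cat > "$1/dnaPipeTE_cmd.sh" <<EOF
cd /opt/dnaPipeTE
python3 dnaPipeTE.py -input /mnt/$2 -output /mnt/dnaPipeTE_0.15_1_t20 -genome_size 180000000 -genome_coverage 0.15 -sample_number 2 -RM_lib ../RepeatMasker/Libraries/RepeatMasker.lib -RM_t 0.2 -cpu 8
EOF
}

# Demo in a scratch directory so nothing in ~/data is overwritten.
demo_dir=$(mktemp -d)
write_cmd_file "$demo_dir" SRR14470610.mt.clean.R1.fastq
cat "$demo_dir/dnaPipeTE_cmd.sh"
```

Note that every path inside the file uses /mnt, because the file is executed inside the container, not on the host.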

3- Start a run

singularity exec --bind ~/data:/mnt ~/dnaPipeTE/dnapipete.img bash /mnt/dnaPipeTE_cmd.sh

Note that --bind is the option that indicates where the data are located outside the container (in this example, ~/data). This directory is also where the output folder dnaPipeTE_0.15_1_t20 will be created.
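
To make the host-to-container mapping explicit, the run command can be wrapped in a small function. This is a sketch under stated assumptions: run_dnapipete and the DRY_RUN variable are hypothetical names, and in dry-run mode (the default here) the command is only printed, so nothing is executed until DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical wrapper around the step-3 command.
# $1 = host data directory, $2 = path to the Singularity image.
# With DRY_RUN=1 (the default in this sketch) the command is only printed.
run_dnapipete() {
    data_dir=$1
    image=$2
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "singularity exec --bind $data_dir:/mnt $image bash /mnt/dnaPipeTE_cmd.sh"
    else
        singularity exec --bind "$data_dir:/mnt" "$image" bash /mnt/dnaPipeTE_cmd.sh
    fi
}

# Print the command for the directories used in the example above.
run_dnapipete "$HOME/data" "$HOME/dnaPipeTE/dnapipete.img"
```

Changing the first argument is all that is needed to point the same image at a different project folder.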

DR-genomics commented 2 years ago

Thank you for the detailed instructions! I could run dnaPipeTE via Singularity without issues, except that it didn't produce landscapes.pdf on its own. However, I used one of your utility scripts (dnaPT_landscapes.sh) to generate it. I am currently running dnaPipeTE with a species-specific repeat database.

Thank you!
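
A quick way to spot this kind of gap after a run is to check the output folder for the files mentioned in this thread. The sketch below is an assumption-laden illustration: list_missing_outputs is a hypothetical name, the three file names are only those discussed here and not a complete list of dnaPipeTE outputs, and the demo uses a scratch folder with placeholder files.

```shell
#!/bin/sh
# Hypothetical helper: print the files named in this thread that are absent
# or empty in a dnaPipeTE output folder ($1). Not a complete list of outputs.
list_missing_outputs() {
    for f in Trinity.fasta reads_per_component_and_annotation landscapes.pdf; do
        [ -s "$1/$f" ] || echo "$f"
    done
}

# Demo on a scratch folder that mimics the run described above.
out_dir=$(mktemp -d)
echo ">contig1" > "$out_dir/Trinity.fasta"
echo "placeholder" > "$out_dir/reads_per_component_and_annotation"
list_missing_outputs "$out_dir"
```

Empty files are reported as well as absent ones, since an empty input is what the earlier read.table error ("no lines available in input") complained about.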

clemgoub commented 2 years ago

Excellent! Thank you for letting me know, and please don't hesitate to ask if you need further help!

Cheers,

Clément

clemgoub/dnaPipeTE, issue #63