crimBubble / ECCsplorer

The ECCsplorer is a bioinformatics pipeline for the automated detection of extrachromosomal circular DNA (eccDNA) from paired-end read data of amplified circular DNA.
GNU General Public License v3.0

TRIM_PATH checkup error when run test data #8

Closed liu-zhiyang closed 1 year ago

liu-zhiyang commented 1 year ago

When I ran the pipeline on the test data, I got the following error.

2022-08-23 10:56:51,473 - [basic_checkups] INFO: Performing basic checkups.
2022-08-23 10:56:51,486 - [basic_checkups] ERROR:
Trimmomatic not found!

Please make sure Trimmomatic is installed.
If not in PATH modify location: config > TRIM_PATH.

Traceback (most recent call last):
  File "ECCsplorer.py", line 815, in <module>
    main()
  File "ECCsplorer.py", line 691, in main
    config.CONNECTOR = basic_checkups(args=args)
  File "ECCsplorer.py", line 417, in basic_checkups
    raise Exception(error_msg)
Exception:
Trimmomatic not found!

Please make sure Trimmomatic is installed.
If not in PATH modify location: config > TRIM_PATH.

2022-08-23 10:56:51,487 - [exit_err] ERROR: Sorry, something went wrong.

But I had already set the TRIM_PATH in lib/config.py.

# Trimmomatic
TRIM_PATH = '/home/user/anaconda3/envs/eccsplorer/share/trimmomatic-0.39-2/trimmomatic.jar'
TRIM_ADAPTER_PATH = '/home/user/anaconda3/envs/eccsplorer/share/trimmomatic-0.39-2/adapters'

When I changed TRIM_PATH to 'trimmomatic', it passed the checkup, but I got another error.

2022-08-23 14:11:53,463 - [main] INFO: Starting trimming. This might take a while.
2022-08-23 14:11:53,463 - [read_trimming] INFO: Trimming using: testdata/aDNA_R1.fastq and testdata/aDNA_R2.fastq as read files.
2022-08-23 14:11:53,471 - [read_trimming] INFO: Error: Unable to access jarfile trimmomatic
Finished!
2022-08-23 14:11:53,471 - [read_trimming] INFO: Trimming using: testdata/gDNA_R1.fastq and testdata/gDNA_R2.fastq as read files.
2022-08-23 14:11:53,477 - [read_trimming] INFO: Error: Unable to access jarfile trimmomatic
Finished!

Finally, I changed TRIM_PATH to '/home/user/anaconda3/envs/eccsplorer/share/trimmomatic-0.39-2/trimmomatic' and added a .jar suffix after {path_} in TRIM_CMD. After that, the pipeline seemed to run correctly.

So maybe there is something wrong in the checkup step?

crimBubble commented 1 year ago

Thanks for the feedback.

Usually the pipeline should work with TRIM_PATH = '/home/user/anaconda3/envs/eccsplorer/share/trimmomatic-0.39-2/trimmomatic.jar', since executing /home/user/anaconda3/envs/eccsplorer/share/trimmomatic-0.39-2/trimmomatic.jar directly (as run in the checkup) and java -jar /home/user/anaconda3/envs/eccsplorer/share/trimmomatic-0.39-2/trimmomatic.jar (as run in the preparation module) should both work and give the same stdout.

This does not seem to be the case for your setup, and I am not sure why. Could you please run both commands with the environment active and post the results here? Could you also post the contents of the folder /home/user/anaconda3/envs/eccsplorer/share/trimmomatic-0.39-2/ (ls -all)? Thanks.
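The two invocations requested above can also be compared with a short script. This is only a sketch: try_run and compare_invocations are hypothetical helpers, not part of ECCsplorer, and the path passed in has to be replaced with the local one.

```python
import subprocess


def try_run(argv, timeout=30):
    """Run argv and return (status, combined stdout/stderr).

    Hypothetical helper for this thread, not ECCsplorer code.
    """
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=timeout,
        )
    except FileNotFoundError:
        # The file (or the java binary) does not exist.
        return "not found", ""
    except PermissionError:
        # The file exists but may not be executed directly.
        return "permission denied", ""
    except OSError as err:
        # Any other launch failure reported by the operating system.
        return "os error: " + str(err), ""
    except subprocess.TimeoutExpired:
        return "timeout", ""
    return "exit code " + str(proc.returncode), proc.stdout


def compare_invocations(trim_path):
    """Try the direct call and the java -jar call for the same path."""
    return {
        "direct": try_run([trim_path]),
        "java -jar": try_run(["java", "-jar", trim_path]),
    }
```

Calling compare_invocations with the TRIM_PATH value from lib/config.py returns one status per invocation, which makes it easy to see whether only the direct call fails.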

liu-zhiyang commented 1 year ago

@crimBubble Thanks for your reply.

When I set TRIM_PATH = '/home/user/anaconda3/envs/eccsplorer/share/trimmomatic-0.39-2/trimmomatic.jar' and ran the test data, the output was:

2022-08-23 20:31:17,271 - [setup_logging] INFO: Starting ECCsplorer pipeline with...
Output directory:          testdata
Prefix data set A:         TR
File 1 data set A (f1a):   testdata/aDNA_R1.fastq
File 2 data set A (f2a):   testdata/aDNA_R2.fastq
Prefix data set B:         CO
File 1 data set B (f1b):   testdata/gDNA_R1.fastq
File 2 data set B (f2b):   testdata/gDNA_R2.fastq
Reference genome sequence: testdata/RefGenomeSeq.fa
Custom BLAST+ database:    ['testdata/RefSeq_DB.fa']
Taxon:                     vir
Read trimming option:      tru3
Mapping window size:       100
Genome size:               ---
User read count:           not set, using max. available reads
Run mode:                  all
Image format:              png
Max threads used:          24
Logging to file:           No
2022-08-23 20:31:17,271 - [basic_checkups] INFO: Performing basic checkups.
2022-08-23 20:31:17,962 - [basic_checkups] ERROR:
Trimmomatic not found!

Please make sure Trimmomatic is installed.
If not in PATH modify location: config > TRIM_PATH.

Traceback (most recent call last):
  File "ECCsplorer.py", line 815, in <module>
    main()
  File "ECCsplorer.py", line 691, in main
    config.CONNECTOR = basic_checkups(args=args)
  File "ECCsplorer.py", line 417, in basic_checkups
    raise Exception(error_msg)
Exception:
Trimmomatic not found!

Please make sure Trimmomatic is installed.
If not in PATH modify location: config > TRIM_PATH.

2022-08-23 20:31:17,963 - [exit_err] ERROR: Sorry, something went wrong.

When I ran java -jar ~/anaconda3/envs/eccsplorer/share/trimmomatic-0.39-2/trimmomatic.jar or ~/anaconda3/envs/eccsplorer/share/trimmomatic-0.39-2/trimmomatic, the stdout was the same in both cases:

Usage:
       PE [-version] [-threads <threads>] [-phred33|-phred64] [-trimlog <trimLogFile>] [-summary <statsSummaryFile>] [-quiet] [-validatePairs] [-basein <inputBase> | <inputFile1> <inputFile2>] [-baseout <outputBase> | <outputFile1P> <outputFile1U> <outputFile2P> <outputFile2U>] <trimmer1>...
   or:
       SE [-version] [-threads <threads>] [-phred33|-phred64] [-trimlog <trimLogFile>] [-summary <statsSummaryFile>] [-quiet] <inputFile> <outputFile> <trimmer1>...
   or:
       -version

And ls -all ~/anaconda3/envs/eccsplorer/share/trimmomatic-0.39-2/ returned:

total 200
drwxr-xr-x  3 user user  4096 Aug 23 10:09 .
drwxr-xr-x 33 user user  4096 Aug 23 10:09 ..
-rw-rw-r--  2 user user  35147 Mar 28  2021 LICENSE
drwxr-xr-x  2 user user  4096 Aug 23 09:29 adapters
-rw-rw-r--  1 user user  9423 Aug 23 09:29 build_env_setup.sh
-rwxrwxr-x  2 user user  509 Mar 28  2021 conda_build.sh
-rw-rw-r--  2 user user  629 Mar 28  2021 metadata_conda_debug.yaml
-rwxrwxr-x  2 user user     3263 Mar 28  2021 trimmomatic
-rw-rw-r--  2 user user  128502 Mar 28  2021 trimmomatic.jar

crimBubble commented 1 year ago

@liu-zhiyang So it seems the error results from trimmomatic.jar not being set as executable. That is why the checkup does not detect Trimmomatic, while during the run it can be executed with java -jar .../trimmomatic.jar.
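In the directory listing posted earlier, trimmomatic.jar has mode -rw-rw-r-- while the trimmomatic wrapper has -rwxrwxr-x. The same information can be read from Python; the following is a minimal sketch, with describe_permissions being a hypothetical helper name rather than anything in the pipeline.

```python
import os
import stat


def describe_permissions(path):
    """Return (mode string as shown by ls -l, whether the current user may execute it)."""
    mode = os.stat(path).st_mode
    return stat.filemode(mode), os.access(path, os.X_OK)
```

For a file like the trimmomatic.jar in the listing, the first element would be '-rw-rw-r--' and the second False, which matches a direct call being refused while java -jar still works.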

There are several ways to work around this issue. I will update the pipeline soon.
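One possible approach, sketched here under assumptions and not taken from the actual ECCsplorer update, is a checkup that does not depend on the executable bit of a .jar file: a path ending in .jar is launched through java -jar, and anything else is looked up as a normal command. The name resolve_trimmomatic is hypothetical.

```python
import os
import shutil


def resolve_trimmomatic(trim_path):
    """Return an argv prefix for Trimmomatic, or None if nothing usable is found.

    Hypothetical sketch, not the pipeline's implementation.
    """
    if trim_path.lower().endswith(".jar"):
        # A jar only needs to be readable; java does the executing.
        if os.path.isfile(trim_path) and os.access(trim_path, os.R_OK):
            return ["java", "-jar", trim_path]
        return None
    # Wrapper script or command name: must be executable or found in PATH.
    resolved = shutil.which(trim_path)
    return [resolved] if resolved else None
```

With this kind of check, both the .jar path and the trimmomatic wrapper from the conda environment would be accepted, and the returned list could be reused as the start of the trimming command.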

Thanks again for the feedback.