louisfnastasi closed 8 months ago
Things run fine for me with dummy data, a config file (test-2.conf) that looks like this:
[adapters]
i7:CTGTCTCTTATACACATCTCCGAGCCCACGAGAC*ATCTCGTATGCCGTCTTCTGCTTG
i5:CTGTCTCTTATACACATCTGACGCTGCCGACGA*GTGTAGATCTCGGTGGTCGCCGTATCATT
[tag sequences]
i5-538:TGAGTCAG
i7-97:GACGTGAC
[tag map]
RAPiD-Genomics_F300-F301_PST_174201_P001_WA01_i5-538_i7-97_S6053_L003:i5-538,i7-97
[names]
RAPiD-Genomics_F300-F301_PST_174201_P001_WA01_i5-538_i7-97_S6053_L003:Amphibolips_quercusjuglans_CYNOG0048
and a command for illumiprocessor that looks like this:
illumiprocessor --input raw-data-2 --output clean-data-2 --config test-2.conf --r1-pattern "{}_R1_\d+.fastq.gz" --r2-pattern "{}_R2_\d+.fastq.gz"
Oops - hang on a second...
Ok - edited first reply to add brackets. That still seems to work A-ok.
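For anyone debugging a similar setup, here is a minimal sketch of how a `{}`-style pattern can be checked against file names outside of illumiprocessor. It assumes the placeholder is replaced by the sample name from [tag map] and that the result is treated as a regular expression; that is an assumption about the matching step, not a copy of illumiprocessor's code, and `find_reads` is a hypothetical helper name.

```python
import re


def find_reads(sample_name, pattern, filenames):
    """Return the file names that match `pattern` once `{}` holds the sample name.

    Hypothetical helper: it only approximates how a read pattern might be
    applied and is not taken from illumiprocessor itself.
    """
    # Escape the sample name so characters such as "-" are matched literally.
    regex = re.compile(pattern.replace("{}", re.escape(sample_name)))
    return [f for f in filenames if regex.fullmatch(f)]


name = "RAPiD-Genomics_F300-F301_PST_174201_P001_WA01_i5-538_i7-97_S6053_L003"
files = [name + "_R1_001.fastq.gz", name + "_R2_001.fastq.gz"]

r1 = find_reads(name, r"{}_R1_\d+.fastq.gz", files)
r2 = find_reads(name, r"{}_R2_\d+.fastq.gz", files)
print(r1)
print(r2)
```

If either list comes back empty, the sample name or the pattern does not line up with the file names on disk.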
Hi all,
I've seen the other threads on this issue (e.g., https://github.com/faircloth-lab/phyluce/issues/96 and https://github.com/faircloth-lab/phyluce/issues/208) but haven't found a suitable solution - sorry for yet another question like these!
I've been running the following:
illumiprocessor \
    --input raw-fastq/ \
    --output clean-fastq \
    --config trim_testconfig.conf \
    --cores 12 \
    --r1-pattern "{}R1\d+.fastq.gz" \
    --r2-pattern "{}R2_\d+.fastq.gz"
Our file names are formatted like so:
RAPiD-Genomics_F300-F301_PST_174201_P001_WA01_i5-538_i7-97_S6053_L003_R1_001.fastq.gz
RAPiD-Genomics_F300-F301_PST_174201_P001_WA01_i5-538_i7-97_S6053_L003_R2_001.fastq.gz
and just a brief example of the configuration file for the sample listed above:
[tag map]
RAPiD-Genomics_F300-F301_PST_174201_P001_WA01_i5-538_i7-97_S6053L003:i5-plate-1,i7-WD11
[names]
RAPiD-Genomics_F300-F301_PST_174201_P001_WA01_i5-538_i7-97_S6053L003:Amphibolips_quercusjuglans_CYNOG0048
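A quick way to rule out a name mismatch between the config and the files is to test whether each [tag map] name is a prefix of at least one file name. The sketch below uses Python's standard `configparser` with the entry exactly as it appears in the excerpt above; `unmatched_names` is a hypothetical helper, and the prefix test is an assumption about what counts as a match, not illumiprocessor's own logic.

```python
import configparser

# The [tag map] entry as written in the post, copied character for character.
CONF_TEXT = """
[tag map]
RAPiD-Genomics_F300-F301_PST_174201_P001_WA01_i5-538_i7-97_S6053L003:i5-plate-1,i7-WD11
"""

FILES = [
    "RAPiD-Genomics_F300-F301_PST_174201_P001_WA01_i5-538_i7-97_S6053_L003_R1_001.fastq.gz",
    "RAPiD-Genomics_F300-F301_PST_174201_P001_WA01_i5-538_i7-97_S6053_L003_R2_001.fastq.gz",
]


def unmatched_names(conf_text, filenames):
    """Return [tag map] names that are not a prefix of any file name."""
    parser = configparser.ConfigParser(delimiters=(":",), interpolation=None)
    parser.optionxform = str  # keep capitalization instead of lowercasing keys
    parser.read_string(conf_text)
    return [
        name
        for name in parser["tag map"]
        if not any(f.startswith(name) for f in filenames)
    ]


unmatched = unmatched_names(CONF_TEXT, FILES)
print(unmatched)
```

Any name printed here differs from the file names somewhere before the read suffix, for example in an underscore or in capitalization, so it is worth comparing the two strings character by character.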
The error output I'm receiving is identical to those previously posted:
2023-07-27 15:24:50,049 - illumiprocessor - INFO - ==================== Starting illumiprocessor ===================
2023-07-27 15:24:50,049 - illumiprocessor - INFO - Version: 2.10
2023-07-27 15:24:50,049 - illumiprocessor - INFO - Argument --config: trim_test_config.conf
2023-07-27 15:24:50,049 - illumiprocessor - INFO - Argument --cores: 12
2023-07-27 15:24:50,049 - illumiprocessor - INFO - Argument --input: /storage/group/hmh19/default/trim_test/raw-fastq/raw-fastq
2023-07-27 15:24:50,049 - illumiprocessor - INFO - Argument --log_path: None
2023-07-27 15:24:50,049 - illumiprocessor - INFO - Argument --min_len: 40
2023-07-27 15:24:50,049 - illumiprocessor - INFO - Argument --no_merge: False
2023-07-27 15:24:50,049 - illumiprocessor - INFO - Argument --output: /storage/group/hmh19/default/trim_test/raw-fastq/clean-fastq
2023-07-27 15:24:50,049 - illumiprocessor - INFO - Argument --phred: phred33
2023-07-27 15:24:50,049 - illumiprocessor - INFO - Argument --r1pattern: {}R1\d+.fastq.gz
2023-07-27 15:24:50,049 - illumiprocessor - INFO - Argument --r2pattern: {}R2\d+.fastq.gz
2023-07-27 15:24:50,049 - illumiprocessor - INFO - Argument --se: False
2023-07-27 15:24:50,049 - illumiprocessor - INFO - Argument --trimmomatic: /storage/home/lfn5093/.conda/envs/illumi_env/bin/trimmomatic
2023-07-27 15:24:50,049 - illumiprocessor - INFO - Argument --verbosity: INFO
Traceback (most recent call last):
  File "/storage/home/lfn5093/.conda/envs/illumi_env/bin/illumiprocessor", line 17, in <module>
    sys.exit(main())
  File "/storage/home/lfn5093/.conda/envs/illumi_env/lib/python3.6/site-packages/illumiprocessor/cli/main.py", line 114, in main
    main(args)
  File "/storage/home/lfn5093/.conda/envs/illumi_env/lib/python3.6/site-packages/illumiprocessor/main.py", line 34, in main
    reads.append(core.SequenceData(args, conf, start_name, end_name))
  File "/storage/home/lfn5093/.conda/envs/illumi_env/lib/python3.6/site-packages/illumiprocessor/core.py", line 85, in __init__
    self._get_read_data()
  File "/storage/home/lfn5093/.conda/envs/illumi_env/lib/python3.6/site-packages/illumiprocessor/core.py", line 106, in _get_read_data
    "errors in your conf file.".format(self.start_name)
OSError: There is a problem with the read names for RAPiD-Genomics_F300-F301_PST_174201_P001_WA01_i5-538_i7-97_S6053L003. Ensure you do not have spelling/capitalization errors in your conf file.
/var/spool/slurm/d/job4761542/slurm_script: line 42: phyluce_assembly_get_fastq_lengths: command not found
I've already tried numerous expressions for the r1 and r2 patterns, including the following, but none seem to work:
{}R1\d+.fastq.gz
{}R1\d+.fastq.gz
{}R1_\d+.fastq(?:.gz)
{}R1_001.fastq.gz
{}R1_001.fastq.gz
{}R1_001.fastq(?:.gz)
{}R1001.fastq.gz
{}R1\w+.fastq.gz
{}R1\w+.fastq.gz*
{}R1\w+.fastq(?:.gz)*
I'd greatly appreciate any help anyone can offer!