faircloth-lab / phyluce

software for UCE (and general) phylogenomics
http://phyluce.readthedocs.org/

Phyluce pipeline - Illumiprocessor question #309

Closed louisfnastasi closed 8 months ago

louisfnastasi commented 10 months ago

I'm having some issues running Illumiprocessor and I can't find an answer in the documentation. Any idea what "KeyError"s in the output refer to (see below)? More generally, I'm trying to trim sequences from which the barcodes are already removed; my configuration file is below. Note that [tag sequences] and [tag map] are empty; I'm thinking this is likely responsible for the error. If so, how can I go about getting this to work? Thanks in advance for any assistance!

Configuration file:

[adapters]
i7:GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCTCGTATGCCGTCTTCTGCTTG
i5:AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTGTGTAGATCTCGGTGGTCGCCGTATCATT

[tag sequences]

[tag map]

[names]
SRR12384323:Zaeucoila_robusta_1339614_Blaimer_et_al
SRR12384324:Xyalophora_sp1_1339617_Blaimer_et_al
SRR12384325:Xyalaspis_sp1_1339604_Blaimer_et_al

Output from illumi:

2023-08-14 15:00:02,665 - illumiprocessor - INFO - ==================== Starting illumiprocessor ===================
2023-08-14 15:00:02,665 - illumiprocessor - INFO - Version: 2.10
2023-08-14 15:00:02,665 - illumiprocessor - INFO - Argument --config: Blaimer_taxa_config.conf
2023-08-14 15:00:02,665 - illumiprocessor - INFO - Argument --cores: 32
2023-08-14 15:00:02,665 - illumiprocessor - INFO - Argument --input: /storage/group/hmh19/default/UCEs_working/Blaimer-et-al-fastqs
2023-08-14 15:00:02,665 - illumiprocessor - INFO - Argument --log_path: None
2023-08-14 15:00:02,665 - illumiprocessor - INFO - Argument --min_len: 40
2023-08-14 15:00:02,665 - illumiprocessor - INFO - Argument --no_merge: False
2023-08-14 15:00:02,665 - illumiprocessor - INFO - Argument --output: /storage/group/hmh19/default/UCEs_working/Blaimer-et-al-clean-fastq
2023-08-14 15:00:02,665 - illumiprocessor - INFO - Argument --phred: phred33
2023-08-14 15:00:02,665 - illumiprocessor - INFO - Argument --r1_pattern: {}_1.fastq.gz
2023-08-14 15:00:02,665 - illumiprocessor - INFO - Argument --r2_pattern: {}_2.fastq.gz
2023-08-14 15:00:02,665 - illumiprocessor - INFO - Argument --se: False
2023-08-14 15:00:02,665 - illumiprocessor - INFO - Argument --trimmomatic: /storage/home/lfn5093/.conda/envs/illumi_env/bin/trimmomatic
2023-08-14 15:00:02,665 - illumiprocessor - INFO - Argument --verbosity: INFO
Traceback (most recent call last):
  File "/storage/home/lfn5093/.conda/envs/illumi_env/bin/illumiprocessor", line 17, in <module>
    sys.exit(main())
  File "/storage/home/lfn5093/.conda/envs/illumi_env/lib/python3.6/site-packages/illumiprocessor/cli/main.py", line 114, in main
    main(args)
  File "/storage/home/lfn5093/.conda/envs/illumi_env/lib/python3.6/site-packages/illumiprocessor/main.py", line 34, in main
    reads.append(core.SequenceData(args, conf, start_name, end_name))
  File "/storage/home/lfn5093/.conda/envs/illumi_env/lib/python3.6/site-packages/illumiprocessor/core.py", line 86, in __init__
    self._get_tag_data(conf)
  File "/storage/home/lfn5093/.conda/envs/illumi_env/lib/python3.6/site-packages/illumiprocessor/core.py", line 119, in _get_tag_data
    combo = tag_map[self.start_name]
KeyError: 'SRR12384323'
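The last frame of the traceback (`combo = tag_map[self.start_name]`) is a dictionary lookup keyed by sample name, so an empty [tag map] section would be expected to fail on the first sample. The following is a minimal sketch of that failure mode, not illumiprocessor's actual code: the function `lookup_tag_combo` is hypothetical, and parsing the file with Python's standard `configparser` is an assumption made for illustration.

```python
import configparser

# A cut-down version of the configuration above: [tag map] has no entries.
CONFIG_TEXT = """
[tag map]

[names]
SRR12384323:Zaeucoila_robusta_1339614_Blaimer_et_al
"""


def lookup_tag_combo(config_text, sample):
    """Hypothetical helper: look up a sample's tag combination in [tag map]."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep sample names case-sensitive
    parser.read_string(config_text)
    tag_map = dict(parser.items("tag map"))
    # With an empty [tag map], tag_map is {} and this lookup raises KeyError.
    return tag_map[sample]


try:
    lookup_tag_combo(CONFIG_TEXT, "SRR12384323")
except KeyError as err:
    print("KeyError:", err)
```

Under these assumptions, the lookup raises for the first name listed in [names], which matches the sample reported in the traceback.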

brantfaircloth commented 10 months ago

If the samples are already trimmed, I'm not sure that you really need to run this step - you just need to make your directories/files look like what they do after running the step (if you are following the tutorial).

louisfnastasi commented 10 months ago

As I understand it, these particular samples haven't been trimmed beyond removal of the barcode sequences, so they still need further trimming. Any advice?

brantfaircloth commented 10 months ago

You could potentially use fake index sequences in the illumiprocessor configuration file... but I'm not exactly sure what would happen. You could also choose to just trim with trimmomatic or another program (outside of phyluce) for quality (only), and then arrange the trimmed files in the fashion expected by phyluce.
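A rough sketch of the second suggestion, for one paired-end sample: build a quality-only Trimmomatic command whose outputs land in a per-sample directory. Several things here are assumptions rather than anything confirmed in this thread: the `split-adapter-quality-trimmed` directory name and the `-READ1`/`-READ2` file suffixes follow the layout shown in the phyluce tutorial, the quality thresholds are purely illustrative, and `plan_quality_trim` is a hypothetical helper. The input file pattern (`{}_1.fastq.gz`, `{}_2.fastq.gz`) and the minimum length of 40 come from the log above.

```python
from pathlib import Path


def plan_quality_trim(sample, name, raw_dir, clean_dir, min_len=40):
    """Hypothetical helper: return (command, output directory) for one sample.

    Nothing is executed here; the command is only assembled as a list.
    """
    raw_dir, clean_dir = Path(raw_dir), Path(clean_dir)
    # Assumed layout: <clean_dir>/<name>/split-adapter-quality-trimmed/
    out_dir = clean_dir / name / "split-adapter-quality-trimmed"
    cmd = [
        "trimmomatic", "PE", "-phred33",
        str(raw_dir / f"{sample}_1.fastq.gz"),
        str(raw_dir / f"{sample}_2.fastq.gz"),
        str(out_dir / f"{name}-READ1.fastq.gz"),         # paired forward reads
        str(out_dir / f"{name}-READ1-single.fastq.gz"),  # unpaired forward reads
        str(out_dir / f"{name}-READ2.fastq.gz"),         # paired reverse reads
        str(out_dir / f"{name}-READ2-single.fastq.gz"),  # unpaired reverse reads
        # Quality-only steps; the thresholds are illustrative assumptions.
        "LEADING:5", "TRAILING:15", "SLIDINGWINDOW:4:15", f"MINLEN:{min_len}",
    ]
    return cmd, out_dir


cmd, out_dir = plan_quality_trim(
    "SRR12384323",
    "Zaeucoila_robusta_1339614_Blaimer_et_al",
    "Blaimer-et-al-fastqs",
    "Blaimer-et-al-clean-fastq",
)
print(" ".join(cmd))
# To actually run it (requires trimmomatic on the PATH):
#   out_dir.mkdir(parents=True, exist_ok=True)
#   subprocess.run(cmd, check=True)
```

In the tutorial layout the unpaired reads appear to end up in a single `-READ-singleton.fastq.gz` file, so the two `-single` outputs would likely need to be concatenated and renamed afterwards; it is worth checking the file names against a directory produced by a normal illumiprocessor run before assembling.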