Open alexkrohn opened 3 years ago
You can’t really do it that way. Your best bet is to use dummy sequences for the indexes; quality trimming will still proceed. Alternatively, trim the reads for quality externally, get them into the expected format, and assemble using the phyluce tools.
Got it. I just pasted one of the random tags onto the end of the individuals. I assume illumiprocessor will search for the tag, not find it, and move on to trimming, as you say.
Unfortunately, I got this error, which I don't quite understand. I figure the file should exist in order for the pipeline to work on it :-D
Does this have to do with the indexes, or is it something else entirely?
Traceback (most recent call last):
  File "/home/tangled/tbc/compute/alex_compute/miniconda2/envs/phyluce-1.7.1/bin/illumiprocessor", line 17, in <module>
    sys.exit(main())
  File "/home/tangled/tbc/compute/alex_compute/miniconda2/envs/phyluce-1.7.1/lib/python3.6/site-packages/illumiprocessor/cli/main.py", line 114, in main
    main(args)
  File "/home/tangled/tbc/compute/alex_compute/miniconda2/envs/phyluce-1.7.1/lib/python3.6/site-packages/illumiprocessor/main.py", line 36, in main
    core.create_new_dirs(reads)
  File "/home/tangled/tbc/compute/alex_compute/miniconda2/envs/phyluce-1.7.1/lib/python3.6/site-packages/illumiprocessor/core.py", line 337, in create_new_dirs
    os.symlink(reads, new_file)
FileExistsError: [Errno 17] File exists: '/home/tangled/tbc/compute/UCE/PIME/30-520350235_060421/raw-fastq/P.m.mg_Hernando_FL_S20_R1_001.fastq.gz' -> '/home/tangled/tbc/compute/alex_compute/UCEs/pime/combined-data/clean-fastq/S2/raw-reads/S2-READ1.fastq.gz'
For some reason it thinks the file already exists. You might have a duplicate file name somewhere in there.
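For anyone hitting the same traceback: `os.symlink(src, dst)` raises `FileExistsError` whenever the link path `dst` is already taken, so the error refers to the destination link, not the raw FASTQ file. The sketch below reproduces that behaviour in a temporary directory. The file names and the `link_reads` helper are made up for illustration and are not part of illumiprocessor.

```python
import os
import tempfile


def link_reads(sources, link_path):
    """Try to symlink every source to the same link_path.

    Returns the sources whose symlink failed because link_path already existed.
    """
    collisions = []
    for src in sources:
        try:
            os.symlink(src, link_path)
        except FileExistsError:
            # Same error class as in the traceback above.
            collisions.append(src)
    return collisions


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical read files, created empty just for the demonstration.
    first = os.path.join(tmp, "sample_S2_R1.fastq.gz")
    second = os.path.join(tmp, "sample_S20_R1.fastq.gz")
    for path in (first, second):
        open(path, "w").close()

    # Both files get mapped to one destination, as happens when two input
    # files are assigned to the same sample.
    link = os.path.join(tmp, "S2-READ1.fastq.gz")
    collided = link_reads([first, second], link)
```

Here the first link succeeds and the second collides, which suggests looking for two input files that resolve to the same sample.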
Ahhh. I see now. It looks like it's conflating individual S20 with S2. I will correct the file names to be S20 and S02 to hopefully avoid that in the future.
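A quick way to catch this before running the pipeline is to check whether any sample name matches more read files than expected. This is only a sketch: it assumes names are matched against file names as plain substrings, which is what the error above suggests but is not confirmed here. The `ambiguous_names` helper and the S2 file names are hypothetical, as is the R2 name for S20.

```python
def ambiguous_names(names, filenames, expected=2):
    """Return {name: matched files} for names matching more than `expected` files.

    Assumes substring matching and one R1/R2 pair per sample (expected=2).
    """
    hits = {}
    for name in names:
        matched = sorted(f for f in filenames if name in f)
        if len(matched) > expected:
            hits[name] = matched
    return hits


files = [
    "hypothetical_S2_R1_001.fastq.gz",         # made-up name for sample S2
    "hypothetical_S2_R2_001.fastq.gz",         # made-up name for sample S2
    "P.m.mg_Hernando_FL_S20_R1_001.fastq.gz",  # from the traceback
    "P.m.mg_Hernando_FL_S20_R2_001.fastq.gz",  # assumed R2 counterpart
]

# "S2" is a substring of "S20", so it also picks up the S20 files.
before = ambiguous_names(["S2", "S20"], files)

# Zero-padding the shorter name removes the overlap.
renamed = [f.replace("_S2_", "_S02_") for f in files]
after = ambiguous_names(["S02", "S20"], renamed)
```

With these inputs, `before` flags only `S2` (four matching files), and `after` is empty once the name is padded to `S02`.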
Hi there. I have a dataset of demultiplexed FASTQ files that I generated that I would love to combine with data pulled from the SRA. The SRA data are demultiplexed and have their barcodes+adapters already trimmed off. However, like my data, the SRA raw reads still need to be trimmed for quality. I would love to use illumiprocessor to do the QC and trimming so that I can integrate all the reads together into the Phyluce pipeline.
How can I configure the illumiprocessor.conf file to not bother looking for adapters+barcodes in some of the files?
Let's say my files are named:
old-reads_IND001_R1_001.fastq.gz
old-reads_IND001_R2_001.fastq.gz
new-reads_IND002_R1_001.fastq.gz
new-reads_IND002_R2_001.fastq.gz
I've tried:
and
But neither seems to work. Any suggestions?