Open lsbarrientos50 opened 9 years ago
I'm having this exact same problem now. Was there ever a clear solution to this?
Same problem... if anyone has a solution please post it here. Thanks.
Can someone upload their config file, please? Also, it may be easiest to diagnose the config file problem by shortening it to contain only 1-2 samples in order to see if the problem is in the naming approach being used.
I have tried several different versions of the config file and none worked. This is the latest one:
[adapters]
i7:GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCTCGTATGCCGTCTTCTGCTTG
i5:AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTGTGTAGATCTCGGTGGTCGCCGTATCATT
[tag sequences]
i7-112_06:CCGGAATT
i5-19_F:TGGCTCTT
[tag map]
aegyptiaca_S34:i7-112_06,i5-19_F
[names]
aegyptiaca_S34:aegyptiaca
The following config file appears to work fine for me:
[adapters]
i7:GATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG
i5:AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT*GTGTAGATCTCGGTGGTCGCCGTATCATT
[tag sequences]
i7-112_06:CCGGAATT
i5-19_F:TGGCTCTT
[tag map]
aegyptiaca_S34:i7-112_06,i5-19_F
[names]
aegyptiaca_S34:aegyptiaca
when the input is two files in a directory named test, as below:
aegyptiaca_S34_T3_R1_001.fastq.gz
aegyptiaca_S34_T3_R2_001.fastq.gz
These are not your data, but two files made to look like your data. I formed the names so that they match what illumiprocessor expects. The tree of the input and output folders looks like this:
.
├── test
│   ├── aegyptiaca_S34_T3_R1_001.fastq.gz
│   └── aegyptiaca_S34_T3_R2_001.fastq.gz
├── clean
│   └── aegyptiaca
│       ├── adapters.fasta
│       ├── raw-reads
│       │   ├── aegyptiaca-READ1.fastq.gz -> /home/bcf/tmp/test/test/aegyptiaca_S34_T3_R1_001.fastq.gz
│       │   └── aegyptiaca-READ2.fastq.gz -> /home/bcf/tmp/test/test/aegyptiaca_S34_T3_R2_001.fastq.gz
│       ├── split-adapter-quality-trimmed
│       │   ├── aegyptiaca-READ1.fastq.gz
│       │   ├── aegyptiaca-READ2.fastq.gz
│       │   └── aegyptiaca-READ-singleton.fastq.gz
│       └── stats
│           └── aegyptiaca-adapter-contam.txt
├── illumiprocessor.log
└── test.conf
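A quick way to rule out typos between the sections of a config like the one above is to parse it and cross-check the entries. The following is only a sketch using Python 3's standard `configparser` module; the function name `check_cross_references` is hypothetical, and nothing here is taken from illumiprocessor's own code.

```python
import configparser

# The working config from this thread, reproduced as a string.
CONF_TEXT = """\
[adapters]
i7:GATCGGAAGAGCACACGTCTGAACTCCAGTCAC*ATCTCGTATGCCGTCTTCTGCTTG
i5:AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT*GTGTAGATCTCGGTGGTCGCCGTATCATT
[tag sequences]
i7-112_06:CCGGAATT
i5-19_F:TGGCTCTT
[tag map]
aegyptiaca_S34:i7-112_06,i5-19_F
[names]
aegyptiaca_S34:aegyptiaca
"""


def check_cross_references(conf_text):
    """Return a list of human-readable problems found in the config text.

    Hypothetical helper: checks that every tag used in [tag map] is defined
    in [tag sequences] and that every sample also appears in [names].
    """
    parser = configparser.ConfigParser()
    # configparser lowercases keys by default; keep them case-sensitive so
    # that capitalization mismatches are detected.
    parser.optionxform = str
    parser.read_string(conf_text)

    problems = []
    tags = set(parser["tag sequences"])
    names = set(parser["names"])
    for sample, tag_list in parser["tag map"].items():
        for tag in (t.strip() for t in tag_list.split(",")):
            if tag not in tags:
                problems.append(f"{sample}: tag {tag!r} not in [tag sequences]")
        if sample not in names:
            problems.append(f"{sample}: missing from [names]")
    return problems


print(check_cross_references(CONF_TEXT))  # prints []
```

An empty list means the sections agree with each other; it says nothing about whether the read files on disk match the sample names.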
Thanks for the answer. For now, I bypassed the problem by running trimmomatic directly. Haven't yet solved this issue.
The problem seems to be solved by including the name of the config file in the "--config" line of the command. Thanks to my colleague Jackson Eyres for solving this.
However, now that it is solved, I have the other issue with the read names, frequently encountered by other users here:
"errors in your conf file.".format(self.start_name))
IOError: There is a problem with the read names for morelia-viridis1_GCCTTCA. Ensure you do not have spelling/capitalization errors in your conf file.
I have tried all the solutions mentioned in the threads here, and renamed my files in numerous different ways, but none worked for me.
The illumiprocessor --help command indicates that the --config file needs to be passed to the program. Similarly, the example code that I sent you also includes the config file in the program invocation, following --config, so it should not be a surprise that the file name is needed.
The second issue is common: it occurs when users enter their file names incorrectly or when their fastq files have a naming format that differs from what the program expects. Basically, illumiprocessor is having a hard time finding the correct read pairs that go with the name that you have provided for morelia-viridis1_GCCTTCA. This is either because the read files have slightly different names from what you entered or because the format of the read file names is different from what illumiprocessor expects (the test data you sent me worked fine, so it may not be the format). You'll need to figure out which of those is causing the problem. If you need to change the general naming format for which illumiprocessor is searching, that requires changing the regular expressions that are used to find read files. By default, those regular expressions are:
r1_pattern = "{}_(?:.*)_(R1|READ1|Read1|read1)_\d+.fastq(?:.gz)*"
r2_pattern = "{}_(?:.*)_(R2|READ2|Read2|read2)_\d+.fastq(?:.gz)*"
In this case, the name morelia-viridis1_GCCTTCA is substituted where the curly braces ({}) are in the example above, then the regular expression is constructed. You can set that to whatever is needed to find your R1 and R2 files. Because R1 and R2 come from Illumina sequencers (or sequencing providers) in all sorts of naming combinations, you're able to adjust what the program looks for (the --r1-pattern and --r2-pattern options).
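The substitution described above can be tried outside the program. This is a minimal sketch in Python that uses the default patterns quoted in this thread; the helper `find_reads` is hypothetical, and applying the pattern with `re.match` is an assumption about how the program uses it, not something confirmed here. The patterns are written as raw strings so that `\d` reaches the regex engine unchanged.

```python
import re

# Default patterns as quoted in this thread.
R1_PATTERN = r"{}_(?:.*)_(R1|READ1|Read1|read1)_\d+.fastq(?:.gz)*"
R2_PATTERN = r"{}_(?:.*)_(R2|READ2|Read2|read2)_\d+.fastq(?:.gz)*"

# The example input files from this thread.
FILES = [
    "aegyptiaca_S34_T3_R1_001.fastq.gz",
    "aegyptiaca_S34_T3_R2_001.fastq.gz",
]


def find_reads(name, filenames, pattern):
    """Substitute `name` for the {} placeholder and return matching files.

    Hypothetical helper; matching from the start of the file name with
    re.match is an assumption.
    """
    regex = re.compile(pattern.format(name))
    return [f for f in filenames if regex.match(f)]


print(find_reads("aegyptiaca_S34", FILES, R1_PATTERN))
# prints ['aegyptiaca_S34_T3_R1_001.fastq.gz']
print(find_reads("aegyptiaca_S34", FILES, R2_PATTERN))
# prints ['aegyptiaca_S34_T3_R2_001.fastq.gz']

# A capitalization error in the name finds nothing.
print(find_reads("Aegyptiaca_S34", FILES, R1_PATTERN))
# prints []
```

Running the same check with your own name from [tag map] and a listing of your input directory shows quickly whether the name or the file naming format is the mismatch.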
Thanks.
Now it's working, not sure why. I might have mixed underlines with hyphens in the species names.
I also have a problem like this, and I don't know what it means. Is there some error in my config file? If anyone knows how to solve this problem, please post it here. Thanks.
My config file is as below:
[adapters]
i7:GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCTCGTATGCCGTCTTCTGCTTG
i5:AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTGTGTAGATCTCGGTGGTCGCCGTATCATT
[tag sequences]
i7-7893:GAATCTGT
i5-7893:ATAGCGAC
[tag map]
sl337893:i7-7893,i5-7893
[names]
sl337893:sl337893
The names of the input files are as below:
sl337893_T1_R1_7893.fastq.gz
sl337893_T1_R2_7893.fastq.gz
See above. You'll likely need to adjust either your config file or the regular expression to deal with your file names.
I've dealt with my problems. Thank you very much.
Hi Brant, I am having issues with the config file when I try to run illumiprocessor.
This is a single-indexed library: i7 is my index primer sequence and i5 is the NEBNEXT universal primer sequence. The two files BRF27_POM134_S27_R1_001.fastq.gz and BRF27_POM134_S27_R2_001.fastq.gz are inside a folder named POM134.
The script I am using is:
module load phyluce/1.6.8
module load illumiprocessor/2.0.9
illumiprocessor --input POM134 --output clean --config illumiprocessor.conf --cores 4 --trimmomatic ${TRIMMOMATIC_HOME}/bin/trimmomatic
The config file is as below:
[adapters]
i7:CAAGCAGAAGACGGCATACGAGAT*GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
i5:AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
[tag sequences]
i7-P1-A1:TTACCGAC
[tag map]
BRF27_POM134_S27:i7-P1-A1
[names]
BRF27_POM134_S27:C.sp
I am getting this error many times:
"errors in your conf file.".format(self.start_name))
IOError: There is a problem with the read names for BRF27_POM134_S27. Ensure you do not have spelling/capitalization errors in your conf file.
Can you help me fix it?
It looks like your reads are named in a way that is not expected by phyluce. This means that you need to change the regular expression used to match the read data given the read names. The following should do it, although I have not tested:
--r1-pattern "{}_(R1)_\d+.fastq(?:.gz)*"
--r2-pattern "{}_(R2)_\d+.fastq(?:.gz)*"
Also, when you rename your reads in the [names] section, you need to use something that does not contain most symbols (underscore "_" is ok). For example:
[names]
BRF27_POM134_S27:C_sp
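To see why the default pattern misses these file names while the shorter one should find them, both can be tried against the names from the question. This is a sketch only: the `matches` helper is hypothetical, and the use of `re.match` is an assumption about how the program applies the pattern.

```python
import re

# Default R1 pattern quoted earlier in this thread, and the suggested
# replacements for this naming scheme.
DEFAULT_R1 = r"{}_(?:.*)_(R1|READ1|Read1|read1)_\d+.fastq(?:.gz)*"
SUGGESTED_R1 = r"{}_(R1)_\d+.fastq(?:.gz)*"
SUGGESTED_R2 = r"{}_(R2)_\d+.fastq(?:.gz)*"

NAME = "BRF27_POM134_S27"
R1_FILE = "BRF27_POM134_S27_R1_001.fastq.gz"
R2_FILE = "BRF27_POM134_S27_R2_001.fastq.gz"


def matches(pattern, name, filename):
    """Return True if the pattern, with `name` substituted, matches the file."""
    return re.match(pattern.format(name), filename) is not None


# The default expects an extra underscore-delimited field between the name
# and R1 (for example "_T3_" in "aegyptiaca_S34_T3_R1_001.fastq.gz").
# These files go straight from the name to "_R1_", so it does not match.
print(matches(DEFAULT_R1, NAME, R1_FILE))    # prints False

# The suggested patterns drop that extra field.
print(matches(SUGGESTED_R1, NAME, R1_FILE))  # prints True
print(matches(SUGGESTED_R2, NAME, R2_FILE))  # prints True
```

When passing such a pattern on the command line, quote it so the shell does not interpret the braces, parentheses, or backslash.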
Not sorted
I have a problem with ConfigParser.py: it couldn't find the section "names" in my config file. I double-checked that all the names and the path are correct. I looked for a problem in ConfigParser.py itself but couldn't find one. I also checked the installations of illumiprocessor, trimmomatic, and Java, and they look fine. I tried reinstalling everything, and the problem is still there.
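One way to narrow down a missing-section error is to ask the parser directly which section headers it sees in the file. The sketch below uses Python 3's `configparser` (the Python 2 module was named `ConfigParser`); the helper `missing_sections` is hypothetical, and the list of expected sections is simply the four that appear in the configs in this thread. The sample text is deliberately broken, with "[Names]" capitalized, to show that section names are case-sensitive.

```python
import configparser

# The four sections used by every config posted in this thread.
EXPECTED_SECTIONS = ("adapters", "tag sequences", "tag map", "names")

# Deliberately broken example: the last header is "[Names]", not "[names]".
BROKEN_CONF = """\
[adapters]
i7:CAAGCAGAAGACGGCATACGAGAT*GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT
i5:AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT
[tag sequences]
i7-P1-A1:TTACCGAC
[tag map]
BRF27_POM134_S27:i7-P1-A1
[Names]
BRF27_POM134_S27:C_sp
"""


def missing_sections(conf_text):
    """Return the expected section names that the parser did not find."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(conf_text)
    found = parser.sections()
    return [s for s in EXPECTED_SECTIONS if s not in found]


print(missing_sections(BROKEN_CONF))  # prints ['names']
```

To check a real file, read it with `open(path).read()` and pass the text to the function; if "names" is reported missing even though the header looks right, comparing `parser.sections()` with the expected names character by character may reveal stray whitespace or a capitalization difference.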