Closed DonFreed closed 2 years ago
I was finally able to track this down.
The igenomes GRCh37 WholeGenomeFasta folder provides the following files:
$ aws s3 ls s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/
2017-04-13 04:01:50 4237 GenomeSize.xml
2017-04-13 04:01:51 3950 genome.dict
2017-04-13 04:01:52 3147288982 genome.fa
2017-04-13 04:02:00 714 genome.fa.fai
2017-04-13 04:02:00 49152 genome.fa.index
The BWAIndex folder, in turn, provides the following:
$ aws s3 ls s3://ngi-igenomes/igenomes/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/
PRE version0.5.x/
PRE version0.6.0/
2017-04-13 03:42:51 3147288982 genome.fa
2017-04-13 03:45:40 6563 genome.fa.amb
2017-04-13 03:45:41 870 genome.fa.ann
2017-04-13 03:45:41 3095694072 genome.fa.bwt
2017-04-13 03:45:58 773923497 genome.fa.pac
2017-04-13 03:46:11 1547847040 genome.fa.sa
The igenomes config has:
params {
    // illumina iGenomes reference file paths
    genomes {
        'GRCh37' {
            fasta = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/genome.fa"
            bwa = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/genome.fa"
Both the fasta and index inputs of the SENTIEON_BWAMEM process stage a file named "genome.fa" into the same work directory, which leads to the name collision.
The bwa key in the igenomes config can be updated to stage the directory containing the reference index, rather than just the fasta file. This also makes the igenomes setting more consistent with the SENTIEON_BWAINDEX process; both produce a directory containing bwa index files for the reference genome.
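A minimal sketch of what that config change could look like. It assumes the index files at the top level of the BWAIndex folder (as in the listing above) are the ones to stage. The exact directory is an assumption and could instead be one of the version subfolders.

```groovy
params {
    // illumina iGenomes reference file paths
    genomes {
        'GRCh37' {
            // Unchanged: the reference fasta is still staged as a single file.
            fasta = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/WholeGenomeFasta/genome.fa"
            // Assumed fix: point at the index directory instead of BWAIndex/genome.fa.
            // The directory is staged under its own name, so its genome.fa no longer
            // collides with the fasta input in the process work directory.
            bwa   = "${params.igenomes_base}/Homo_sapiens/Ensembl/GRCh37/Sequence/BWAIndex/"
        }
    }
}
```

With this layout the process receiving the bwa input has to resolve the index prefix inside the staged directory, which is the same thing it already needs to do for the output of SENTIEON_BWAINDEX.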
Description of the bug
Running the example pipeline on a local machine results in the following error:
Command used and terminal output
Relevant files
nextflow.log
System information
dev branch at c8a693df