nf-core / rnavar

GATK4 RNA variant calling pipeline
https://nf-co.re/rnavar
MIT License

VEP cache is not working #126

Closed: nschcolnicov closed this issue 7 months ago

nschcolnicov commented 7 months ago

Description of the bug

I got the error "MSG: ERROR: Cache directory vep_cache/homo_sapiens not found" when running the module NFCORE_RNAVAR:RNAVAR:VCF_ANNOTATE_ALL:VCF_ANNOTATE_MERGE:ENSEMBLVEP_VEP.

I saw that, as with snpEff, the conditional statement that checks whether vep_cache equals "s3://annotation-cache/vep_cache/" is missing the trailing "/" in https://github.com/nf-core/rnavar/blob/dev/workflows/rnavar.nf (lines 154 and 162).
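To illustrate why a single trailing "/" matters here, below is a minimal Groovy sketch. It is not the pipeline's code: normalizeCachePath is a hypothetical helper, and the two strings are just the two spellings discussed in this issue. It shows that plain string equality fails, and that comparing normalized values would be one way to make the check tolerant of either spelling.

```groovy
// Hypothetical helper (not part of nf-core/rnavar): strip trailing slashes
// so that both spellings of the same location compare as equal.
def normalizeCachePath(String path) {
    return path.replaceAll('/+$', '')
}

def annotationCache = 's3://annotation-cache/vep_cache/'   // spelling with the trailing "/"
def literalInCheck  = 's3://annotation-cache/vep_cache'    // literal without the trailing "/"

// Plain string equality fails because of the single trailing character.
assert annotationCache != literalInCheck

// Comparing normalized values succeeds for either spelling.
assert normalizeCachePath(annotationCache) == normalizeCachePath(literalInCheck)
```

Whether the pipeline should normalize the value or simply fix the literal is a design choice; fixing the literal is the smaller change.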

However, adding the missing "/" is not enough. Running the command: nextflow run main.nf -profile docker,test --annotate_tools merge --outdir. still fails, because the pipeline builds the wrong path and raises the error "This path is not available within annotation-cache. Please check https://annotation-cache.github.io/ to create a request for it."

The test profile I used is a modified version that removes the parameters vep_cache = null and snpeff_cache = null.

Command used and terminal output

No response

Relevant files

No response

System information

No response

maxulysse commented 7 months ago

Which build are you trying?

nschcolnicov commented 7 months ago

dev. I already spotted the issue. First, the conditional statements are missing the trailing "/". Second, the vep_cache_dir variable on line 159 builds the wrong path:

if (params.vep_cache && params.annotate_tools && (params.annotate_tools.split(',').contains("vep") || params.annotate_tools.split(',').contains("merge"))) {
    def vep_annotation_cache_key = ''
    if (params.vep_cache == "s3://annotation-cache/vep_cache") {
        vep_annotation_cache_key = "${params.vep_cache_version}_${params.vep_genome}/"
    } else {
        vep_annotation_cache_key = params.use_annotation_cache_keys ? "${params.vep_cache_version}_${params.vep_genome}/" : ""
    }
    def vep_cache_dir = "${vep_annotation_cache_key}${params.vep_cache_version}_${params.vep_genome}/${params.vep_species}"
    def vep_cache_path_full = file("$params.vep_cache/$vep_cache_dir", type: 'dir')

For the test profile we define this:

    snpeff_genome     = 'WBcel235'

    vep_cache_version = 110
    vep_genome        = 'WBcel235'
    vep_species       = 'caenorhabditis_elegans'

So vep_annotation_cache_key contains "110_WBcel235/", and vep_cache_dir contains "110_WBcel235/110_WBcel235/caenorhabditis_elegans", with the version and genome segment duplicated.

The actual path, however, is s3://annotation-cache/vep_cache/110_WBcel235/caenorhabditis_elegans/
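The duplication can be reproduced outside the pipeline with a small Groovy sketch. It assumes the test profile values quoted above and a cache root of "s3://annotation-cache/vep_cache/". The params map and the expectedVepCacheDir helper are stand-ins written for illustration; this is not necessarily how the PR fixes it. Note that, evaluated literally, the key keeps its trailing "/", so the repeated segment is separated by a slash.

```groovy
// Stand-in for the pipeline's params, using the test profile values quoted above.
def params = [
    vep_cache        : 's3://annotation-cache/vep_cache/',
    vep_cache_version: 110,
    vep_genome       : 'WBcel235',
    vep_species      : 'caenorhabditis_elegans',
]

// Once the "/" comparison matches, the key already holds "<version>_<genome>/" ...
def key = "${params.vep_cache_version}_${params.vep_genome}/"

// ... and the quoted snippet then appends the same segment a second time.
def buggyDir = "${key}${params.vep_cache_version}_${params.vep_genome}/${params.vep_species}"
assert buggyDir.toString() == '110_WBcel235/110_WBcel235/caenorhabditis_elegans'

// Hypothetical helper: emit "<version>_<genome>/<species>" exactly once.
def expectedVepCacheDir(Map p) {
    return "${p.vep_cache_version}_${p.vep_genome}/${p.vep_species}".toString()
}

// Joining it to the cache root (trailing slashes stripped) gives the path
// reported as the actual location in annotation-cache.
def root = params.vep_cache.replaceAll('/+$', '')
def fullPath = "${root}/${expectedVepCacheDir(params)}/".toString()
assert fullPath == 's3://annotation-cache/vep_cache/110_WBcel235/caenorhabditis_elegans/'
```

The sketch only covers the annotation-cache case; the branch controlled by use_annotation_cache_keys would need the same check for custom cache locations.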

I'll create a PR for it

@maxulysse I didn't see this when fixing the issue for snpEff because I was already using the path to the VEP cache instead of using the automatic path builder

nschcolnicov commented 7 months ago

@maxulysse I created a PR for this, let me know what you think! https://github.com/nf-core/rnavar/pull/127

nschcolnicov commented 7 months ago

The PR fixed the issue, closing this issue!