nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

Funannotate cannot read symlink to database #877

Open minhtrung1997 opened 1 year ago

minhtrung1997 commented 1 year ago

Are you using the latest release? Yes

Describe the bug I run funannotate on an HPC cluster through Nextflow. This is the process code:

    container "${ workflow.containerEngine == 'singularity' && !params.singularity_pull_docker_container ?
        'quay.io/biocontainers/funannotate:1.8.9--pyhdfd78af_3':
        'quay.io/biocontainers/funannotate:1.8.9--pyhdfd78af_3' }"

    input:
    tuple val(meta), path(assemly), path(database)

    output:
        path '*' 
        path "versions.yml"           , emit: version

    when:
    task.ext.when == null || task.ext.when

    script:
    def args = task.ext.args ?: ''
    def prefix = task.ext.suffix ? "${meta.id}__${meta.name}${task.ext.suffix}".replaceAll("[ .]","_") : "${meta.id}__${meta.name}".replaceAll("[ .]","_")

    """
    export FUNANNOTATE_DB=${database}
    export EVM_HOME='/usr/local/opt/evidencemodeler-1.1.1/'
    export AUGUSTUS_CONFIG_PATH='/usr/local/config/'

    funannotate predict ${args} -i ${assemly} -o Annotation_${prefix} --cpus ${task.cpus} \
        -s "${meta.organism} sp" --isolate ${meta.name} -d ${database}

    cat <<-END_VERSIONS > versions.yml
    "${task.process}":
        funannotate: \$(echo \$(funannotate 2>&1) | grep 'version:' )
    END_VERSIONS
    """
}

What command did you issue? Copy/paste the command used.

Logfiles

Command error:
  -------------------------------------------------------
  [Mar 09 12:42 PM]: OS: Debian GNU/Linux 10, 64 cores, ~ 132 GB RAM. Python: 3.8.12
  [Mar 09 12:42 PM]: Running funannotate v1.8.9
  [Mar 09 12:42 PM]: GeneMark not found and $GENEMARK_PATH environmental variable missing. Will skip GeneMark ab-initio prediction.
  [Mar 09 12:42 PM]: Skipping CodingQuarry as $QUARRY_PATH not found as ENV
  [Mar 09 12:42 PM]: Parsed training data, run ab-initio gene predictors as follows:
    Program      Training-Method
    augustus     busco          
    glimmerhmm   busco          
    snap         busco          
  [Mar 09 12:42 PM]: Loading genome assembly and parsing soft-masked repetitive sequences
  [Mar 09 12:42 PM]: Genome loaded: 22 scaffolds; 20,738,633 bp; 17.30% repeats masked
  [Mar 09 12:42 PM]: Mapping 556,382 proteins to genome using diamond and exonerate
  [Mar 09 01:13 PM]: Found 192,293 preliminary alignments --> aligning with exonerate
  [Mar 09 01:20 PM]: Exonerate finished: found 663 alignments
  [Mar 09 01:20 PM]: Running BUSCO to find conserved gene models for training ab-initio predictors
  [Mar 09 01:20 PM]: BUSCO training of Augusus failed, check busco logs, exiting

When I open the log file, I see exactly the same message as in #229:

INFO    ****************** Start a BUSCO 2.0 analysis, current time: 03/09/2023 03:59:13 ******************
ERROR   Impossible to read funannotate/dikarya

ERROR   BUSCO analysis failed !
INFO    Check the logs, read the user guide, if you still need technical support, then please contact mailto:support@orthodb.org

So I suspect the error comes from reading the symlink: when I use conda with the absolute path of the database, this does not happen. (Although after that, we also run into various other errors, #103 and #776.)
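One way to narrow this down is to check whether the relative, symlinked path still resolves once a tool changes its working directory. The sketch below is only an illustration under stated assumptions: the directory names (`real_db`, `work`, `funannotate_db`, `subdir`) are hypothetical stand-ins for a Nextflow work directory, and it does not call funannotate or BUSCO at all. It shows only that a relative path to a staged symlink stops resolving after `cd`, while the absolute path from `readlink -f` keeps working.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical layout standing in for a Nextflow work directory:
# real_db/ is the actual database, work/funannotate_db is the staged symlink.
tmp=$(mktemp -d)
mkdir -p "$tmp/real_db/dikarya" "$tmp/work/subdir"
ln -s "$tmp/real_db" "$tmp/work/funannotate_db"

cd "$tmp/work"
rel_db="funannotate_db"              # relative path, as a staged input would appear
abs_db="$(readlink -f "$rel_db")"    # absolute path with the symlink resolved

# Simulate a tool that changes directory before opening the lineage folder.
cd subdir
if [ -r "$rel_db/dikarya" ]; then rel_status="readable"; else rel_status="missing"; fi
if [ -r "$abs_db/dikarya" ]; then abs_status="readable"; else abs_status="missing"; fi

echo "relative: $rel_status"
echo "absolute: $abs_status"
```

If the same pattern applies here, exporting an absolute path in the process script (for example `export FUNANNOTATE_DB=\$(readlink -f ${database})`, with the `$` escaped for Nextflow) might be worth trying. This is an untested assumption, and inside a Singularity container the resolved target directory would also need to be bind-mounted.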

OS/Install Information