ncbi / pgap

NCBI Prokaryotic Genome Annotation Pipeline

Error: Final process status is permanentFail #304

Closed sekhwal closed 6 months ago

sekhwal commented 6 months ago

I installed PGAP and am running it with the commands below. However, I am not getting any output result files, and PGAP exits with an error.

install

./pgap.py --update

running

python3 /scripts/pgap.py -r -o P16 input_P1620800_chr.yaml

error message

WARNING: memory per CPU core (GiB) is less than the recommended value of 2
PGAP failed, docker exited with rc = 1
Printing log starting from failed job:
.....
WARNING Final process status is permanentFail

azat-badretdin commented 6 months ago

Thank you for your report, Dr. Kumar

Could you please attach cwltool.log file?

sekhwal commented 6 months ago

cwltool.log

sekhwal commented 6 months ago

It seems to be installed correctly:

./pgap.py
PGAP version 2024-04-27.build7426 is up to date.

Please help me figure out the issue.

azat-badretdin commented 6 months ago

Please use some species taxid in your YAML file.

azat-badretdin commented 6 months ago

This is the same issue as #303

sekhwal commented 6 months ago

Please use some species taxid in your YAML file.

I am already using 'salmonella' in submol.yaml

topology: 'circular'
location: 'chromosome'
organism:
    genus_species: 'salmonella'
    strain: 'P1620800_chr'

azat-badretdin commented 6 months ago

1/ Salmonella is not a "species", it is a "genus".
2/ We lost support for the "genus" option in this release, and we are working on restoring it soon.
3/ Case might be important (biologists usually capitalize the genus in binomials, so I am not familiar with this use case).
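The three points above can be turned into a quick pre-flight check on the `genus_species` value before launching a run. This is a minimal sketch, not part of PGAP: `looks_like_binomial` is a hypothetical helper, and the rule it applies (capitalized genus followed by a lowercase species epithet) is an assumption based on the naming convention described above, not on PGAP's own validation.

```python
import re

# Assumed convention: capitalized genus, a space, then a lowercase species epithet.
# This is NOT PGAP's validation logic, only a rough sanity check.
_BINOMIAL = re.compile(r"^[A-Z][a-z]+ [a-z]+")


def looks_like_binomial(genus_species: str) -> bool:
    """Return True if the value looks like 'Genus species'."""
    return bool(_BINOMIAL.match(genus_species.strip()))
```

With this check, `'salmonella'` and `'Salmonella'` are both rejected (genus only), while `'Salmonella enterica'` passes.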

vappiah commented 2 months ago

Dear @azat-badretdin, please advise. I have a similar error message. I used this command:

python ../pgap.py -c 30 -D /software/singularity/bin/singularity -r -d --no-internet -o pgapout -g S1.fasta --debug -s 'Vibrio cholera'

Below is the content of the fastaval.xml file

message tool="fastaval" severity="INFO" seq_id="S1C1" fasta_seq_id="lcl|S1C1" length="2961124"

And here is the cwltool.log

azat-badretdin commented 2 months ago

@vappiah

Log file says:

min = 3600000
max = 4850000
genome_size = 2961124
verify-genome-size: fail
verify-only-ns: pass
verify-seqids: pass

If you are sure you are not missing anything, you could add the --ignore-all-errors parameter.
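The size check reported in the log can be reproduced by hand before deciding whether to ignore the error. The sketch below is an illustration only: the function names are hypothetical, the thread does not say how PGAP derives `min` and `max`, and the values used in the usage note are simply the ones printed in the log above.

```python
def total_sequence_length(fasta_text: str) -> int:
    """Sum the sequence lengths of all records, skipping '>' header lines."""
    return sum(
        len(line.strip())
        for line in fasta_text.splitlines()
        if not line.startswith(">")
    )


def size_in_range(genome_size: int, min_size: int, max_size: int) -> bool:
    """Return True if genome_size lies within [min_size, max_size]."""
    return min_size <= genome_size <= max_size
```

For the values in the log, `size_in_range(2961124, 3600000, 4850000)` returns `False`, which matches the reported `verify-genome-size: fail`. To check a file, pass its contents to `total_sequence_length`, for example `total_sequence_length(open("S1.fasta").read())`.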

vappiah commented 2 months ago

Thanks @azat-badretdin. After adding the ignore-errors flag, it works.

vappiah commented 2 months ago

Thanks @azat-badretdin. After adding --ignore-all-errors, it worked.

azat-badretdin commented 2 months ago

You are welcome, user @vappiah !

vappiah commented 2 months ago

Dear @azat-badretdin, 4 out of the 9 samples ran successfully. For the failed ones, I would be happy to get your advice. Here is one of the log files: cwltool.log

No fastaval file was generated.

Also, is there a way to get a summary of the errors encountered? This would make it easier for me to diagnose the issue before reporting here.
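In the absence of a built-in summary, a rough one can be pulled out of cwltool.log by filtering for the markers that appear in this thread. This is a minimal sketch under stated assumptions: `summarize_log` is a hypothetical helper, not a PGAP feature, and the pattern only covers `permanentFail`, `WARNING`, and `: fail` as seen above, plus `ERROR`, which is an assumed marker.

```python
import re

# Markers taken from the log excerpts in this thread; "ERROR" is an assumption.
_PATTERN = re.compile(r"permanentFail|\bERROR\b|\bWARNING\b|: fail\b")


def summarize_log(lines):
    """Return the lines that contain one of the failure markers, in order."""
    return [line.rstrip("\n") for line in lines if _PATTERN.search(line)]
```

Typical use would be `with open("cwltool.log") as fh: print("\n".join(summarize_log(fh)))`. A log that stops abruptly without any of these markers produces an empty summary, so an empty result does not mean the run succeeded.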

azat-badretdin commented 2 months ago

The cwltool.log file you posted ends very abruptly:

+ /root/venv/bin/checkm qa --tab_table -f /pgap/output/debug/tmpdir/lev0iial/checkm.3467701149593024HkmV6y/fasta_by_scaffold/checkm.checkm.qa.o.1.txt -o 1 'taxonomy_wf-prot/Vibrio cholerae.ms' taxonomy_wf-prot/
    Finished parsing hits for 1 of 1 (100.00%) bins.
/CERR

Is that what you have on your end?

vappiah commented 2 months ago

Dear @azat-badretdin, yes, that's the log file.

However, I tried running using fewer resources (ulimit, cpus) and it worked. I guess the initial resources were too high for my system. Thanks.

azat-badretdin commented 2 months ago

I tried running using fewer resources (ulimit, cpus)

Looks like this was the right decision. We also recommend that in our documentation.
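The warning at the top of this thread mentions a recommended minimum of 2 GiB of memory per CPU core, which gives a simple way to pick a CPU count for a given amount of memory. The sketch below is only arithmetic around that one number: the helper names are hypothetical, the 32 GiB figure in the usage note is made up for illustration, and the thread does not establish that this ratio caused the failures above.

```python
RECOMMENDED_GIB_PER_CORE = 2.0  # value quoted in the PGAP warning in this thread


def memory_per_core_gib(total_memory_gib: float, cpus: int) -> float:
    """Return the memory available per CPU core, in GiB."""
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return total_memory_gib / cpus


def max_cpus_for_memory(
    total_memory_gib: float,
    min_gib_per_core: float = RECOMMENDED_GIB_PER_CORE,
) -> int:
    """Return the largest CPU count that keeps memory per core at or above the minimum."""
    return max(1, int(total_memory_gib // min_gib_per_core))
```

For example, on a hypothetical machine with 32 GiB of memory, 30 CPUs gives about 1.07 GiB per core, below the recommended value, while `max_cpus_for_memory(32)` suggests 16.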