ncbi / pgap

NCBI Prokaryotic Genome Annotation Pipeline

[BUG] PGAP fails on test genome when PGAP_INPUT_DIR set to other than default #310

Closed mdziurzynski closed 3 weeks ago

mdziurzynski commented 1 month ago

Hi!

I want to run the PGAP pipeline on my bacterial genome; however, the pipeline crashes on the test genome.

I installed the pipeline just as described in the Quick Start in the wiki. The only thing I did differently was to change the default installation path (I added the PGAP_INPUT_DIR variable to my bashrc with the appropriate path). The guide mentions that, after installation, it's a good idea to run the pipeline on the Mycoplasmoides genitalium genome provided with the installation. Unfortunately, the PGAP run ends after 30 seconds with an error. I tried reinstalling and re-updating, but it did not help.

Update

I reinstalled the pipeline in its default location and now it works. However, I decided to leave the issue here, as there seems to be a problem with how directories are mounted when the default PGAP installation location is changed.

Below I attach the cwl log.

cwltool.log

Env

Log Files

The only files in tmp-outdir: ncbiapp.log, fastaval.xml.txt

azat-badretdin commented 1 month ago

Thank you for your report, user @mdziurzynski

Upon examining the cwltool.log file, I concluded that part of the reference data is somehow missing. I recommend cleaning up your .pgap directory as recommended, repeating the installation step in the Quick Start, and trying again.

azat-badretdin commented 3 weeks ago

Did you manage to reinstall PGAP and run it successfully?

mdziurzynski commented 3 weeks ago

Yes, I had to install PGAP without changing the default PGAP_INPUT_DIR. Then it worked.

But just for reference: I reinstalled multiple times (and redownloaded the data) with different PGAP_INPUT_DIR vars, and it did not work even once.

azat-badretdin commented 3 weeks ago

> I reinstalled multiple times (and redownloaded the data) with different PGAP_INPUT_DIR vars, and it did not work even once.

Is it possible that PGAP_INPUT_DIR was located on a filesystem that is somehow inaccessible to Docker?

Feel free to open a separate ticket for one of the PGAP_INPUT_DIR settings. The information we are looking for is the mount location of the working directory and of the PGAP_INPUT_DIR directory, and how those mounts relate to your Docker settings.
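For anyone gathering the mount information requested above, here is a minimal Python sketch that reports where the working directory and the PGAP_INPUT_DIR directory resolve and which mount point each one lives under. It is not part of PGAP: the helper names (`mount_point`, `describe`, `report`) are hypothetical, and the fallback to `~/.pgap` when PGAP_INPUT_DIR is unset is an assumption based on the `.pgap` directory mentioned earlier in this thread. It says nothing about Docker's own settings, which would still need to be checked separately.

```python
import os


def mount_point(path):
    """Walk up from path until a mount point is reached and return it."""
    current = os.path.realpath(path)
    while not os.path.ismount(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the top without finding a mount
            break
        current = parent
    return current


def describe(label, path):
    """Collect the resolved location and mount point for one directory."""
    resolved = os.path.realpath(path)
    return {
        "label": label,
        "path": path,
        "resolved": resolved,
        "exists": os.path.isdir(resolved),
        "mount_point": mount_point(resolved),
    }


def report(env=None, cwd=None):
    """Describe the working directory and the PGAP_INPUT_DIR directory."""
    env = os.environ if env is None else env
    cwd = os.getcwd() if cwd is None else cwd
    # Assumption: without PGAP_INPUT_DIR, the data lives in ~/.pgap.
    default_dir = os.path.join(os.path.expanduser("~"), ".pgap")
    input_dir = env.get("PGAP_INPUT_DIR", default_dir)
    return [
        describe("working directory", cwd),
        describe("PGAP_INPUT_DIR", input_dir),
    ]


if __name__ == "__main__":
    for entry in report():
        print(f"{entry['label']}: {entry['path']}")
        print(f"  resolved to: {entry['resolved']}")
        print(f"  exists:      {entry['exists']}")
        print(f"  mount point: {entry['mount_point']}")
```

Running it from the directory where PGAP is launched, in the same shell that has PGAP_INPUT_DIR set, prints both locations; if the two mount points differ, that is worth including in the new ticket.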