ncbi / pgap

NCBI Prokaryotic Genome Annotation Pipeline

Repeated “completed permanentFail” messages #18

Closed gdp3 closed 5 years ago

gdp3 commented 5 years ago

My last few runs have all ended with multiple “completed permanentFail” messages and “docker exited with rc = 1”, and yet all the output files were created. The timestamps on the output files indicate that they were created fairly early in the process compared to the full runtime, so I am confused. I have all the files for the most recent such run, including a console log. If they would be useful, do I just attach the files here?

slottad commented 5 years ago

We are sorry for the trouble you have had. We'd be happy to look at any files you wish to send to us, especially cwltool.log. You might also want to run pgap.py with the --debug option enabled; this will place more useful files in the output directory. I am not sure of the file size limits on GitHub, but if the files fit, attaching them works well.

gdp3 commented 5 years ago

Attaching the console output as a start. The GitHub file size limit appears to be 10MB, and cwltool.log is almost 450MB. If other files would be useful I can try my Box account. Attached: console log.txt
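When a log is far too large to attach, one option is to pull out only the lines around each failure. The following is a minimal sketch, not part of PGAP: the function name `extract_failures`, the output file name, and the choice of 20 context lines are all hypothetical, and it assumes the failures of interest are on lines containing the text "permanentFail" as quoted in this thread.

```python
from collections import deque


def extract_failures(log_path, out_path, keyword="permanentFail", context=20):
    """Copy each line containing `keyword`, plus up to `context` lines
    preceding it, from log_path to out_path. Returns the number of matches.

    The log is streamed line by line, so a multi-hundred-megabyte file
    is never loaded into memory at once.
    """
    recent = deque(maxlen=context)  # rolling window of preceding lines
    matches = 0
    with open(log_path, errors="replace") as src, open(out_path, "w") as dst:
        for line in src:
            if keyword in line:
                dst.writelines(recent)   # context leading up to the failure
                dst.write(line)          # the failure line itself
                dst.write("-----\n")     # separator between excerpts
                recent.clear()
                matches += 1
            else:
                recent.append(line)
    return matches
```

For example, `extract_failures("cwltool.log", "cwltool_excerpt.log")` would write a much smaller excerpt file. Whether 20 lines of context is enough to show the underlying error is a guess; increase `context` if the excerpt looks truncated.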

slottad commented 5 years ago

Thank you. However, I don't see any helpful info in the console log. Could you please share more of the files?

gdp3 commented 5 years ago

Output files are here: https://uwmadison.box.com/s/m1fva94jtanq2hwj5xuy1dmt3nh16461

slottad commented 5 years ago

It looks like there is a bad locus tag in the Final_Bacterial_Package_asndisc_evaluate step, but I can't see what it is. Would you please run it with the debug flag enabled, and provide that output directory?

gdp3 commented 5 years ago

I don't have remote access to the machine in question -- I'm offsite until Monday, but will try then.

azat-badretdin commented 5 years ago

@slottad Douglas, FYI: the locus tag prefix in the input YAML file is

locus_tag_prefix: 'S'

gdp3 commented 5 years ago

That is in fact the locus tag prefix -- this is one of the older locus_tagged genomes; I want to add some plasmids that never got deposited with the chromosome back in the day.
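For anyone wanting to sanity-check a prefix before a run, here is a small sketch. It is an assumption, not something established in this thread: it encodes NCBI's published guidance for new locus_tag prefixes as I understand it (3 to 12 alphanumeric characters, not starting with a digit). The function name `prefix_issues` is hypothetical, this is not the check the pipeline's diagnostic step performs, and nothing here confirms that prefix length is what triggered the error in this issue.

```python
import re


def prefix_issues(prefix):
    """Return a list of ways `prefix` departs from the assumed guidance
    for locus_tag prefixes (3-12 alphanumeric characters, first
    character not a digit). An empty list means no issue was found.
    """
    issues = []
    if not 3 <= len(prefix) <= 12:
        issues.append("length outside 3-12 characters")
    if not re.fullmatch(r"[A-Za-z0-9]*", prefix):
        issues.append("contains non-alphanumeric characters")
    if prefix[:1].isdigit():
        issues.append("starts with a digit")
    return issues


# The prefix from this thread is flagged only for its length.
print(prefix_issues("S"))
```

A single-letter prefix like the one here may be perfectly legitimate for an older, already-registered genome, so a non-empty result is a hint about what a format checker might object to, not proof that the prefix is wrong.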

gdp3 commented 5 years ago

I reran the job with the debug flag; full output is here: https://uwmadison.box.com/s/zkqxpsdoww01zewxrjfbftybshe66384

azat-badretdin commented 5 years ago

Our standard diagnostic software (which is used in many areas of sequence analysis at NCBI, not only for PGAP) detected that PGAP software generated invalid locus tags. At this moment we are not sure what is wrong with these locus tags.

We are planning to add an exception for BAD_LOCUS_TAG_FORMAT in the future (internal ticket PGAPX-411). This will allow users to ignore bad locus tags.

For now, please just ignore it. Please keep in mind that you do have all the necessary results in the output despite this error.

PS. I understand that receiving messages with this subject is unsettling.