oschwengers / asap

A scalable bacterial genome assembly, annotation and analysis pipeline
https://doi.org/10.1371/journal.pcbi.1007134
GNU General Public License v3.0

User facing issue during QC step #16

Closed nickegg1018 closed 3 years ago

nickegg1018 commented 4 years ago

I'm an HPC sysadmin so I'm kind of playing the go-between here between you guys and my user on this one. The user is saying that every time he runs a pipeline it crashes during the QC step. Looking at his logs he's getting a java.lang.IllegalStateException. I can't seem to find anything on my end that would be causing this, so hopefully you can see something in his log file that might indicate what's going on. Thanks. CMH-dpheruth-asaplog.log

oschwengers commented 4 years ago

Hi @nickegg1018, could you please also provide the isolate's QC log file? The global log file only shows which steps failed for which isolates. To figure out whether this is related to the data or a potential bug, I need the log of the underlying analysis: <project_dir>/reads_qc/<isolate_dir>/stdout.log

Also, according to this log file, there's only a single isolate. In this case, all comparative analyses need to be skipped. You can skip them by setting the -c option of the asap-docker.sh script.

nickegg1018 commented 4 years ago

Sorry for the delay, I finally pried this out of the user's hands. CMH-dpheruth-stdoutlog.log

EDIT: I just posted this log without reading it. Now that I've opened it I'm a little stunned that he's still having this problem because I've definitely worked with him to fix it. Are there any gotchas with the Java/Groovy stuff the Docker image is using that I should be aware of? Because our cluster definitely has both of those things and he tells me he has them loaded into his environment.

oschwengers commented 4 years ago

Hi, hmm... now I'm a little bit puzzled as well. Could you please make sure that both Java and Groovy are installed correctly?

java -version
groovy --version

Depending on the Java version/package installed, Groovy sometimes fails to autodetect the right Java installation. In some cases, unsetting JAVA_HOME before executing ASA³P fixes this.
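A minimal sketch of that check, assuming a bash-like shell: it reports which java and groovy the shell would pick up, then clears JAVA_HOME for the current session only. The helper name report_tool is made up for illustration and is not part of ASA³P.

```shell
#!/usr/bin/env bash
# Sketch only: report_tool is a hypothetical helper, not an ASA³P command.

report_tool() {
    # Print the resolved location of a command, or a note if it is not on PATH.
    local tool="$1"
    if command -v "$tool" >/dev/null 2>&1; then
        echo "$tool: $(command -v "$tool")"
    else
        echo "$tool: not found on PATH"
    fi
}

report_tool java
report_tool groovy
echo "JAVA_HOME before: ${JAVA_HOME:-<unset>}"

# unset only affects this shell and the processes started from it.
unset JAVA_HOME
echo "JAVA_HOME after: ${JAVA_HOME:-<unset>}"
```

Run it in the same shell session (after any module load commands) that will start the pipeline, since the unset does not carry over to other sessions or batch jobs.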

Just to be sure: you've installed and are using ASA³P natively, without Docker?

nickegg1018 commented 3 years ago

I have verified both Java and Groovy are installed and happy:

[njeggleston@hiccup ~]$ module load java
[njeggleston@hiccup ~]$ module load groovy
[njeggleston@hiccup ~]$ java -version
java version "9.0.4"
Java(TM) SE Runtime Environment (build 9.0.4+11)
Java HotSpot(TM) 64-Bit Server VM (build 9.0.4+11, mixed mode)
[njeggleston@hiccup ~]$ groovy --version
Groovy Version: 3.0.5 JVM: 9.0.4 Vendor: Oracle Corporation OS: Linux

I will have the user try unsetting JAVA_HOME, but I know before I had him load the java environment module (thus setting JAVA_HOME and other nifty $PATH things) he was getting errors about JAVA_HOME so I don't have high hopes there.

I have not installed it natively, but most of our software is built that way anyway. I don't see a config script or Makefile in the git repo, is there a convenient way to do that?

nickegg1018 commented 3 years ago

He tried unsetting JAVA_HOME, but to no avail. He sent me some more logs and I see an error I don't remember seeing before "Caught: groovy.lang.MissingPropertyException: No such property: taxPath for class: asap-qc". Any guess what would cause that?

I've attached the logs for your perusal. CMH-dpheruth-asap-13Oct20.log CMH-dpheruth-stderr-13Oct20.log CMH-dpheruth-stdout-13Oct20.log

oschwengers commented 3 years ago

Ok, there is a known bug caused by a wrong variable name in the asap-qc.groovy script. The terminate method in line 161 is called with wrong parameters:

old: terminate( "could create tmp dir! gid=${genomeId}, tmp-dir=${tmpPath}", t, taxPath, genomeName )

new: terminate( "could not create tmp dir! gid=${genomeId}, tmp-dir=${tmpPath}", t, genomeQCReadsPath )

This is fixed in https://github.com/oschwengers/asap/commit/dcb405f91d9bb2db8e54d0b293761c311367a4b1

You could just download the new version of the asap-qc.groovy script from the current master and replace the old one. As I'm currently working on some other new features, I don't have time for a patch release right now. However, this line should only be triggered if the tmp directory cannot be created, so the next issue might be lurking for some unknown reason. Please let me know if this helps and if I can help further. Sorry that I cannot provide a new patch release right now; I hope to provide a new minor release soon instead.

nickegg1018 commented 3 years ago

I made the change in the code and, as you predicted, we got an error about writing to the tmp location (basically trying to write to /var/tmp instead of /tmp). Once we addressed that and made sure JAVA_HOME was unset, we got a nearly successful run. He's now telling me only a few analyses failed. I'm getting further out of my element now, so I'll pass along the failures and log files to you. Please let me know if there's anything else I can get that would be helpful.
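For anyone hitting the same tmp error, a minimal sketch of a writability probe, assuming a bash-like shell with mktemp available. The function name can_create_tmp and the list of candidate directories are illustrative only; how ASA³P itself chooses its tmp location is not stated in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: can_create_tmp is a hypothetical helper, not part of ASA³P.

can_create_tmp() {
    # Try to create (and then remove) a uniquely named directory under $1.
    local base="$1" probe
    if probe="$(mktemp -d "${base}/tmp-probe.XXXXXX" 2>/dev/null)"; then
        rmdir "$probe"
        return 0
    fi
    return 1
}

# Candidate locations are assumptions; adjust them to the cluster's layout.
for dir in /tmp /var/tmp "${TMPDIR:-}"; do
    [ -n "$dir" ] || continue
    if can_create_tmp "$dir"; then
        echo "$dir: writable"
    else
        echo "$dir: cannot create a directory here"
    fi
done
```

Running it as the affected user on a compute node, rather than on the login node, is more likely to reproduce the permissions the pipeline actually sees.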

Failed analyses:
Antibiotic resistance (log file Escherichia_coli_SCB5.std.log)
Annotations (log files Escherichia_coli_SCB5.log and std.log)
Core Pan Genome (log file corepan.std.log)

Skipped analyses:
Virulence Factors (no log files)

corepan.std.log Escherichia_coli_SCB5.log Escherichia_coli_SCB5.std.log std.log

oschwengers commented 3 years ago

So step by step: