galaxyproject / usegalaxy-playbook

Ansible Playbook for usegalaxy.org
Academic Free License v3.0

Hifive v 0.1.0 failing for missing dependencies on roundup cluster (default for tool) #190

Closed jennaj closed 5 years ago

jennaj commented 5 years ago

Tool: hifive manipulate, analyze, and plot HiC and 5C chromatin interaction data (Galaxy Version 0.1.0)

MTS: https://toolshed.g2.bx.psu.edu/view/sauria/hifive/cc95aa05f643 (same version as at org and eu, so might be current unless there was a change without a version bump)

Dev repo: https://github.com/bxlab/hifive

Problem: Fails on the Roundup cluster immediately for missing dependency reasons (can vary depending on the form options selected). Works fine on Stampede and Jetstream.

ping @natefoo and @msauria -- issue seems new, first reported today 2/20/19

Workaround for End-Users: At the bottom of the tool form there is an option to choose the cluster to run the job at. The default cluster is Roundup, and jobs will fail immediately. Stampede also fails, but at a later time during the job processing and possibly not for all use cases. Choose Jetstream instead, with the warning that this is not yet entirely confirmed to work for all use cases.

Workaround Graphics:

Scroll to the bottom of the tool form to find:

job-resource-selector-default

Click to open the expanded menu:

job-resource-selector-menu

Choose one of these alternative clusters "TACC Stampede (beta)" or "Jetstream" instead if you want to test it out:

job-resource-selector-options
jennaj commented 5 years ago

Update: Failing at Stampede with a new error. Not sure of the root cause but looks like another dependency problem, one hit further downstream in the job processing:

End users: avoid Stampede too. Tests for Jetstream are still running to completion. Will update status when done.

(sorry for mix-ups!)

Fatal error: Exit code 1 ()
Traceback (most recent call last):
  File "/work/galaxy/main/deps/_conda/envs/mulled-v1-b4daa132dd86fc2cec92b9a62952099ce7df77ba469589a81d8d87001131e606/bin/hifive", line 849, in <module>
.... repeats many times then this ....
Traceback (most recent call last):
  File "/work/galaxy/main/deps/_conda/envs/mulled-v1-b4daa132dd86fc2cec92b9a62952099ce7df77ba469589a81d8d87001131e606/bin/hifive", line 849, in <module>
    main()

.... then repeats that block many times and ends with this line repeated many times ....

KeyError: "Unable to open object (object 'cis_data' doesn't exist)"
jennaj commented 5 years ago

Searched for "cis_data" and cannot tell if that could be produced by a usage error, content issue, or some problem with the tool wrapper. Help please @msauria -- think I've done all I can on my own to troubleshoot this. Thanks!

https://github.com/bxlab/hifive/search?q=cis_data&unscoped_q=cis_data

jennaj commented 5 years ago

Update: @natefoo is fixing the missing dependencies on the Roundup and Stampede clusters

natefoo commented 5 years ago

The hifive dependencies were not installed in /cvmfs/main.galaxyproject.org/deps/_conda. Pulsar on Jetstream and Stampede is able to install missing deps into private conda dirs, which is why they are not having the same problem.

I have installed them for Roundup. Stampede is a separate issue; that looks like a tool problem to me, but I don't know.

jennaj commented 5 years ago

Thanks -- test rerunning, will close this out once they complete successfully (or hit that tool problem! -- if so, I'll open a ticket for that against the wrapper repo and create a different "update-tracking" ticket here)

jennaj commented 5 years ago

Thanks -- dependencies fixed across clusters. Going to close this out.

The Hifive repo has a closed issue about the KeyError: "Unable to open object (object 'cis_data' doesn't exist)" problem here: https://github.com/bxlab/hifive/issues/12

My test data didn't have a chromosome name mismatch problem, but the input bed did have a different problem (specifically, was a BED3 dataset, so no strand info). Fixed that and am rerunning just to see what pops out -- but that was definitely an input content issue, not a wrapper problem.

The tool form help clearly states to use a bed that includes strand, so the only usability improvement recommendation would be to have the tool check the input for obvious content issues and fail at that point, instead of later after potentially long partial data processing, along with some sort of message about what the actual versus expected input was. @msauria can decide, no ticket created. I've only seen one report of an input problem like this -- so maybe it doesn't come up very often.
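For illustration only, an up-front check of that kind could look roughly like the sketch below. It is not part of hifive or its Galaxy wrapper. The function name `check_bed6` and the message wording are hypothetical, and it assumes whitespace-delimited BED lines where the only valid strand values are `+`, `-`, and `.`.

```python
def check_bed6(lines):
    """Return a list of problems found in BED lines (empty list means OK).

    Hypothetical helper: assumes whitespace-delimited BED and that
    valid strand values are '+', '-' or '.'.
    """
    problems = []
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines and common header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 6:
            problems.append(
                f"line {lineno}: expected 6 columns, found {len(fields)}"
            )
            continue
        chrom, start, end, name, score, strand = fields[:6]
        # Coordinates must be non-negative integers.
        if not (start.isdigit() and end.isdigit()):
            problems.append(f"line {lineno}: start/end are not integers")
        # Strand must be one of the three accepted symbols.
        if strand not in {"+", "-", "."}:
            problems.append(f"line {lineno}: unexpected strand {strand!r}")
    return problems
```

Calling `check_bed6(open("input.bed"))` before launching a long job would report a BED3 input on its first data line, with the actual versus expected column count in the message.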

If anyone else runs into this error, check for chromosome mismatches AND make sure that any bed inputs include strand. When the strand is unknown or does not apply, the default value of . can be used. Important: All six bed columns are required. Example lines from the tool's test_fend.bed dataset:

BED6 data: chrom, start, end, name, score (default=0), strand (default=.).

chr1    16007   16013   HindIII 0   .
chr1    24571   24577   HindIII 0   .
chr1    27981   27987   HindIII 0   .
chr1    30429   30435   HindIII 0   .
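
If an input is BED3 (or otherwise short of six columns), the missing columns can be filled with defaults before running the tool. The sketch below is one hedged way to do that in Python. The function name `pad_to_bed6` is hypothetical, the score and strand defaults follow the ones listed above, and using `.` as a placeholder name is an assumption (replace it with something meaningful, such as the restriction enzyme name, where that applies).

```python
def pad_to_bed6(line, default_name="."):
    """Pad a BED line with 3-5 columns out to 6 tab-separated columns.

    Hypothetical helper: the name placeholder is an assumption;
    score defaults to 0 and strand to '.'.
    """
    fields = line.rstrip("\n").split()
    if len(fields) < 3:
        raise ValueError("a BED line needs at least chrom, start and end")
    # Defaults for columns 4 (name), 5 (score) and 6 (strand).
    defaults = [default_name, "0", "."]
    n = len(fields)
    if n < 6:
        # Append only the defaults for the columns that are missing.
        fields = fields + defaults[n - 3:]
    return "\t".join(fields)
```

For example, `pad_to_bed6("chr1\t16007\t16013", default_name="HindIII")` yields a line with the same six columns as the first example line above.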

More help about Hifive can be found here (including Tutorials): http://hifive.docs.taylorlab.org/en/latest/index.html