COMBINE-lab / GRASS

Graph-Regularized Annotation via Semi-Supervised learning

ZeroDivisionError #4

Open bennigoetz opened 7 years ago

bennigoetz commented 7 years ago

Howdy,

My latest attempt at running GRASS dies with a ZeroDivisionError: integer division or modulo by zero. Tail of STDOUT with Python traceback copied below. Any suggestions on avoiding this error, or working around it? I'd really like to try GRASS on the project I'm working on.

Thanks,

Benni Goetz
Bioinformatics Consulting Group
Genome Analysis and Sequencing Facility
University of Texas at Austin

Started iteration number: 4

combo_grass/junto.config Number of edges added/removed: 6489

Started iteration number: 5

combo_grass/junto.config
Traceback (most recent call last):
  File "/home1/01863/benni/.local/bin/grass", line 4, in <module>
    __import__('pkg_resources').run_script('grass==0.1.1', 'grass')
  File "/opt/apps/intel15/python/2.7.12/lib/python2.7/site-packages/pkg_resources/__init__.py", line 743, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/opt/apps/intel15/python/2.7.12/lib/python2.7/site-packages/pkg_resources/__init__.py", line 1498, in run_script
    exec(code, namespace, namespace)
  File "/home1/01863/benni/.local/lib/python2.7/site-packages/grass-0.1.1-py2.7.egg/EGG-INFO/scripts/grass", line 159, in <module>
    processInput()
  File "/home1/01863/benni/.local/lib/python2.7/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home1/01863/benni/.local/lib/python2.7/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home1/01863/benni/.local/lib/python2.7/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home1/01863/benni/.local/lib/python2.7/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home1/01863/benni/.local/lib/python2.7/site-packages/grass-0.1.1-py2.7.egg/EGG-INFO/scripts/grass", line 126, in processInput
    iterGrass.run(keys, finalLabelFile, juntoConfigFile, outdir)
  File "/home1/01863/benni/.local/lib/python2.7/site-packages/grass-0.1.1-py2.7.egg/grass/iterGrass.py", line 183, in run
    (avgNewWeight, temp) = addNewEdges(orgGraph, graph, contigToLabels, labelToContigs, ofile)
  File "/home1/01863/benni/.local/lib/python2.7/site-packages/grass-0.1.1-py2.7.egg/grass/iterGrass.py", line 131, in addNewEdges
    weightCalc /= changesMade
ZeroDivisionError: integer division or modulo by zero
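The last frame of the traceback divides an accumulated weight by `changesMade`, which is presumably zero when an iteration changes no edges. A minimal sketch of a guard for that kind of average follows. The names `safe_average`, `total_weight`, and `changes_made` are hypothetical, and this is not the actual GRASS or Grouper code:

```python
def safe_average(total_weight, changes_made):
    """Average weight per change, or 0.0 when no changes were made.

    Hypothetical helper for illustration only; it is not taken from
    the GRASS or Grouper sources.
    """
    if changes_made == 0:
        # Nothing was added or removed, so there is nothing to average.
        return 0.0
    # float() avoids integer floor division under Python 2.
    return float(total_weight) / changes_made
```

Returning 0.0 is one possible convention; a caller could equally treat a zero change count as the signal to stop iterating.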

laraib85 commented 7 years ago

Hey,

I would like to refer you to our tool, https://github.com/COMBINE-lab/grouper, which includes an updated version of GRASS and integrates it with the clustering module. The particular issue you describe here has also been resolved in it, along with several other enhancements. The input requirements are fairly similar. Let me know if that works for you.

bennigoetz commented 7 years ago

Hi,

Thanks for the quick reply and info. I heard that y'all were working on Grouper a few months ago, and am happy to see that it's been released into the wild. I've been out with a cold the last two days, and am just getting to this now. I just started a Grouper run, and it seems to be working well so far (at least for the last 5 minutes).

As an aside, I mostly use an HPC cluster at TACC here at UT. It's very nice having such a big cluster available, but jobs have a time limit of 48 hours. My first attempt at running GRASS on my current project was cancelled at 48 hours before GRASS finished. I have another machine available that I tried GRASS on a second time, where I got the zero division error. (Grouper is running on this other machine.) This is all a preface for a suggestion for later updates to Grouper. It would be really nice to have checkpointing built into the software. I don't understand exactly what's going on under the hood, but the iterations in the labelling step seem like a place where checkpointing could be built in. Just a suggestion.
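The checkpointing idea can be sketched generically: save the iteration counter and the loop state after every iteration, and resume from the saved file on restart. This is only an illustration under stated assumptions. The functions `save_checkpoint`, `load_checkpoint`, and `run_iterations` are hypothetical, the `step` callback stands in for one labelling iteration, and nothing here reflects how Grouper is actually structured. It targets Python 3 (`os.replace`).

```python
import os
import pickle
import tempfile


def save_checkpoint(path, iteration, state):
    """Write the completed iteration count and state to path atomically."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        pickle.dump({"iteration": iteration, "state": state}, handle)
    # Rename over the old file so a killed job never leaves a partial checkpoint.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return (completed_iterations, state), or (0, None) if no checkpoint exists."""
    if not os.path.exists(path):
        return 0, None
    with open(path, "rb") as handle:
        data = pickle.load(handle)
    return data["iteration"], data["state"]


def run_iterations(path, max_iterations, step, initial_state):
    """Run step() up to max_iterations times, resuming from path if present."""
    start, state = load_checkpoint(path)
    if state is None:
        state = initial_state
    for iteration in range(start, max_iterations):
        state = step(state)
        save_checkpoint(path, iteration + 1, state)
    return state
```

With a scheme like this, a job cancelled at a 48-hour limit could be resubmitted with the same checkpoint path and continue from the last completed iteration instead of starting over.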

Thanks again for the help!

Benni

bennigoetz commented 7 years ago

I meant to add: will a paper come out describing how Grouper works? I read the RapClust and GRASS papers/manuscripts, as well as the referenced adsorption/labelling paper, so I have a rough idea of what's going on. But I'd be interested in the improvements, especially what the new orphan and mincut options mean.

Benni

laraib85 commented 7 years ago

We have made updates in Grouper that improve the time complexity, especially on sparsely labeled datasets. However, if Grouper takes too long to run on your data, please do let me know. We would like to investigate the cause and work on improving it.

After each iteration, Grouper prints the number of edges added in the form: "Size diff: ". For now, you can use this as a check: the number should converge towards zero as the iterations proceed.
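One way to apply this check to a saved log is sketched below. It assumes each report looks like `Size diff: <integer>` on its own line, which is a guess at the exact format; the helper names `size_diffs` and `is_shrinking` are hypothetical.

```python
import re

# Assumed log format: the text "Size diff:" followed by an integer.
SIZE_DIFF = re.compile(r"Size diff:\s*(-?\d+)")


def size_diffs(log_text):
    """Extract every reported size difference, in order of appearance."""
    return [int(match.group(1)) for match in SIZE_DIFF.finditer(log_text)]


def is_shrinking(diffs):
    """True if each value is no larger in magnitude than the one before it."""
    return all(abs(b) <= abs(a) for a, b in zip(diffs, diffs[1:]))


# Made-up log excerpt for illustration; the numbers are not real output.
example_log = "Size diff: 500\nSize diff: 120\nSize diff: 0\n"
print(size_diffs(example_log), is_shrinking(example_log and size_diffs(example_log)))
```

A strict step-by-step decrease may be too strong a requirement in practice; checking only that the latest value is well below the first would be a looser alternative.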