galaxyproject / galaxy

Data intensive science for everyone.
https://galaxyproject.org

Running workflow/tool on large collection times out #12177

Open anilthanki opened 3 years ago

anilthanki commented 3 years ago

Running the tool Concatenate datasets tail-to-head (cat) (Galaxy Version 0.1.0), or a one-step workflow containing that tool, on a large collection (191K elements) keeps timing out.

P.S. The history contains many other large collections as well.

Moving the collection to another history is not an option, as that fails as well.

Thanks

Anil

mvdbeek commented 3 years ago

Thanks @anilthanki, I assume it's the submission that times out, i.e. when you click the Execute button? While the user experience is unfortunate, you can navigate away from the page and the jobs will still be created and run in the background. https://github.com/galaxyproject/galaxy/issues/11721 will address this (Creating jobs (especially mapped-over jobs). Will need access to tools.)

anilthanki commented 3 years ago

Hi @mvdbeek

Thanks for the quick response. My apologies, the tool was Concatenate datasets tail-to-head (cat) (Galaxy Version 0.1.0), not ete_classifier. I have updated my previous comment.

mvdbeek commented 3 years ago

So does the timeout occur when you click on the tool, when you try to select a collection, or when you click the Execute button?

nsoranzo commented 3 years ago

I think that when trying to run the tool, the request times out in the UI but continues on the backend; the uWSGI process is then killed for using too much memory (we have reload-on-rss: 4096 set in config/galaxy.yml) before it completes the job submission.

anilthanki commented 3 years ago

Thanks @nsoranzo, I had no idea what the backend does in that case.

When running the workflow, it tries to load the workflow and then shows an error saying it timed out.

mvdbeek commented 3 years ago

Ah, yeah, we don't split scheduling for tool runs as we do for workflows.

> In the case of running workflow, it will try to load workflow and then error saying Times out

If you see this again, can you take a screenshot? Do you know whether this is the simplified workflow run form or the old one? (The old one lets you send results to a new history and change parameters in the individual steps.)

anilthanki commented 3 years ago

Hi @mvdbeek

I am attaching the screenshot of the error when I try to load the workflow.

Screenshot from 2021-06-21 11-58-02

nsoranzo commented 3 years ago

As a workaround, we were able to invoke the workflow via BioBlend.
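For anyone hitting the same timeout, here is a minimal sketch of what such a BioBlend invocation might look like. It is not the exact script that was used here. The environment variable names, the helper function names, and the assumption that the collection feeds workflow input step "0" are all illustrative; `GalaxyInstance` and `workflows.invoke_workflow` are BioBlend calls.

```python
import os


def build_collection_input(hdca_id, step_index="0"):
    """Map one workflow input step to an existing history dataset collection.

    Assumption: the workflow's collection input is step "0"; adjust
    step_index to match the actual workflow.
    """
    return {step_index: {"id": hdca_id, "src": "hdca"}}


def invoke_on_collection(gi, workflow_id, history_id, hdca_id):
    """Invoke the workflow through the API, bypassing the workflow run form.

    `gi` is expected to be a bioblend.galaxy.GalaxyInstance.
    """
    return gi.workflows.invoke_workflow(
        workflow_id,
        inputs=build_collection_input(hdca_id),
        history_id=history_id,
    )


def connect_from_env():
    """Create a GalaxyInstance from hypothetical environment variables."""
    # Imported lazily so the helpers above work without BioBlend installed.
    from bioblend.galaxy import GalaxyInstance

    return GalaxyInstance(
        url=os.environ["GALAXY_URL"],      # hypothetical variable name
        key=os.environ["GALAXY_API_KEY"],  # hypothetical variable name
    )
```

To use it, call `invoke_on_collection(connect_from_env(), workflow_id, history_id, hdca_id)` with the encoded IDs of the workflow, the target history, and the collection. The call returns the invocation record, whose ID can be used to follow scheduling afterwards.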

mvdbeek commented 3 years ago

If you open the workflow run form on an empty history, does Galaxy use the simplified workflow run form or the old one?

nsoranzo commented 3 years ago

> If you open the workflow run form on an empty history, does Galaxy use the simplified workflow run form or the old one?

Simplified:

Screenshot 2021-06-21 at 17-16-44 Galaxy