broadinstitute / cromwell

Scientific workflow engine designed for simplicity & scalability. Trivially transition between one off use cases to massive scale production environments
http://cromwell.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

Test SGE using Centaur #1180

Open ruchim opened 8 years ago

ruchim commented 8 years ago

Khalid mentioned that we need to incorporate SGE testing into Centaur soon and some thoughts that came up from today's tech talk:

ruchim commented 8 years ago

@kcibul Unsure what milestone this would belong in, but sending this over to you for prioritization!

katevoss commented 7 years ago

@ruchim @kcibul has this happened yet? If not, can you explain what this issue is for in more depth?

geoffjentry commented 7 years ago

It has not. It requires someone spending the time to figure out how to set up an ephemeral SGE cluster in travis for each test run.

katevoss commented 7 years ago

@geoffjentry sounds like a large engineering effort, would you agree? Are there a lot of users on SGE? How often is there a problem with Cromwell on SGE?

geoffjentry commented 7 years ago

I think @kshakir actually looked into it once, I don't know what would be involved. It's probably more a "we don't know how" than "if we did, it'd be a lot of work"

katevoss commented 7 years ago

@abaumann from Field Eng and @vdauwera from comms perspectives, do you have any idea of how many users are running Cromwell on SGE? Any particularly large or significant users?

geoffjentry commented 7 years ago

Also of note - at the time (probably) we had an "SGE backend". Now we have the config backend. So we could do the same with e.g. LSF. Outside of Broad we probably have more LSF users than SGE users. Inside Broad it'd be nearly 100% SGE. OTOH I don't know how well our SGE stuff works with UGER so perhaps not.

kshakir commented 7 years ago

> Are there a lot of users on SGE?

I would also ask the methods team, say ldgauthier or LeeTL1220.

> It's probably more a "we don't know how" than "if we did, it'd be a lot of work"

Yep, we are firmly in the camp of "we don't know how", with a heap of "we never rtfm'ed".

There are a number of examples out there, and folks probably willing to help us, we just haven't prioritized this ticket. I'd estimate Travis/Dockering Grid Engine as medium effort, as others have already done it.

Example links for the inspired:

Speaking of Sun Microsystems, SGE is dead, as well as its successors OGE and an attempted-then-abandoned FOSS fork OGS. Long live SoGE, and UGE. It's fine to use "SGE", just like we use the term "JES", but we'll likely need to target specifically UGE for Broadies and/or SoGE for the rest of the Grid Engine world.

> Outside of Broad we probably have more LSF users than SGE users.

True, there are lots of popular grid schedulers. I'd be more than happy to run yet-another-travis-job for whatever scheduler, if someone contribs the docker image / setup script like we have for Funnel.

> I don't know how well our SGE stuff works with UGER so perhaps not.

Cromwell works great on BITS' newer UGE cluster named "UGER". I use cromwell frequently with concurrent-job-limit set to 900 due to our resource caps.

TL;DR Getting grid engine test support set up for a Broad-like environment is possible, it just hasn't been a priority.

katevoss commented 7 years ago

@ldgauthier and @LeeTL1220 do you know how many users use Cromwell with SGE?

As an SGE user, I want the SGE config to be tested in Centaur, so that I can avoid regressions.

ldgauthier commented 7 years ago

In the MPG community, nobody's using Cromwell right now. I haven't used it with SGE myself. I know Lee does use the SGE backend himself. Not sure about CGA users, but he'd know.

LeeTL1220 commented 7 years ago

In DSP methods, quite a few use Cromwell+SGE. I'm not the only one.

Not really sure whether CGA uses Cromwell+SGE, but I highly doubt it.

geoffjentry commented 7 years ago

There are multiple groups around the Broad who use this combo as well, or at least used to.