Ajc refactor - Githubissues

ambrosejcarr commented 6 years ago

This is a WIP PR to make cromwell-tools easier to develop, test, and use.

Major Changes:

Created two classes, CromwellAPI and CromwellAuth
CromwellAuth defines the different auth paths and yields a single container object that can be used for http requests.
CromwellAPI defines the (implemented) API calls and is directly called by the CLI (see _cli.py)
[Breaking] CromwellAPI.status now takes a single uuid as input and yields the status of a single workflow. I thought this should be changed because it is easy to loop over and doing so incurs no inefficiency. Implementing it in this way keeps it closer to the Cromwell API.

Minor Changes:

cromwell-tools is now an entrypoint instead of requiring an extra (untested) script. Code lives in _cli.py for easy testing.
endpoint urls are now defined in a single place (CromwellAPI) and are accessible so that they don't need to be re-defined in tests.

Need help: I've attempted to centralize all the code into a few classes. What I think this should allow is for direct testing of the classes by mocking the requests module. I'm not sure whether or not we need to define the object at the class level (see lines 22 & 23 in _cromwell_api.py). In any case, I haven't been able to get this to work over the weekend and could use some help if anyone is more familiar with mocks.

[ ] Added or updated tests
[ ] Updated documentation
[x] Applied style guidelines
[x] Considered generalizability beyond our own use case

Notes:

This PR contributes to Elastc ticket 324. When this gets merged, we will be able to add the rest of the Cromwell endpoints quite easily.
This could also be viewed as a contribution to the need for Cromwell "devtools".

ambrosejcarr commented 6 years ago

Thanks for the comments. The tests are all a mess and so I won't comment much on them. What I checked in doesn't work, and I was just trying out lots of stuff.

The issue I see with the tests is that there are some incompatibilities between the way they're written (unittest.TestCases) and the two things I really want:

Mocks
Parameterized tests (pytest)

That's why I was getting stuck.

ambrosejcarr commented 6 years ago

As discussed in person, this will not break any of our repos that are currently in production, because they use tagged versions. We've got all the time in the world to convert them over. :)

dshiga commented 6 years ago

Won't break things in production now, but leaves us unable to quickly switch to the latest cromwell-tools if needed for bug fixes, etc. I would prefer to wait on merging this until we have PRs ready in those other repos so they're not stuck.

ambrosejcarr commented 6 years ago

Works for me. I think some follow-up work is necessary to (1) port over the validate method and (2) implement testing.

We can use this as a feature branch and merge over when everything is ready.

rexwangcc commented 6 years ago

bump

ambrosejcarr commented 6 years ago

Since it came up on slack, there was a question about what this refactor buys us and I thought it was a good idea to leave my response here. Here are some thoughts:

Consistency in authentication -- all the different methods will hit the same Auth class and generate a single type of credential to work with cromwell.
Consistency between API and CLI by moving CLI code into the library. Right now there is CLI-only functionality
Testability. Little of the CLI-only code is tested, and tends to break (the addition of caas authentication broke a few things which are now fixed in master)
Extensibility. In its current form, the repo is difficult to add to because its unclear how to add Auth to new functions

More details on (2):

Ideally (at least in my philosophy) the library (python methods and classes) contain all the functionality of the repo.

The role of a CLI is just to expose that functionality to a user, and therefore should only pass parameters into library functions and return their outputs. Conversely, the library API should be free-standing and independent of the CLI. Basically you should be able to use either with the same commands and get the same result.

This has the advantage of only needing to test the library plus a simple no-op test for the CLI to make sure all argpase.add_argument() calls are legal. Things get weird, however, when extra functionality is in the CLI but not the library, and that's happening right now with authentication checking.

See e.g. this line: https://github.com/broadinstitute/cromwell-tools/blob/master/cromwell_tools/scripts/cromwell-tools#L9 (edited) similarly here: https://github.com/broadinstitute/cromwell-tools/blob/master/cromwell_tools/scripts/cromwell-tools#L64 (edited)

In contrast, other commands have auth being done in the API: https://github.com/broadinstitute/cromwell-tools/blob/master/cromwell_tools/cromwell_tools.py#L112 This is great because these functions work when you call them from the API. The ones in the above paragraph do not; they depend on the CLI for some of their functionality, which I think is a bit weird.

Finally, there are just a lot of auth commands -- when I was refactoring some functions took a caas key and other didn't. That may be fixed now, but the complexity of Auth here warrants an object, I think. We have: user + pass secrets file env variables caas key and now from #35: no-auth.

rexwangcc commented 6 years ago

bump^ As we have more functionalities relying on cromwell-tools, I really want to have features like Consistency in authentication and Extensibility. Any ideas about how we deal with this hanging PR? @kbergin

kbergin commented 6 years ago

Hi @rexwangcc I forget what was remaining on this effort. Do we have tickets that correspond to this @ambrosejcarr ?

kbergin commented 6 years ago

Looks like maybe it's waiting on tests per this ticket that I hadn't ported to new backlog yet: https://elastc.com/c/FppaG97L/295-cromwell-tools-testing

dshiga commented 6 years ago

I also wanted us to prepare PRs in our other repos that depend on cromwell-tools before merging this, so we would not have an extended period of time where the latest version would be unusable.

rexwangcc commented 6 years ago

@dshiga 's comments totally make sense to me. I just want to bump here and make sure we are tracking this. Since the more tools (for now, Lira, pipeline-tools, Falcon are all using cromwell-tools) we rely on cromwell-tools, the more code we write against its current API. If we keep pushing it back to refactor, I'm worrying that we will end up with either a BIG and expensive refactor among all of our repos or giving up this refactor ambrose contributed :(

kbergin commented 6 years ago

Excellent, made a new ticket to represent the remaining work. Depending on the level of churn caused by the efforts to get everything in the DCP in dev into integration, we may be able to pull this into next sprint as long as we also pull in some 10x/optimus work.

rexwangcc commented 6 years ago

Closing with https://github.com/broadinstitute/cromwell-tools/pull/43

broadinstitute / cromwell-tools

Ajc refactor #31