ohsu-comp-bio / euler

Authentication (authN) and high-level Authorization (authZ) for BMEG, Dirac and Search. Includes Swift object store.
MIT License

T16: client: ccc_client, dirac or swift? #4

Open bwalsh opened 7 years ago

bwalsh commented 7 years ago

This sprint has a task that this issue addresses:

Task: "T16:There are three code bases that today address part of this story - openstack-swift, ccc_client & dirac. "

Done when: "A decision is made to enhance/fork/combine one of these client code bases.   Team consensus. POC?"

I've looked at the code bases:

Propose moving dirac's functionality into euler's API component, POST /v0/files.


@prismofeverything @kellrott @k1643 @mayfielg

Let me know your thoughts.

grmayfie commented 7 years ago

Based on comments in the Swift PR and my understanding of the Euler API, I would argue that the API is deprecating dirac, not moving or utilizing it. Correct me if I'm wrong: the Euler /v0/files endpoint is designed to notice when a file is uploaded to swift via a plugin and automatically send a message to kafka, therefore acting as the upload server in Brian K's diagram (in addition to its logon/logout duties, which are not on Brian K's diagram). I don't think I agree that this minimizes the amount of fresh code we write, since it leaves us without the user-friendly wrapper that dirac already provides (which I mentioned in comments on the PR) and which should exist. If we deprecate dirac as you propose, by using the swift client directly and moving the kafka messaging to the Euler API, we will still need to write that wrapper at some point.
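For concreteness, my reading of that endpoint-as-middleman design amounts to something like the sketch below. This assumes Flask and kafka-python purely for illustration; the topic name and payload fields are invented, not the actual Euler implementation.

```python
# Rough sketch only: /v0/files receives a notification that an object landed
# in swift and forwards it to kafka, so clients never talk to kafka directly.
# Assumes Flask and kafka-python; topic and field names are hypothetical.
import json

from flask import Flask, jsonify, request
from kafka import KafkaProducer

app = Flask(__name__)
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

@app.route("/v0/files", methods=["POST"])
def register_file():
    # The caller (a swift plugin or a client) reports the uploaded object.
    meta = request.get_json(force=True)
    # Publish the event; kafka stays safely behind the endpoint.
    producer.send("file-events", {
        "bucket": meta.get("bucket"),
        "object": meta.get("object"),
        "uploader": meta.get("uploader"),
    })
    producer.flush()
    return jsonify({"status": "queued"}), 201
```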

Instead, I would propose a slightly different method. I think we all agree that using the swift client is desirable, based on the email thread:

Kyle’s teams face an analogous problem when giving file URLs to TESS. Their decision was to give the client the responsibility of knowing the storage URLs and whether the TESS workers would have access rights. Kyle makes a good point that the large sizes of the files means that there are unique requirements on the upload that have already been solved by the S3 client. So I think that letting the client talk to S3 may be best for our case.

First, we need to find out if the python swift module library has the same benefits as the swift command line client. If it doesn't, well then the point is moot, and I would support moving forward with BW's design plan outlined above. However, if they are both optimized as desired, then I think we are better off continuing with dirac as the command line tool and altering it to talk to the Euler API /v0/files endpoint, instead of kafka directly. Swift would still be cleanly accessible, but kafka would be safe behind an endpoint.
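If the python library does pan out, the change to dirac is fairly small. Here is a minimal sketch, assuming python-swiftclient's SwiftService (segment_size is the large-object behaviour we would need to verify) and requests; the Euler URL and payload fields are placeholders I made up:

```python
# Sketch: dirac keeps the CLI role, uploads via python-swiftclient, then
# registers the file with the Euler API instead of writing to kafka itself.
# Assumes python-swiftclient and requests; URL and fields are hypothetical.
import requests
from swiftclient.service import SwiftService, SwiftUploadObject

EULER_FILES_URL = "http://localhost:8000/v0/files"  # placeholder

def upload_and_register(container, path, object_name):
    # segment_size turns on segmented (large-object) upload, the behaviour
    # the swift command line client already provides.
    with SwiftService(options={"segment_size": 1024 * 1024 * 1024}) as swift:
        for result in swift.upload(
            container, [SwiftUploadObject(path, object_name=object_name)]
        ):
            if not result["success"]:
                raise RuntimeError(result.get("error"))

    # Tell the Euler API about the new object; it handles the kafka message.
    resp = requests.post(
        EULER_FILES_URL, json={"bucket": container, "object": object_name}
    )
    resp.raise_for_status()
```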

Both suggestions agree on using the /v0/files endpoint as a middleman for kafka; they differ on the command line tool.

Thoughts?

bwalsh commented 7 years ago

good conversation. thanks.

moving vs deprecating.

"First, we need to find out if the python swift module library has the same benefits as the swift command line client."

The user will need a variety of operations against the object store, e.g. upload, download, list, delete.

Dirac addresses upload. The swift library does indeed have the other capabilities. However, they would all need to be wrapped in code that manages parameters and environment variables, makes the setup calls for service discovery, handles error checking, etc. That's quite a bit of work to accomplish.
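To illustrate the kind of plumbing I mean, even a single wrapped operation ends up owning something like the sketch below. This is hypothetical and only shows the shape of the work; the OS_* names are the usual swift/keystone conventions and the error handling is a placeholder.

```python
# Sketch of wrapper boilerplate around python-swiftclient: environment
# variables, auth setup, and error translation all live in our code.
import os

from swiftclient import client as swift_client
from swiftclient.exceptions import ClientException

REQUIRED = ("OS_AUTH_URL", "OS_USERNAME", "OS_PASSWORD", "OS_TENANT_NAME")

def connect_from_env():
    # Parameter / environment management the library leaves to the caller.
    missing = [v for v in REQUIRED if v not in os.environ]
    if missing:
        raise EnvironmentError("missing swift settings: %s" % ", ".join(missing))
    return swift_client.Connection(
        authurl=os.environ["OS_AUTH_URL"],
        user=os.environ["OS_USERNAME"],
        key=os.environ["OS_PASSWORD"],
        tenant_name=os.environ["OS_TENANT_NAME"],
        auth_version="2",
    )

def list_container(name):
    # Error checking: turn swift exceptions into something dirac can report.
    try:
        _headers, objects = connect_from_env().get_container(name)
        return objects
    except ClientException as exc:
        raise RuntimeError("swift error listing %s: %s" % (name, exc))
```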

thanks for listening... -b