malariagen / malariagen-data-python

Analyse MalariaGEN data from Python
https://malariagen.github.io/malariagen-data-python/latest/
MIT License

Authenticate the user if accessing GCS #513

Closed alimanfoo closed 5 months ago

alimanfoo commented 5 months ago

This PR modifies the setup of the GCS file system so that the current user is authenticated automatically via `google.auth.default()`, which uses the application default credentials.

If the user is running on Colab, `google.colab.auth.authenticate_user()` is run first to trigger Colab's own mechanism for ensuring the current user is authenticated. This opens some windows asking the user to confirm they are happy to proceed, but otherwise the user doesn't have to do anything.

If the user is not on Colab, they must run `gcloud auth application-default login` from the command line before using malariagen-data, so that application default credentials are available.
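The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names (`is_gcs_url`, `running_on_colab`, `get_gcs_credentials`, `open_gcs_filesystem`) are hypothetical, the Colab check via `sys.modules` is a common heuristic rather than an official API, and using `gcsfs` as the file system implementation is an assumption.

```python
import sys

# URL prefixes treated as Google Cloud Storage (assumption for this sketch).
GCS_PREFIXES = ("gs://", "gcs://")


def is_gcs_url(url: str) -> bool:
    """Return True if the URL points at Google Cloud Storage."""
    return url.startswith(GCS_PREFIXES)


def running_on_colab() -> bool:
    """Heuristic: the google.colab package is preloaded in Colab runtimes."""
    return "google.colab" in sys.modules


def get_gcs_credentials():
    """Obtain application default credentials, going via Colab if present."""
    if running_on_colab():
        # Triggers Colab's own consent flow for the current user.
        from google.colab import auth as colab_auth

        colab_auth.authenticate_user()

    # Outside Colab this relies on the user having already run
    # `gcloud auth application-default login`.
    import google.auth

    credentials, _project = google.auth.default()
    return credentials


def open_gcs_filesystem(url: str):
    """Hypothetical helper: build an authenticated gcsfs file system."""
    if not is_gcs_url(url):
        raise ValueError(f"not a GCS URL: {url!r}")
    import gcsfs  # assumed file system backend

    return gcsfs.GCSFileSystem(token=get_gcs_credentials())
```

The third-party imports are deferred into the functions so that nothing is imported or authenticated unless a GCS URL is actually opened. If no application default credentials are available, `google.auth.default()` raises `google.auth.exceptions.DefaultCredentialsError`, which is the point at which a non-Colab user would need to run the `gcloud` login step.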

codecov[bot] commented 5 months ago

Codecov Report

Attention: Patch coverage is 69.23077%, with 4 lines in your changes missing coverage. Please review.

:exclamation: No coverage uploaded for pull request base (master@fac3469).

| Files | Patch % | Lines |
|---|---|---|
| malariagen_data/anoph/base.py | 60.00% | 4 Missing :warning: |

Additional details and impacted files

```diff
@@            Coverage Diff            @@
##             master     #513   +/-   ##
=========================================
  Coverage          ?   98.79%
=========================================
  Files             ?       38
  Lines             ?     3664
  Branches          ?        0
=========================================
  Hits              ?     3620
  Misses            ?       44
  Partials          ?        0
```


alimanfoo commented 5 months ago

Hi @leehart, @ahernank, @cclarkson, this PR is ready for review. Happy to talk through it in more detail. Once this is merged and included in a new release of malariagen_data, anyone using malariagen_data inside Google Colab will be authenticated automatically. Anyone using it outside of Google Colab will need to run an authentication step first (`gcloud auth application-default login`).

This is something we can implement before making any changes to access policies on the actual storage. That is, we can get users used to logging in first and make sure the machinery is in place, so that we can then change access policies on storage if needed.

alimanfoo commented 5 months ago

Would be good to add some documentation about authentication, both for users and for developers.