HumanCellAtlas / dcp-cli

DEPRECATED - HCA Data Coordination Platform Command Line Interface
https://hca.readthedocs.io/
MIT License

Problems accessing an alternate DSS from user scripts or unit tests #170

Open mikebaumann opened 6 years ago

mikebaumann commented 6 years ago

Setting the HCA Python bindings to an alternate DSS as currently documented:

from hca import HCAConfig
from hca.dss import DSSClient

config = HCAConfig()
config['DSSClient'].swagger_url = "https://dss.example.com/v1/swagger.json"
client = DSSClient(config=config)

works in contexts in which there is no preexisting HCA configuration, as is typically the case when the HCA python bindings are used within a deployed Lambda.

Problems arise, however, when setting the HCA python bindings to an alternate DSS in a user context, as when running scripts or unit tests that reference a specific DSS, and there is preexisting configuration for the DSS. Depending on the specific use case, it may be desirable to use Google user credentials or GOOGLE_APPLICATION_CREDENTIALS, yet the issue being raised here applies to both cases.

Here is a sample scenario:

  1. In this scenario, there is a preexisting HCA configuration file, ~/.config/hca/config.json. In this example, config.json has the default values that would be populated by running hca dss login and selecting the user's credentials (or, similarly, by running any hca command that requires authentication with GOOGLE_APPLICATION_CREDENTIALS configured, in which case the oauth2_token value shown below would not be present):
    {
    "log_level": "INFO",
    "client_id": "foo",
    "DSSClient": {
        "swagger_url": "https://dss.data.humancellatlas.org/v1/swagger.json"
    },
    "application_secrets": {
        "installed": {
            "client_id": "803155065464-8hgbqc16nve9hm78lqrvfbcpv5oh275q.apps.googleusercontent.com",
            "project_id": "hca-dcp-production",
            "auth_uri": "https://accounts.google.com/o/oauth2/auth",
            "token_uri": "https://accounts.google.com/o/oauth2/token",
            "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
            "client_secret": "YuUTUJxq55OhJsuG9AM1zFuJ",
            "redirect_uris": [
                "urn:ietf:wg:oauth:2.0:oob",
                "http://localhost"
            ]
        }
    },
    "oauth2_token": {
        "access_token": "ya29.Glv4BfRxF62nwBcyDAKrichVTS5vEcSU1IY8yJzKOOSM-n8noYpBaQePPbBb9D_CynwCvKl7Gr5Fcja_LEsSCdYvNZqG9wI-Nq7WVEY3hkBazvUUh7G2DWnqeM7x",
        "refresh_token": "1/iQ3kGl_IQluomKMkz0eLoy-uvhjMNqqpdwO7gXqCLEg",
        "id_token": "eyJhbGciOiJSUzI1NiIsImtpZCI6ImJhNGFlYWU4YjIwOGFkOWFlMTJiNjYxMDg2NWY2Mzk2MTI4N2I2ZDYifQ.eyJhenAiOiI4MDMxNTUwNjU0NjQtOGhnYnFjMTZudmU5aG03OGxxcnZmYmNwdjVvaDI3NXEuYXBwcy5nb29nbGV1c2VyY29udGVudC5jb20iLCJhdWQiOiI4MDMxNTUwNjU0NjQtOGhnYnFjMTZudmU5aG03OGxxcnZmYmNwdjVvaDI3NXEuYXBwcy5nb29nbGV1c2VyY29udGVudC5jb20iLCJzdWIiOiIxMTQyNDM4NDUxNjIxMDgxMTMxNjUiLCJoZCI6InVjc2MuZWR1IiwiZW1haWwiOiJtYmF1bWFubkB1Y3NjLmVkdSIsImVtYWlsX3ZlcmlmaWVkIjp0cnVlLCJhdF9oYXNoIjoib21faXBnMWRTc3phLXBtRUZVRlEtZyIsImV4cCI6MTUzNDIwMjI2MiwiaXNzIjoiYWNjb3VudHMuZ29vZ2xlLmNvbSIsImlhdCI6MTUzNDE5ODY2Mn0.LTOthTa1D2oUhuRjpP4zCflRtJPwFswDJcjqQno9UH2j2_5kQTuwaxu72TTvBoD3NjcQPzL8OFqbQfWctBOxVW_aEuflh-9dYZoG0cl2r46caklq4QPNKI9krBMufmtp9HQFb7Lyc0Ar41o2eegrx_qTN5v_bSleDojuPCMDrET7tzCGtUlXEv_NEhn4T18MiI_OndQx9u9iqFs8Zmjjawef6VC2DCo8Sc--Ktq3R8FTbs6LDvJ8ciQpE3KVzFOBwbfjIY5R5_TAfjl52jr2H2Wwh-ZLtBqv_78ZUtCl1ySQu2fgkSeSOwVBPs-njRWq6_2NEwGkbTdgzGF6MdPGHA",
        "expires_at": "-1",
        "token_type": "Bearer"
    }
    }
  2. To run a script or unit test with an alternate DSS, the script or test performs the currently specified initialization code:
    config = HCAConfig()
    config['DSSClient'].swagger_url = "https://dss.example.com/v1/swagger.json"
    client = DSSClient(config=config)

    Although not obvious from the code above, this implicitly loads the existing HCA configuration (including application_secrets and, if present, oauth2_token) and then sets a host value that is incompatible with those existing values. The result is the following inconsistent configuration, in which the host has changed but application_secrets and oauth2_token have not:

    {
    "log_level": "INFO",
    "client_id": "foo",
    "DSSClient": {
        "swagger_url": "https://dss.example.com/v1/swagger.json"
    },
    "application_secrets": {
        "installed": {
            "client_id": "803155065464-8hgbqc16nve9hm78lqrvfbcpv5oh275q.apps.googleusercontent.com",
            "project_id": "hca-dcp-production",
            "auth_uri": "https://accounts.google.com/o/oauth2/auth",
            "token_uri": "https://accounts.google.com/o/oauth2/token",
            "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
            "client_secret": "YuUTUJxq55OhJsuG9AM1zFuJ",
            "redirect_uris": [
                "urn:ietf:wg:oauth:2.0:oob",
                "http://localhost"
            ]
        }
    },
    "oauth2_token": {
        "access_token": "ya29.Glv4BfRxF62nwBcyDAKrichVTS5vEcSU1IY8yJzKOOSM-n8noYpBaQePPbBb9D_CynwCvKl7Gr5Fcja_LEsSCdYvNZqG9wI-Nq7WVEY3hkBazvUUh7G2DWnqeM7x",
        "refresh_token": "1/iQ3kGl_IQluomKMkz0eLoy-uvhjMNqqpdwO7gXqCLEg",
        "id_token": "eyJhbGciOiJSUzI1NiIsImtpZCI6ImJhNGFlYWU4YjIwOGFkOWFlMTJiNjYxMDg2NWY2Mzk2MTI4N2I2ZDYifQ.eyJhenAiOiI4MDMxNTUwNjU0NjQtOGhnYnFjMTZudmU5aG03OGxxcnZmYmNwdjVvaDI3NXEuYXBwcy5nb29nbGV1c2VyY29udGVudC5jb20iLCJhdWQiOiI4MDMxNTUwNjU0NjQtOGhnYnFjMTZudmU5aG03OGxxcnZmYmNwdjVvaDI3NXEuYXBwcy5nb29nbGV1c2VyY29udGVudC5jb20iLCJzdWIiOiIxMTQyNDM4NDUxNjIxMDgxMTMxNjUiLCJoZCI6InVjc2MuZWR1IiwiZW1haWwiOiJtYmF1bWFubkB1Y3NjLmVkdSIsImVtYWlsX3ZlcmlmaWVkIjp0cnVlLCJhdF9oYXNoIjoib21faXBnMWRTc3phLXBtRUZVRlEtZyIsImV4cCI6MTUzNDIwMjI2MiwiaXNzIjoiYWNjb3VudHMuZ29vZ2xlLmNvbSIsImlhdCI6MTUzNDE5ODY2Mn0.LTOthTa1D2oUhuRjpP4zCflRtJPwFswDJcjqQno9UH2j2_5kQTuwaxu72TTvBoD3NjcQPzL8OFqbQfWctBOxVW_aEuflh-9dYZoG0cl2r46caklq4QPNKI9krBMufmtp9HQFb7Lyc0Ar41o2eegrx_qTN5v_bSleDojuPCMDrET7tzCGtUlXEv_NEhn4T18MiI_OndQx9u9iqFs8Zmjjawef6VC2DCo8Sc--Ktq3R8FTbs6LDvJ8ciQpE3KVzFOBwbfjIY5R5_TAfjl52jr2H2Wwh-ZLtBqv_78ZUtCl1ySQu2fgkSeSOwVBPs-njRWq6_2NEwGkbTdgzGF6MdPGHA",
        "expires_at": "-1",
        "token_type": "Bearer"
    }
    }

    In addition, with the currently specified initialization code, this inconsistent configuration is persisted to the current configuration file (~/.config/hca/config.json, or the file referenced by HCA_CONFIG_FILE) when the Python program exits.

If using Google user credentials, these inconsistencies can be corrected by running hca dss login again. If using GOOGLE_APPLICATION_CREDENTIALS, however, the stale application_secrets currently appear to remain in place.
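
The scenario above can be modeled without the HCA bindings at all. The following is a minimal sketch using plain dictionaries, not the library's actual loading code: the trimmed `existing` dictionary and the helper names `with_alternate_dss` and `stale_keys` are hypothetical and exist only for illustration. It shows that overriding swagger_url on an already-loaded configuration leaves the credential sections untouched.

```python
import copy
from urllib.parse import urlparse

# Hypothetical, trimmed stand-in for a previously saved config.json.
existing = {
    "DSSClient": {
        "swagger_url": "https://dss.data.humancellatlas.org/v1/swagger.json"
    },
    "application_secrets": {"installed": {"project_id": "hca-dcp-production"}},
    "oauth2_token": {"token_type": "Bearer"},
}


def with_alternate_dss(config, swagger_url):
    """Return a copy of config with only the DSS swagger_url replaced."""
    updated = copy.deepcopy(config)
    updated["DSSClient"]["swagger_url"] = swagger_url
    return updated


def stale_keys(before, after):
    """List credential sections carried over unchanged although the host changed."""
    host_before = urlparse(before["DSSClient"]["swagger_url"]).netloc
    host_after = urlparse(after["DSSClient"]["swagger_url"]).netloc
    if host_before == host_after:
        return []
    return [
        key
        for key in ("application_secrets", "oauth2_token")
        if key in after and after[key] == before.get(key)
    ]


alternate = with_alternate_dss(existing, "https://dss.example.com/v1/swagger.json")
print(stale_keys(existing, alternate))
```

A check along these lines could serve as a guard in a test fixture, although it only detects the mismatch and does not repair it.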

mikebaumann commented 6 years ago

One possible approach to resolution is to provide a variation on the recommended initialization code:

config = HCAConfig()
config['DSSClient'].swagger_url = "https://dss.example.com/v1/swagger.json"
client = DSSClient(config=config)

in which the line:

config = HCAConfig()

provides a config containing only the default configuration, which is obtained internally from default_config.json and consists of:

{
  "log_level": "INFO",
  "client_id": "foo",
  "DSSClient": {
    "swagger_url": "https://dss.data.humancellatlas.org/v1/swagger.json"
  }
}

In this case, potentially erroneous values for application_secrets and oauth2_token would not be implicitly loaded.
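
As a rough illustration of the behavior proposed here, the sketch below builds a configuration from the defaults alone. It is not how HCAConfig is implemented; `DEFAULT_CONFIG` simply copies the values quoted above, and `defaults_only_config` is a hypothetical helper name.

```python
import copy

# Values copied from the default_config.json contents quoted above.
DEFAULT_CONFIG = {
    "log_level": "INFO",
    "client_id": "foo",
    "DSSClient": {
        "swagger_url": "https://dss.data.humancellatlas.org/v1/swagger.json"
    },
}


def defaults_only_config(swagger_url=None):
    """Build a config from the defaults only, never reading a user config file."""
    config = copy.deepcopy(DEFAULT_CONFIG)
    if swagger_url is not None:
        config["DSSClient"]["swagger_url"] = swagger_url
    return config


config = defaults_only_config("https://dss.example.com/v1/swagger.json")
print(sorted(config))
```

Because nothing is read from the user's configuration file, the result cannot contain application_secrets or oauth2_token left over from a different DSS.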

mikebaumann commented 6 years ago

Another approach is to allow provision of an alternate name for the configuration, in which case the existing "hca" configuration would not be implicitly loaded. Although tweak supports this, the current HCAConfig implementation prevents an alternate name from being specified.

    def __init__(self, *args, **kwargs):
        super(HCAConfig, self).__init__(name="hca", *args, **kwargs)

https://github.com/HumanCellAtlas/dcp-cli/blob/5945bc6383776c651fe6a950dfa145e8fee89e4a/hca/config.py#L9

This can be worked around by monkey patching, as is being done as an interim solution here:

def monkey_patch_hca_config():
    HCAConfig.__init__ = HCAConfig.__bases__[0].__init__

https://github.com/DataBiosphere/cgp-dss-data-loader/blob/58add952296603b741b7585c63279b9eaa03af6d/util/__init__.py#L49
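
To make the mechanics of that patch easier to follow, here is a self-contained sketch that uses stand-in classes rather than the real ones. `BaseConfig` and `PinnedConfig` are hypothetical substitutes for tweak's Config and for HCAConfig respectively, and `allow_alternate_name` is an illustrative name; only the `__init__` pattern and the patch line are taken from the snippets above.

```python
class BaseConfig(dict):
    """Hypothetical stand-in for the tweak base class: records a config name."""

    def __init__(self, name="default", **kwargs):
        super().__init__()
        self.name = name


class PinnedConfig(BaseConfig):
    """Hypothetical stand-in for HCAConfig: always passes name="hca"."""

    def __init__(self, *args, **kwargs):
        super(PinnedConfig, self).__init__(name="hca", *args, **kwargs)


def allow_alternate_name(cls):
    """Replace cls.__init__ with its first base class's __init__ (the patch above)."""
    cls.__init__ = cls.__bases__[0].__init__


# Before patching, a caller-supplied name collides with the hard-coded one.
try:
    PinnedConfig(name="hca-test")
    rejected = False
except TypeError:
    rejected = True

allow_alternate_name(PinnedConfig)

# After patching, the subclass accepts whatever name the caller provides.
patched = PinnedConfig(name="hca-test")
print(rejected, patched.name)
```

Note that the patch is process-wide: every later construction of the class in the same interpreter also loses the pinned name, so it is best confined to test setup.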