ORNL / DataFed

A Federated Scientific Data Management System
https://ornl.github.io/DataFed/
Other
18 stars 14 forks source link

Improve DataFed setup configuration options #920

Open JoshuaSBrown opened 11 months ago

JoshuaSBrown commented 11 months ago

Description

Users have reported several problems with the datafed setup process when working with the python client. The setup step currently, is responsible for creating files in the users home directory in an invisible .datafed directory. These files include, the users public and private keys .pub and .priv as well as the core servers public key which is downloaded from the web server with an https request. Lastly, in the .datafed folder is a .ini file which contains the configuration the client will use.

Two problems will be addressed in this PR.

  1. More robust setup configuration - Meaning, the setup steps should work consistently, this means establishing good defaults and providing tools for wiping the configuration cleanly.
  2. More transparency - Meaning, providing methods and the means for showing the configuration that is being used, currently this can be confusing because the client could be using configuration containing both env variables and file configuration.

Implementation

NOTE: we are not proposing changing the default behavior of the command:

datafed setup

This command will remain the same, this command looks to see if public and private keys exist and if a configuration file exists. It will use the existing files if they do exist and only attempts do something if a file is missing so:

  1. If a .ini file is missing it will recreate the configuration file
  2. If the core server public key is missing it will attempt to redownload the .public key from the datafed web server
  3. If the the users public and private key pair are missing it will attempt to create a new key pair.

Good defaults and wiping configuration cleanly

Possible CLI implementations should not change the existing CLI - so the changes should be backwards compatible.

Feedback is wanted on what command line flags make the most sense?

For clearing or resetting the configuration to defaults and recreating the credentials and keys, possible flag names could be:

$ datafed setup --clear-cache
$ datafed setup --hard-reset
$ datafed setup --reset-all
$ datafed setup --reset

For just resetting the credentials (Meaning the public keys for the user and the core server as well as the users private key) but without changing the configuration (the .ini settings).

 $ datafed setup --reset-credentials

For just resetting the user credentials even if the public and private key for the user already exist.

$ datafed setup --reset-user-credentials

For resetting the public key of the core server. This would require redownloading the core server public key from the datafed web server over https. In the future, if the metadata server because federated and there are multiple metadata servers it will be important to be able to point to the correct metadata server to ensure the right public key is used (That assumes each metadata server has a different public key.

$ datafed setup --reset-server-credentials

For providing better transparency

To show the actual configuration that is being used when the datafed CLI is used it should be clear when env variables are used and or variables from the command line or from a configuration file. It should also be clear if a default is being used.

$ datafed setup --show

[client]
----------------------------------------------------------------------------------------------------------------------------------------
Type                        | Key/Path                                | Config Option    |  Content
----------------------------------------------------------------------------------------------------------------------------------------
env                         | CONFIG_FILE_PATH                        | config_file      | /home/bob/.datafed/datafed-client.ini
file                        | /home/bob/.datafed/datafed-client.ini   | config_dir       | /home/bob/.datafed
file                        | /home/bob/.datafed/datafed-client.ini   | public_key_file  | /home/bob/.datafed/datafed-user-key.pub
env                         | PRIVATE_KEY                             | private_key_file | /home/bob/.datafed/datafed-user-key.priv

[server]
----------------------------------------------------------------------------------------------------------------------------------------
Type       | Key/Path                                 | Config Option    | Content
----------------------------------------------------------------------------------------------------------------------------------------
file       |  /home/bob/.datafed/datafed-client.ini   | public_key_file  | /home/brownjs/.datafed/datafed-core-key.pub
file       |  /home/bob/.datafed/datafed-client.ini   | host             | datafed.ornl.gov
file       |  /home/bob/.datafed/datafed-client.ini   | port             | 7512

Show contents and path to the configuration file that is being used.

$ datafed setup --show-config-file

File:  /home/bob/.datafed/datafed-client.ini

[client]
config_dir = /home/bob/.datafed
config_file = /home/bob/.datafed/datafed-client.ini
public_key_file = /home/bob/.datafed/datafed-user-key.pub
private_key_file = /home/bob/.datafed/datafed-user-key.priv

[server]
public_key_file = /home/bob/.datafed/datafed-core-key.pub
host = datafed.ornl.gov
port = 7512

Expected Scope of Changes

This is expected to effect the following functions, which will likely need changes:

  1. def _setup command in the CLI.py file, this is where the CLI commands are defined, flags will need to be added to support the new features.
  2. def setupCredentials in the CommandLib.py file which needs an option for running a fresh credential setup instead of doing nothing if they exist.
  3. Config.py file there are defaults and env variables that are used in this file
def setupCredentials(self, overwrite=False):

In addition, several functions will need to be added in the CommandLib.py file to allow for the capabilities indicated previously.

def setupServerCredentials(self, overwrite=False):
   ...

def setupAllCredentials(self, overwrite=False):
   ...

def setupShowConfigFile(self):
   ...

def setupShow(self):
   ... 

Tasks

Work time

Expected time: 16 hours

jagar2 commented 11 months ago

This looks good. A few comments.

  1. I think you should provide some definitions of what each of these categories are, and what they are responsible for doing. I am unclear the true demarkation from user and server credentials.
  2. What would be the default option on datafed setup? I feel it should delete everything? or at least prompt the user if they want to delete their configuration.
  3. If it is not the default, I would recommend error handling that alerts the user that the configuration files are not correct.
JoshuaSBrown commented 11 months ago

For parts 1-3, I'm concerned about the behavior deviating from what it is currently doing, if I change the behavior of the setup command then it becomes a breaking change. I realize keeping "datafed setup" command the same is not an ideal solution, but I'm concerned with breaking backward compatibility. Let me discuss it a bit with Dale.

best,

PS Thanks for the feedback.

Joshua

From: Joshua C Agar @.> Sent: Tuesday, November 7, 2023 10:58 AM To: ORNL/DataFed @.> Cc: Brown, Joshua @.>; Assign @.> Subject: [EXTERNAL] Re: [ORNL/DataFed] Improve DataFed setup configuration options (PR #920)

This looks good. A few comments.

  1. I think you should provide some definitions of what each of these categories are, and what they are responsible for doing. I am unclear the true demarkation from user and server credentials.
  2. What would be the default option on datafed setup? I feel it should delete everything? or at least prompt the user if they want to delete their configuration.
  3. If it is not the default, I would recommend error handling that alerts the user that the configuration files are not correct.

- Reply to this email directly, view it on GitHubhttps://urldefense.us/v2/url?u=https-3A__github.com_ORNL_DataFed_pull_920-23issuecomment-2D1799019939&d=DwMCaQ&c=v4IIwRuZAmwupIjowmMWUmLasxPEgYsgNI-O7C4ViYc&r=EGSrZDU--J4wp05SUEemL_ukuVkVGS0SEKwYtlhw5Rs&m=5xamwhuIpIJqKjoW1UVKO9Kj3OJLYsBIPXPigAHwDAqXgv1i769V75_fkvrdwdW-&s=5G3g8Yw4iC-aTDM9Vp4HZqCgraiU6V4Ntkp2HP4Ghmk&e=, or unsubscribehttps://urldefense.us/v2/url?u=https-3A__github.com_notifications_unsubscribe-2Dauth_ABDSZHEIP6AYAHA7KQZVIDLYDJK7NAVCNFSM6AAAAAA7BK45K6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTOOJZGAYTSOJTHE&d=DwMCaQ&c=v4IIwRuZAmwupIjowmMWUmLasxPEgYsgNI-O7C4ViYc&r=EGSrZDU--J4wp05SUEemL_ukuVkVGS0SEKwYtlhw5Rs&m=5xamwhuIpIJqKjoW1UVKO9Kj3OJLYsBIPXPigAHwDAqXgv1i769V75_fkvrdwdW-&s=LlSQX_8BWIZ5FVOG_OxXDw8E_z9I3jvCIC-ChTg1rC4&e=. You are receiving this because you were assigned.Message ID: @.***>