ACED-IDP / gen3_util

Collection of command line tools to interact with a Gen3 instance
MIT License
3 stars 1 forks source link

feature/git-lite #42

Closed bwalsh closed 9 months ago

bwalsh commented 10 months ago

This PR addresses #40, #29, #34:

This PR is one of a set:

See https://github.com/ACED-IDP/gen3_util/blob/feature/git-init/docs/git-lite-workflow.md

matthewpeterkort commented 10 months ago

Really great PR. Comments:

After running

gen3_util --profile dev init --project_id test-init_demo_dev
gen3_util access sign   

Noticed that g3t access ls returns the endpoint and nothing else. This could be confusing to the end user. Might be better for ls to show SIGNED requests too so that it's not just a blank screen.

Following the less is more git style, g3t buckets ls should probably be reduced to just g3t buckets, especially since it is the only sub-command.

gen3_util projects new command might not be needed since clone/init refactor. g3t clone and commit have the same description in the help menu.

running a g3t status after running

gen3_util --profile dev init --project_id test-init_demo_dev
gen3_util access sign   
g3t files add tests/fixtures/dir_to_study/file-2.csv

on development:

{'error': 'service failure - try again later'}
[2024-01-18 07:50:17,675] [ERROR] gen3_util.cli 'data'
Traceback (most recent call last):
  File "/Users/peterkor/Desktop/gen3_util/gen3_util/cli/cli.py", line 191, in status_cli
    output.update(status(config))
  File "/Users/peterkor/Desktop/gen3_util/gen3_util/cli/status.py", line 34, in status
    remote = counts(config, auth=auth)
  File "/Users/peterkor/Desktop/gen3_util/gen3_util/meta/lister.py", line 41, in counts
    records = submission_client.query(query)['data']
KeyError: 'data'
exception: '''data'''
msg: FAIL

Running g3t commit -m "added some files" after running above commands gives: Error: Missing argument 'STAGING_DIR'. This should probably initialized earlier on in the project, or at least the help message should tell the end user how to fix it. g3t doesn't have this argument in the commit step and it is a bit unnatural for the user, especially if they're just trying to commit some files and don't want anything to do with adding metadata.

Also, is there a way to bypass this argument entirely? That should probably be added. Docs need to get updated to support this command, because it isn't as intuitive like the other g3t commands.

I would suggest allowing or g3t commit -m " " and then a warning prompt that pops up saying something like "are you sure you want to commit files without attaching metadata." That way if the user is just following the git workflow they can somewhat blindly use this without reading too many docs.

Didn't set any environment variables. Simply ran: gen3_util --profile local init --project_id test-init_demo command. Wouldn't expect the end user to want to set environment variables either since you don't have to do that in git. I would argue that gen3_util --profile local init --project_id test-init_demo and similar clone commands should setup your project and profile ENV vars for all subsequent commands, and then after that if the user wants to switch between profiles, or projects they can manually change their environment variables. These env variables could be stored somewhere in the .g3t directory and be loaded into new envs between terminal sessions.

Would useful to publish/commit the use case that was demoed somewhere

bwalsh commented 10 months ago

Known issues: See https://github.com/ACED-IDP/gen3_util/issues/45

bwalsh commented 10 months ago

@matthewpeterkort thanks for the thoughtful review. I will comment in detail tomorrow. See https://github.com/ACED-IDP/aced-idp.github.io/pull/11 for documentation update

bwalsh commented 10 months ago

@matthewpeterkort summarized comments below. please check for accuracy.

matthewpeterkort commented 10 months ago
* [ ]  Would useful to publish/commit the use case that was demoed somewhere
  > @matthewpeterkort can you tell me more about what you meant by this one?

Was mostly concerned that there weren't any docs explaining all the updates but it looks like that got resolved here: https://github.com/ACED-IDP/aced-idp.github.io/pull/11/files.

After skimming through the docs commit it seems like the process of associating files with vertices so that the g3t can automatically generate some metadata is still a bit unclear. This process should have a section/page of its own, since the user is expected to manage the metadata creation process, they should know how to leverage the util to help them with this process.

Thinking about the commands that you had done in the demo that had told the util how to generate the metadata, that should be explained in the docs too.

bwalsh commented 10 months ago

https://github.com/ACED-IDP/aced-idp.github.io/pull/11/files

https://github.com/ACED-IDP/gen3_util/pull/42#issuecomment-1900967019

matthewpeterkort commented 10 months ago

ModuleNotFoundError: No module named 'wcmatch'. wcmatch needs to be added to requirements.txt

"g3t" or "gen3_util" by itself doesn't produce the --help screen anymore. This is not very git like. When you type git you get a list of commands when you type git help you also get a list of commands.

Git clone doesn't need an option or an environment variable to work.git clone [address of respository] just works. The format of the current g3t clone should be changed to be g3t clone [project_name]. In other words, the option parameter for project_id should be a required argument like it is in git.

End users will never have a local config. This error "no profile 'local' found in /Users/peterkor/.gen3/gen3_client_config.ini" makes sense to developers, but to an end user setting up a config for the first time, getting an error like this doesn't make any sense. The default config should be whatever the first config in the gen3_client_config.ini file is. That way end users will never have to specify a profile if they are only working in one gen3 instance. This --profile option seems like more of a developer option that is used to juggle multiple gen3 instances and should not be needed for most end users. Not very git like to make the end user have to set an environment variable just to use the utility. For example:

(venv) peterkor@RNB11238 gen3_util % g3t ping              
no profile 'local' found in /Users/peterkor/.gen3/gen3_client_config.ini, specify one of ['dev'], optionally set environmental variable: GEN3_UTIL_PROFILE
[2024-01-23 08:34:42,295] [ERROR] gen3_util.repo Invalid token type. Token must be a <class 'bytes'>
Traceback (most recent call last):
  File "/Users/peterkor/Desktop/gen3_util/gen3_util/repo/cli.py", line 79, in ping
    auth = ensure_auth(config=config, validate=True)
  File "/Users/peterkor/Desktop/gen3_util/gen3_util/config/__init__.py", line 118, in ensure_auth
    msg = key_expired_msg(key, key_name=profile, expiration_threshold_days=10)
  File "/Users/peterkor/Desktop/gen3_util/gen3_util/config/__init__.py", line 59, in key_expired_msg
    key = jwt.decode(api_key, options={"verify_signature": False})
  File "/Users/peterkor/Desktop/gen3_util/venv/lib/python3.9/site-packages/jwt/api_jwt.py", line 210, in decode
    decoded = self.decode_complete(
  File "/Users/peterkor/Desktop/gen3_util/venv/lib/python3.9/site-packages/jwt/api_jwt.py", line 151, in decode_complete
    decoded = api_jws.decode_complete(
  File "/Users/peterkor/Desktop/gen3_util/venv/lib/python3.9/site-packages/jwt/api_jws.py", line 198, in decode_complete
    payload, signing_input, header, signature = self._load(jwt)
  File "/Users/peterkor/Desktop/gen3_util/venv/lib/python3.9/site-packages/jwt/api_jws.py", line 254, in _load
    raise DecodeError(f"Invalid token type. Token must be a {bytes}")
jwt.exceptions.DecodeError: Invalid token type. Token must be a <class 'bytes'>
exception: Invalid token type. Token must be a <class 'bytes'>
msg: FAIL

Should not happen if the user only has 1 config in their gen3_client_config.ini. Also this error could probably exit a bit more gracefully.

matthewpeterkort commented 10 months ago

[2024-01-23 09:46:12,776] [ERROR] gen3_util.files.manifest indexd record already exists, consider using --duplicate_check. 59d23af8-4883-5bb7-a833-d07495fd2e00 409 Client Error: did "59d23af8-4883-5bb7-a833-d07495fd2e00" already exists for url: https://development.aced-idp.org/index/index/

but git push --help says

(venv) peterkor@RNB11238 ohsu-dev % g3t push --help
Usage: g3t push [OPTIONS]

Submit committed changes to commons. Options: --overwrite

So looks like aced_submission might need to be updated to have the error statement reccommend "--overwrite" instead of "--duplicate_check"

bwalsh commented 10 months ago

https://github.com/ACED-IDP/gen3_util/pull/42#issuecomment-1906406176

@matthewpeterkort init and ping now work without the need to specify profile (assuming only one profile exists in gen3_client)

See https://github.com/ACED-IDP/gen3_util/blob/feature/git-init/docs/git-lite-workflow.md#submitter-test-script-one-profile https://github.com/ACED-IDP/gen3_util/blob/feature/git-init/docs/git-lite-workflow.md#submitter-test-script-one-profile-of-many

bwalsh commented 10 months ago

https://github.com/ACED-IDP/gen3_util/pull/42#issuecomment-1906603328 overwrite now used @matthewpeterkort

bwalsh commented 10 months ago

https://github.com/ACED-IDP/gen3_util/pull/42#issuecomment-1906406176

wcmatch now in setup

bwalsh commented 10 months ago

addressed #46