payu-org / payu

A workflow management tool for numerical models on the NCI computing systems
Apache License 2.0

Handling runlog remotes during a local clone #70

Closed marshallward closed 7 years ago

marshallward commented 7 years ago

I am testing automatic pushing of output to a target repository on GitHub, and I've noticed that our liberal git cloning of others' configurations has created lots of origin remotes pointing to the directory that was cloned. These are often the directories of other people, to whom we probably do not want to be pushing changes.

Not exactly an "issue", but any thoughts on this situation? Do we want to wipe out the remotes in some way here? Or am I worrying too much about nothing?

(Mostly for @aidanheerdegen and @nicjhan )

marshallward commented 7 years ago

My current thinking is to just ignore the origin and create a payu remote pointing to github.
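
A minimal sketch of that idea, assuming the remote is literally named payu and that git is driven through its command line. The helper name and the URL are hypothetical and are not taken from payu's code. The function only builds the git command, so the existing origin is never touched:

```python
def payu_remote_command(existing_remotes, url, name="payu"):
    """Return the git argv that points the remote `name` at `url`.

    `existing_remotes` is the list of names printed by `git remote`.
    The origin remote is deliberately left alone.
    """
    if name in existing_remotes:
        # The remote already exists: repoint it rather than failing on `add`.
        return ["git", "remote", "set-url", name, url]
    return ["git", "remote", "add", name, url]


if __name__ == "__main__":
    # Hypothetical URL, for illustration only.
    url = "git@github.com:example-org/example-run.git"
    print(payu_remote_command(["origin"], url))
```

The returned list could be passed to subprocess.check_call from inside the cloned directory.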

marshallward commented 7 years ago

Second problem: payu typically does its work on compute nodes, but compute nodes do not have internet access.

We will need either to run the push before the qsub command (which may require a lot of shuffling about) or to submit a new job to some internet-accessible node (e.g. copyq).

I'm leaning towards the first option.

marshallward commented 7 years ago

FYI urllib blows, I am adding requests as a dependency.

marshallward commented 7 years ago

Current draft is in the github branch:

https://github.com/marshallward/payu/blob/github/payu/runlog.py

marshallward commented 7 years ago

payu push added to main branch.

More or less works using this config:

runlog:
    name: jra55
    username: marshallward
    organization: mxw900-raijin

Certificate issues were resolved using requests[security]. But there is still too much manual authentication going on here. Needs more work (and feedback!)

marshallward commented 7 years ago

Switching to ssh. There is no safe or sensible way that I can see to do automatic authentication over https.

(So requests[security] may be unnecessary here, but will leave it for now)
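
For reference, a rough sketch of how an ssh remote URL could be derived from the runlog config shown earlier. The git@github.com:owner/repo.git form is GitHub's standard ssh URL. The function name and the fallback from organization to username are assumptions for illustration, not payu's actual behaviour:

```python
def runlog_ssh_url(runlog):
    """Build a GitHub ssh remote URL from a runlog config mapping."""
    # Assumption: the repository lives under the organization if one is
    # given, otherwise under the user's own account.
    owner = runlog.get("organization") or runlog["username"]
    return "git@github.com:{}/{}.git".format(owner, runlog["name"])


if __name__ == "__main__":
    config = {
        "name": "jra55",
        "username": "marshallward",
        "organization": "mxw900-raijin",
    }
    print(runlog_ssh_url(config))  # git@github.com:mxw900-raijin/jra55.git
```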

marshallward commented 7 years ago

Three questions:
1) Is using passphraseless keys considered OK here?
2) Can we restrict the key to the target (organization/user)?
3) Do we dare to automatically generate a key? Seems dangerous but I worry it will otherwise never get used.

marshallward commented 7 years ago

Aidan's suggestion of using GitHub's API token works. The main question is where to save the token. The script he sent along suggests saving it inside of .gitconfig. Probably ok, right? Or not?

Also, it doesn't seem that the tokens (yet) provide fine-grained access to specific repositories or organizations?

marshallward commented 7 years ago

Sorry... I see that the token is saved in .git/config, not the global-user config. That is probably safe enough.

marshallward commented 7 years ago

If we go ahead with this, then I think we need to be a bit careful with .git/config permissions. For example, will removing read access of config prevent cloning? And would this be stored in the public repo?

marshallward commented 7 years ago

Latter is not an issue: https://stackoverflow.com/questions/6547933/is-it-possible-to-clone-git-config-from-remote-location

(Aren't these messages fun? Where's the Unfollow button?)

aidanheerdegen commented 7 years ago

Sorry I haven't really engaged, bit busy today.

marshallward commented 7 years ago

Cloning is not possible without read access to .git/config, which seems to be a problem. We would essentially be disabling clones within Raijin.

I am leaning towards use of personal ssh keys.

marshallward commented 7 years ago

After talking to James, it seems like API tokens are definitely the way to go, and the ssh stuff should be phased out. The token (again, according to James) should probably be saved in a private file somewhere, rather than in the .git/config file.

marshallward commented 7 years ago

payu keygen added to main branch.

This command will generate a github API token and save it in one's home directory. This allows for passwordless creation of new repositories on one's designated target repository.

There are obvious security risks here. File permissions are set to values comparable to the .ssh directory. But I can probably do better here. More feedback is needed.
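
As a point of comparison, here is a small sketch of writing a token to an owner-only file. It assumes a POSIX filesystem. The function names and the file location are hypothetical, and this is not necessarily what payu keygen does:

```python
import os
import stat
import tempfile


def save_token(path, token):
    """Write `token` to `path`, readable and writable by the owner only."""
    # Create the file with mode 0600 from the start, so the token is never
    # briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as token_file:
        token_file.write(token + "\n")
    # os.open only applies the mode on creation, so tighten an existing file.
    os.chmod(path, 0o600)


def is_private(path):
    """Return True if neither group nor others have any access to `path`."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "payu_token")
    save_token(demo_path, "not-a-real-token")
    print(is_private(demo_path))
```

A check like is_private could also be run before reading the token, refusing to use a file that has become group- or world-readable.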

marshallward commented 7 years ago

payu keygen has been reworked in latest commit (f79fe7f).

It now does just about everything: it creates an organization (if needed), creates the repository, creates the remote, creates the keys, and adds them as deploy keys for only that repository.

This will mangle your repo a bit, so check your remotes and look for ~/.ssh/id_rsa_payu[.pub] files.

payu push is now a very simple command that just enables the new key and pushes to the repo.
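
The thread does not say how payu push selects the key. One common way to use a dedicated key for a single git command is the GIT_SSH_COMMAND environment variable (git 2.3 and later). The sketch below assumes that approach and uses a hypothetical helper name. The default key path is the one mentioned above:

```python
import os


def push_environment(key_path="~/.ssh/id_rsa_payu", base_env=None):
    """Return a copy of the environment that makes git use one ssh key."""
    env = dict(os.environ if base_env is None else base_env)
    key = os.path.expanduser(key_path)
    # IdentitiesOnly stops ssh from offering other keys from the agent first.
    env["GIT_SSH_COMMAND"] = 'ssh -i "{}" -o IdentitiesOnly=yes'.format(key)
    return env


if __name__ == "__main__":
    print(push_environment()["GIT_SSH_COMMAND"])
```

The resulting mapping could be passed as env= to subprocess.check_call(["git", "push", ...]), so the key applies only to that push and the user's global ssh configuration is left alone.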

marshallward commented 7 years ago

I'm going to say this is "working", despite the many potential use case bugs, which can be handled in separate issues.