HEPCloud / decisionengine_modules

Apache License 2.0

Prepare code for new NERSC "superfacility" api #234

Closed StevenCTimm closed 7 months ago

StevenCTimm commented 4 years ago

Shane Canon, additional comments (2020-04-17 09:55:58): There is also the SuperFacility API which is starting to come along.

Steven,

Can you look at the APIs here...

https://api.nersc.gov/api/

And note where there are gaps in what you need? I'm not sure if all of these are functional yet and the API is still in alpha mode.


The above API will eventually replace the NEWT API, so we will have to redo all our NERSC code again.

DmitryLitvintsev commented 4 years ago

Interestingly, clicking on that link gives me a 404. So new that it does not exist yet?

StevenCTimm commented 4 years ago

No I made a typo and clipped part of it off.

URL should be

https://api.nersc.gov/api/v1/

DmitryLitvintsev commented 4 years ago

Looks like Swagger RESTful API documentation. Good. At least it is self-describing.

shreyb commented 4 years ago

So is this meant to replace the NERSC NEWT IRIS GraphQL(-ish) API?

StevenCTimm commented 4 years ago

The superfacility API will replace all of the NEWT part. I am not sure if it will replace the whole of IRIS or not. They have to do this because NEWT has old-style globus-gatekeeper still under the covers whereas the new superfacility API will use JWT tokens for authentication. (getting rid of the username/password auth which is a big thing).

Steve



hyunwoo18 commented 3 years ago

I think I can take care of this development without any major hurdles because I am already familiar with how to generate a JWT token out of a secret key and how to make a request with a JWT token to google cloud.

In order to apply my knowledge to NERSC, I will need to have a secret key to start with. Any idea where I can get one? Should I wait until Steve returns from his current vacation which is through Thursday?

Anyway, once I have a secret key in hand, the following will be how I would proceed:

hyunwoo18 commented 3 years ago

Today, I submitted a NERSC application to create a new user account for myself. When this process is completed, I should be able to use my account to create my own JWT token. I will first test my own JWT token at https://api.nersc.gov/api/v1/ and then I will have to find a way to generate JWT tokens. For that purpose, there should be two possible ways:

Method One

There may be a NERSC URL for token generation (an auth URL),
and we would access this auth URL with a username and password to have NERSC generate a JWT for us.

Method Two

Or, if NERSC is similar to GCP,
we will be given a secret (just like the secret that is embedded in the secret.json associated with a GCP service account)
and use this secret to compute a JWT ourselves in Python code.

Either way, once we have a JWT, we use this JWT to directly access NERSC APIs (https://api.nersc.gov/api/v1/) or use this JWT to first access another NERSC auth URL (e.g. https://accounts.google.com/o/oauth2/token in GCP) to generate an access token and use this access token to access NERSC APIs (https://api.nersc.gov/api/v1/).

hyunwoo18 commented 3 years ago

From this piece of https://api.nersc.gov/api/v1/swagger.json,

 "/accounting/projects": {
            "parameters": [
                {
                    "name": "Authorization",
                    "in": "header",
                    "type": "string",
                    "required": true,
                    "description": "JWT"
                }
            ],
   }

I can derive the following observations:

Once we have a JWT token in hand, we can use it in a similar way to how we access GCP APIs:

def get_headers( access_token ):
    # start from an empty dict; the original snippet used headers before defining it
    headers = {}
    headers['content-type'] = 'application/json'
    headers['Authorization'] = 'Bearer %s'%access_token
    return headers

So, this method (get_headers) returns a dictionary of HTTP headers, and we will pass it to requests.request as follows:

def get_zones( access_token, project_name = "hepcloud-fnal" ):
    full_url = "https://api.nersc.gov/api/v1/accounting/projects/"
    headers = get_headers( access_token )
    # headers must be passed by keyword: requests.request(method, url, **kwargs)
    return requests.request( 'GET', full_url, headers=headers )

I said "similar", but I believe there is one big difference between GCP and the NERSC API. In GCP, we use the JWT to access https://accounts.google.com/o/oauth2/token to get an access token, and give this access token (not the JWT) to

headers['Authorization'] = 'Bearer %s'%

But with NERSC, it appears that we give the JWT directly to

headers['Authorization'] = 'Bearer %s'%

which is also allowed according to this document https://developers.google.com/identity/protocols/oauth2/service-account#jwt-auth

Addendum: Service account authorization without OAuth
With some Google APIs, you can make authorized API calls using a signed JWT directly as a bearer token, rather than an OAuth 2.0 access token.
When this is possible, you can avoid having to make a network request to Google's authorization server before making an API call.
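
The two ways of filling the Authorization header discussed above can be put side by side in a small sketch. The function name build_auth_headers and its scheme parameter are hypothetical, and which form the NERSC API expects is not settled at this point in the thread, so the prefix is left as a parameter:

```python
def build_auth_headers(token, scheme="Bearer"):
    """Return request headers carrying the token in the Authorization field.

    scheme="Bearer" gives the usual "Bearer <token>" value;
    scheme=None sends the bare token with no prefix.
    """
    value = f"{scheme} {token}" if scheme else token
    return {"Content-Type": "application/json", "Authorization": value}


# Hypothetical token value, for illustration only
print(build_auth_headers("abc"))
print(build_auth_headers("abc", scheme=None))
```

Keeping the prefix configurable means the same helper can be tried against both GCP-style and NERSC-style endpoints without editing the code.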
hyunwoo18 commented 3 years ago

This is what I said in today's meeting:

NERSC web page for their NERSC REST API only shows how to access APIs
when we have a JSON Web Token
but it does not say how to acquire or generate a JWT.

Steve opened a ticket in NERSC service now and I asked some technical questions, but no response yet.
They only said we could generate our own JWT in their login portal called Iris,
so, yesterday, I submitted an application for creating a new user account for myself.
will have to wait a week or so.

In the meantime, I reviewed our uses of JWT in google cloud and openstack.
We know how to generate a JSON Web Token based on a secret key from google cloud
or we understand that we can contact their auth URL with username and password to acquire a JWT.

Like I said earlier, once we have a JWT in hand, it's a matter of passing it to NERSC REST APIs
in the Authorization HTTP Request Header.

Miscellaneous:

requests.request(method, url, **kwargs)   Constructs and sends a Request.

Parameters:
method – method for the new Request object: GET, OPTIONS, HEAD, POST, PUT, PATCH, or DELETE.
url – URL for the new Request object.

headers – (optional) Dictionary of HTTP Headers to send with the Request.
data    – (optional) Dictionary, list of tuples, bytes, or file-like object to send in the body of the Request.
params  – (optional) Dictionary, list of tuples or bytes to send in the query string for the Request.

Examples of using headers

def get_headers( access_token ):
    headers = {}
    headers['content-type'] = 'application/json'
    headers['Authorization'] = 'Bearer %s'%access_token
    return headers

def get_zone_instances( access_token, zone_name, project_name = "hepcloud-fnal", p_data = None, p_params = None ):
    full_url = "https://www.googleapis.com/compute/v1/projects/%s/zones/%s/instances"%(project_name, zone_name)
    p_headers = get_headers( access_token )
    # p_data and p_params are optional; requests skips them when they are None
    r = requests.request(method = 'GET', url=full_url, headers=p_headers, data=p_data, params=p_params)
    return r

Examples of using params

As an example, if you wanted to pass key1=value1 and key2=value2 to httpbin.org/get, you would use the following code:

payload = {'key1': 'value1', 'key2': 'value2'}
r = requests.get('https://httpbin.org/get', params=payload)

You can see that the URL has been correctly encoded by printing the URL:
print(r.url)
https://httpbin.org/get?key2=value2&key1=value1
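
As a rough illustration of what the params argument does to the URL, the same query string can be built with the standard library alone. This is only a sketch of the encoding step, not how requests implements it internally, and the key order in the result follows the order of the dictionary, so it may differ from the printed URL above:

```python
from urllib.parse import urlencode

payload = {"key1": "value1", "key2": "value2"}

# urlencode percent-encodes keys and values and joins them with '&'
query = urlencode(payload)
url = "https://httpbin.org/get?" + query
print(url)
```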
hyunwoo18 commented 3 years ago

The NERSC admin began responding to our questions and requests. He also gave us the documents we had been looking for (the docs were not public).

Steve has already tried to generate an access token, since he already has a NERSC account. My account is being processed, pending Andrew Norman's approval.

Once my account is created, the NERSC admin will enable my access to the token page. Once this is all done, I will test accessing NERSC api with the appropriate token.

hyunwoo18 commented 3 years ago

This document describes exactly the same method of generating a JWT as we use in GCP:

Step 1: Create two base64-encoded strings (one for head and one for body)

echo -n '{ "alg": "RS256" }' | openssl base64                  -A | tr '/+' '_-' | tr -d '=' > head.b64
                               openssl base64 -in payload.json -A | tr '/+' '_-' | tr -d '=' > body.b64

Step 2: Concatenate these two strings
cat head.b64 <(echo '.') body.b64 | tr -d "\n" > jwt.txt

Step 3: Sign this combined string
openssl dgst -sha256 -sign priv_key.pem jwt.txt | openssl base64 -A | tr '/+' '_-' | tr -d '='       >           sig.sha256.b64

Step 4: Finally concatenate the combined string from Step 2 and the signature from Step 3
ASSERTION=`cat jwt.txt <(echo '.') sig.sha256.b64 | tr -d "\n"`
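
For reference, steps 1, 2 and 4 can be sketched in Python with only the standard library. The helper names are hypothetical. Step 3 (the RSA signature) is deliberately left out because it needs a crypto library, so assemble_jwt takes the signature bytes as an argument:

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url without padding, like: openssl base64 -A | tr '/+' '_-' | tr -d '='"""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def signing_input(header: dict, payload: dict) -> str:
    """Steps 1 and 2: encode head and body, then join them with a dot."""
    head = b64url(json.dumps(header, separators=(",", ":")).encode())
    body = b64url(json.dumps(payload, separators=(",", ":")).encode())
    return head + "." + body


def assemble_jwt(unsigned: str, signature: bytes) -> str:
    """Step 4: append the base64url-encoded signature from step 3."""
    return unsigned + "." + b64url(signature)


# Hypothetical payload; the real claims come from payload.json
unsigned = signing_input({"alg": "RS256"}, {"exp": 1})
print(unsigned)
```

The JSON here is serialized without spaces, so the encoded head differs byte for byte from the shell version, which encodes '{ "alg": "RS256" }' with spaces. Both decode to the same header.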

And then send this JWT to their auth URL

Exchange the client assertion for an access token
Now you're ready to exchange the encrypted assertion for a short-lived access token:

curl -s -XPOST \
  -H "Content-Type:application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials&client_assertion_type=urn%3Aietf%3Aparams%3Aoauth%3Aclient-assertion-type%3Ajwt-bearer&client_assertion=$ASSERTION" \
  https://oidc.nersc.gov/c2id/token

This is exactly what I expected.
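
The form body passed to curl with -d can also be built in Python, which avoids percent-encoding the assertion type by hand. This is a minimal sketch: token_request_body is a hypothetical helper, and it only builds the body without contacting the server:

```python
from urllib.parse import urlencode

TOKEN_URL = "https://oidc.nersc.gov/c2id/token"  # taken from the curl command


def token_request_body(assertion: str) -> str:
    """Build the x-www-form-urlencoded body for the client_credentials grant."""
    form = {
        "grant_type": "client_credentials",
        "client_assertion_type": "urn:ietf:params:oauth:client-assertion-type:jwt-bearer",
        "client_assertion": assertion,
    }
    return urlencode(form)


# "abc" stands in for the signed JWT assertion
print(token_request_body("abc"))
```

The resulting string could then be posted to TOKEN_URL with the Content-Type header used in the curl call.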

hyunwoo18 commented 3 years ago

I forgot to mention that they also support this process via the "authlib" Python library, which I will be using:

from authlib.integrations.requests_client import OAuth2Session
from authlib.oauth2.rfc7523 import PrivateKeyJWT

token_url = "https://oidc.nersc.gov/c2id/token"
client_id = "<your client id>"
private_key = "<your private key>"

client = OAuth2Session(client_id=client_id,
                        client_secret=private_key, 
                        token_endpoint_auth_method="private_key_jwt")
client.register_client_auth_method(PrivateKeyJWT(token_url))
resp = client.fetch_token(token_url, grant_type="client_credentials")
token = resp["access_token"]

hyunwoo18 commented 3 years ago

My NERSC user account was approved last Friday, and I asked the NERSC admin who whitelisted Steve in the Iris profile to do the same for me, so that I can also generate a JWT token for my tests.

hyunwoo18 commented 3 years ago

Until we have a "group" account prepared for our usage, I am testing Nersc API with my own credential:

First, I run the following script with my client credential and private key

$ cat ./myaccesstoken.sh
#!/bin/bash
#date -v +120M "+%s"
#update payload.json
openssl base64 -in payload.json -A | tr '/+' '_-' | tr -d '=' > body.b64
cat head.b64 <(echo '.') body.b64 | tr -d "\n" > jwt.txt
openssl dgst -sha256 -sign mynerscpriv.pem jwt.txt | openssl base64 -A | tr '/+' '_-' | tr -d '=' > sig.sha256.b64
ASSERTION=`cat jwt.txt <(echo '.') sig.sha256.b64 | tr -d "\n"`
curl -s -XPOST -H "Content-Type:application/x-www-form-urlencoded" -d "grant_type=client_credentials&client_assertion_type=urn%3Aietf%3Aparams%3Aoauth%3Aclient-assertion-type%3Ajwt-bearer&client_assertion=$ASSERTION"  https://oidc.nersc.gov/c2id/token

This last step, the curl command, will print a new access token. I can use this access token in the following simple testing script

import json
from pprint import pprint

import requests

def get_headers( access_token ):
    headers = {}
    headers['content-type'] = 'application/x-www-form-urlencoded'
    headers['Authorization'] = access_token
    return headers

my_access_token = "..."
my_header = get_headers( my_access_token )

p_url = 'https://api.nersc.gov/api/v1/accounting/projects'

r = requests.request(method = 'GET', url=p_url, headers=my_header)
returndict = json.loads(r.text)
pprint( returndict )

It seems to work: i.e. I am getting

[{'description': 'Enabling HEP Intensity Frontier Science through HEPCloud',
  'hours_given': 25344000000.0,
  'hours_used': 0.0,
  'hpss_usage': None,
  'id': 63322,
  'iris_role': None,
  'projdir_usage': [{'bytes_given': 1099511627776.0,
                     'bytes_given_human': '1.00 TB',
                     'bytes_used': 0.0,
                     'bytes_used_human': None,
                     'files_given': 20000000.0,
                     'files_used': 1.0,
                     'name': 'dunepro'},
                    {'bytes_given': 1099511627776.0,
                     'bytes_given_human': '1.00 TB',
                     'bytes_used': 0.0,
                     'bytes_used_human': None,
                     'files_given': 20000000.0,
                     'files_used': 1.0,
                     'name': 'fife'},
                    {'bytes_given': 18691697672192.0,
                     'bytes_given_human': '17.00 TB',
                     'bytes_used': 15015051264.0,
                     'bytes_used_human': '13.98 GB',
                     'files_given': 20000000.0,
                     'files_used': 72935.0,
                     'name': 'm3249'},
                    {'bytes_given': 1099511627776.0,
                     'bytes_given_human': '1.00 TB',
                     'bytes_used': 0.0,
                     'bytes_used_human': None,
                     'files_given': 20000000.0,
                     'files_used': 1.0,
                     'name': 'nova'}],
  'repo_name': 'm3249'}]
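
To pull the interesting numbers out of a response shaped like the one above, something along these lines might do. The function name is hypothetical, and the sample is a shortened copy of the output above with only the fields the sketch reads:

```python
def summarize_projects(projects):
    """Return one (repo_name, directory, bytes_used, bytes_given) row per project directory."""
    rows = []
    for project in projects:
        # projdir_usage may be missing or None, so fall back to an empty list
        for usage in project.get("projdir_usage") or []:
            rows.append((project["repo_name"], usage["name"],
                         usage["bytes_used"], usage["bytes_given"]))
    return rows


sample = [{
    "repo_name": "m3249",
    "hours_given": 25344000000.0,
    "hours_used": 0.0,
    "projdir_usage": [
        {"name": "dunepro", "bytes_given": 1099511627776.0, "bytes_used": 0.0},
        {"name": "m3249", "bytes_given": 18691697672192.0, "bytes_used": 15015051264.0},
    ],
}]

for row in summarize_projects(sample):
    print(row)
```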

More later..

hyunwoo18 commented 3 years ago

I will go through our current DE NERSC code, identify which NEWT-based calls need to be replaced by the new APIs, and then start updating the actual code soon.

hyunwoo18 commented 3 years ago

Here is a summary of my investigation of the current NERSC code in de_modules.

Currently NERSC/util/newt.py has 3 methods

get_usage
get_status
get_queue

get_usage is called only from NerscAllocationInfo.py and get_status and get_queue are called only from NerscJobInfo.py.

Question: Are we loading NerscJobInfo.py anywhere?

[root@hepcsvc03 config.d]# cd /etc/decisionengine/config.d/
[root@hepcsvc03 config.d]# grep NerscJobInfo Nersc.jsonnet 
[root@hepcsvc03 config.d]# 

Anyway, I am summarizing the current API URLs for each of get_usage, get_status, and get_queue:

get_usage call:

https://newt.nersc.gov/newt//account/iris
with this data in the body
accounts(username:\"uscms\"){projectId, repoName, repoType, currentAlloc, usedAlloc, 
users{uid, name, firstname, lastname, middlename, userAlloc, userAllocPct, usedAlloc}}

Return:
{u'data': {u'newt': {u'accounts': [{u'currentAlloc': 253440000000.0,
                                    u'projectId': 63322,
                                    u'repoName': u'm3249',
                                    u'repoType': u'REPO',
                                    u'usedAlloc': 175631251940.0,
                                    u'users': [{u'firstname': u'fife',
                                                u'lastname': u'Pseudo User',
                                                u'middlename': u'',
                                                u'name': u'fife',
                                                u'uid': 79226,
                                                u'usedAlloc': 0.0,
                                                u'userAlloc': 0.0,
                                                u'userAllocPct': 10.0}]}]}}}

get_status call:

https://newt.nersc.gov/newt/status/

Return:
[
{u'status': u'up', u'system': u'cori'},
{u'status': u'up', u'system': u'edison'},
{u'status': u'up', u'system': u'pdsf'},
{u'status': u'up', u'system': u'genepool'},
{u'status': u'up', u'system': u'archive'}
]

get_queue

https://newt.nersc.gov/newt/queue/edison/

But the request is hanging.

In conclusion, for each URL, I will identify and test the corresponding URL from the new SuperFacility API.

StevenCTimm commented 3 years ago

Comments on the above: (1) We do not run NerscJobInfo.py inside the decision engine right now, largely because the NEWT API was not reliable and kept hanging. NerscJobInfo.py does run in the external monitoring software on hepcsvc01, which also needs to be converted to use the superfacility API.

(2) get_queue needs to be modified because the Edison machine doesn't exist anymore. Right now it only uses cori, but the queues should be adjustable by configuration for whatever machines exist. A new machine, "Perlmutter", is coming pretty soon.
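
One way to make the machine list configurable is to intersect a configured list with whatever the status call reports as up. This is only a sketch: machines_to_query and the configured list are hypothetical names, and the status entries use the NEWT response shape shown earlier in this thread:

```python
def machines_to_query(status_list, configured_machines):
    """Return the configured machines that are reported as up, in configured order."""
    up = {entry["system"] for entry in status_list if entry["status"] == "up"}
    return [m for m in configured_machines if m in up]


# Shortened, hypothetical status response
status = [
    {"status": "up", "system": "cori"},
    {"status": "up", "system": "archive"},
]

# "perlmutter" is configured but not reported yet, so it is skipped
print(machines_to_query(status, ["cori", "perlmutter"]))
```

With this shape, adding Perlmutter later would be a configuration change rather than a code change.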

hyunwoo18 commented 3 years ago

Thanks for the reply.

Re: (2), I just ran it again, against 'cori' only this time, and got a list of all the jobs running there!! We will need to find a correct way (in the new SF API) to request a list of jobs from our projects only.

hyunwoo18 commented 3 years ago

I resumed working on this issue today. And I discovered a seemingly big difference between the current newt API and the SF API.

Let's limit this discussion to get_usage. Currently, with the NEWT API, we load a username and password from the /etc/gwms-frontend/credentials/nersc_newt file and send a POST request to https://newt.nersc.gov/newt/login with this credential in the data field. The username used here is G. Cooper. Then, when calling get_usage with this authentication, we contact https://newt.nersc.gov/newt//account/iris with accounts(username:\"uscms\") in the data field of the POST request when we want the usage data of the user "uscms" (or "fife").

In short, the NEWT API lets us specify which user's usage we want to receive.

But my understanding of the new SF API is that the only endpoint in swagger.json equivalent to this request is https://api.nersc.gov/api/v1/accounting/projects. When I call this URL, I just attach the access token generated from my credential, and the data returned is the usage of my own account. In short, the SF API only returns the usage of the user who owns the access token used in the call to https://api.nersc.gov/api/v1/accounting/projects.

I conclude for now that, in order to acquire the usage data of the user uscms (or fife), we need the credential of the uscms (or fife) user and must generate an access token from it.

StevenCTimm commented 3 years ago

The numbers we are really interested in are the usage of the full repos (m2612, m3249) and of various individual users within them. We should file a ticket with NERSC about this. We should in principle be able to generate an access token from the uscms or other group accounts if we get it whitelisted but that may not solve the whole problem. I will look at the API documentation and see if I can find anything.

hyunwoo18 commented 3 years ago

I submitted a question to NERSC Support with the following contents: INC0162095

Hi Nersc Admins,

Please redirect this ticket to Shane Canon.

I work in Fermilab computing division with Steve Timm
and I am working on updating our newt-based nersc monitoring codes with your new SuperFacility API.

I might need to ask a series of questions but 
let me start with a simple one here:

Currently we are using the NEWT API in the following way:
I have my own credentials (username and password).
With these, I first create a "newt cookie"
and then use this cookie file to access the following URL to acquire some information:
curl -k -b /root/newt_cookies.txt -X GET https://newt.nersc.gov/newt/account/usage/repo/m2612/users
or
curl -k -b /root/newt_cookies.txt -X GET https://newt.nersc.gov/newt/account/usage/repo/m2612/

Now, I am trying to switch to the new SF API
and find equivalent ways to acquire the same information on https://api.nersc.gov/api/v1/
I am experimenting with the following APIs

https://api.nersc.gov/api/v1/accounting/projects
https://api.nersc.gov/api/v1/accounting/projects/m3249/jobs
https://api.nersc.gov/api/v1/accounting/roles

But it looks like none of these SF API are giving me the same information as from NEWT

So, could you please guide me through how to map current NEWT APIs to the new SF APIs?

Thanks,
HyunWoo KIM

hyunwoo18 commented 3 years ago

I am resuming working on this issue as of today. I first reviewed what I did in November and here is the summary of how we can generate access token and use it to access SF APIs

Go to iris.nersc.gov and Sign in

Click on "Profile" in the drop-down menu from "hyunwoo" button in the top-right corner

Locate "Superfacility API Clients" in the middle of the page and click on "New Client" on the right side.

In the pop-up page, fill in required information and you will see

New Client Id # this will be used in payload.json

And create myprivate.pem with the copy-pasted RSA Private Key contents

Now run

date -v +120M "+%s"
(The result will be used in payload.json below)
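
The same timestamp can be computed in Python, which also works where date does not support the BSD -v flag. This is a sketch under one assumption: the value is stored under the standard JWT "exp" claim. The actual contents of payload.json are not shown in this thread, and both helper names are hypothetical:

```python
import time


def expiry_timestamp(minutes=120, now=None):
    """Epoch seconds `minutes` from now, like: date -v +120M "+%s" """
    now = time.time() if now is None else now
    return int(now) + minutes * 60


def with_expiry(payload, minutes=120, now=None):
    """Return a copy of the payload with "exp" set (assumed claim name)."""
    updated = dict(payload)
    updated["exp"] = expiry_timestamp(minutes, now)
    return updated


print(expiry_timestamp())
```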

Edit myaccesstoken.sh with the new myprivate.pem

#!/bin/bash

#date -v +120M "+%s"
#update payload.json

rm -f body.b64
openssl base64 -in payload.json -A | tr '/+' '_-' | tr -d '=' > body.b64

rm -f jwt.txt
cat head.b64 <(echo '.') body.b64 | tr -d "\n" > jwt.txt

rm -f sig.sha256.b64
openssl dgst -sha256 -sign myprivate.pem jwt.txt | openssl base64 -A | tr '/+' '_-' | tr -d '=' > sig.sha256.b64

ASSERTION=`cat jwt.txt <(echo '.') sig.sha256.b64 | tr -d "\n"`

curl -s -XPOST -H "Content-Type:application/x-www-form-urlencoded" -d "grant_type=client_credentials&client_assertion_type=urn%3Aietf%3Aparams%3Aoauth%3Aclient-assertion-type%3Ajwt-bearer&client_assertion=$ASSERTION"  https://oidc.nersc.gov/c2id/token

Run

./myaccesstoken.sh

and that will print the new access token string

Put the access token in the requests_nersc.py in the same local directory

import requests
import json
from pprint import pprint

def get_headers( access_token ):
    headers = {}
    headers['content-type'] = 'application/x-www-form-urlencoded'
    headers['Authorization'] = 'Bearer %s'%access_token
    return headers

def get_headers2( access_token ):
    headers = {}
    headers['content-type'] = 'application/x-www-form-urlencoded'
    headers['Authorization'] = access_token
    return headers

my_access_token = "new access token string here"

my_header = get_headers2( my_access_token )
p_url = 'https://api.nersc.gov/api/v1/accounting/projects/m3249/jobs'
r = requests.request(method = 'GET', url=p_url, headers=my_header)
returndict = json.loads(r.text)
pprint( returndict )

And finally run

python requests_nersc.py

https://api.nersc.gov/api/v1/

is supposed to show us all the details of the SF API.

hyunwoo18 commented 3 years ago

This week, I worked on the second part of this issue, namely exploring the SF API with the generated access token. I believe their newest API version is https://api.nersc.gov/api/v1.2/ and there I see two APIs that were not available in v1.0 in November:

/account/projects/
/account/projects/{repo_name}/jobs

Neither of these seems to return results equivalent to those from the NEWT URL https://newt.nersc.gov/newt/account/iris, which is used in NerscAllocationInfo.py in decisionengine_modules/NERSC/sources.

I opened a new NERSC service ticket, INC0165033, and asked a follow-up question. Steve is cc'ed there. Let's see how they respond to this new ticket.

hyunwoo18 commented 3 years ago

I am reviewing current DE Nersc code one more time

First, NerscAllocationInfo.py

decisionengine_modules/NERSC/sources/NerscAllocationInfo.py
def acquire => def send_query

for username in self.constraints.get("usernames", []):
     values = self.newt.get_usage(username)

This calls

https://newt.nersc.gov/newt/account/iris

The response from Nersc is

[
 {u'currentAlloc': 0.0,
  u'firstname': u'uscms',
  u'lastname': u'Pseudo User',
  u'middlename': u'',
  u'name': u'uscms',
  u'projectId': 67263,
  u'repoName': u'm3651_g',
  u'repoType': u'REPO',
  u'uid': 76521,
  u'usedAlloc': 0.0,
  u'userAlloc': 0.0,
  u'userAllocPct': 100.0
},

 {u'currentAlloc': 36007326000.0,
  u'firstname': u'uscms',
  u'lastname': u'Pseudo User',
  u'middlename': u'',
  u'name': u'uscms',
  u'projectId': 67263,
  u'repoName': u'm3651',
  u'repoType': u'REPO',
  u'uid': 76521,
  u'usedAlloc': 0.0,
  u'userAlloc': 0.0,
  u'userAllocPct': 100.0
},

 {u'currentAlloc': 378000000000.0,
  u'firstname': u'uscms',
  u'lastname': u'Pseudo User',
  u'middlename': u'',
  u'name': u'uscms',
  u'projectId': 54807,
  u'repoName': u'm2612',
  u'repoType': u'REPO',
  u'uid': 76521,
  u'usedAlloc': 54974735223.0,
  u'userAlloc': 0.0,
  u'userAllocPct': 100.0
},

 {u'currentAlloc': 0.0,
  u'firstname': u'uscms',
  u'lastname': u'Pseudo User',
  u'middlename': u'',
  u'name': u'uscms',
  u'projectId': 54807,
  u'repoName': u'm2612_g',
  u'repoType': u'REPO',
  u'uid': 76521,
  u'usedAlloc': 0.0,
  u'userAlloc': 0.0,
  u'userAllocPct': 100.0
},

 {u'currentAlloc': 0.0,
  u'firstname': u'fife',
  u'lastname': u'Pseudo User',
  u'middlename': u'',
  u'name': u'fife',
  u'projectId': 63322,
  u'repoName': u'm3249_g',
  u'repoType': u'REPO',
  u'uid': 79226,
  u'usedAlloc': 0.0,
  u'userAlloc': 0.0,
  u'userAllocPct': 100.0
},

 {u'currentAlloc': 270000000000.0,
  u'firstname': u'fife',
  u'lastname': u'Pseudo User',
  u'middlename': u'',
  u'name': u'fife',
  u'projectId': 63322,
  u'repoName': u'm3249',
  u'repoType': u'REPO',
  u'uid': 79226,
  u'usedAlloc': 0.0,
  u'userAlloc': 3600000000.0,
  u'userAllocPct': 10.0
}

]

Next NerscJobInfo.py

decisionengine_modules/NERSC/sources/NerscJobInfo.py
    def acquire(self):
        up_machines = [x for x in self.newt.get_status() if x['status'] == 'up']

This calls

https://newt.nersc.gov/newt/status/

And the response is

[
{u'status': u'up', u'system': u'cori'},
{u'status': u'up', u'system': u'edison'},
{u'status': u'up', u'system': u'pdsf'},
{u'status': u'up', u'system': u'genepool'},
{u'status': u'up', u'system': u'archive'}
]

Note that similar info can be obtained by using the following SF API

https://api.nersc.gov/api/v1.2/status

And the second part of NerscJobInfo.py has

        for m in machines:
            values = self.newt.get_queue(m)

which ends up calling

https://newt.nersc.gov/newt/queue/edison/

But the request is hanging.

hyunwoo18 commented 3 years ago

I updated Nersc ticket asking the following follow-up questions:

Hi Bjoern,

Currently we obtain usage information by calling https://newt.nersc.gov/newt/account/iris
and an example response looks like:

{u'currentAlloc': 270000000000.0,
u'firstname': u'fife',
u'lastname': u'Pseudo User',
u'middlename': u'',
u'name': u'fife',
u'projectId': 63322,
u'repoName': u'm3249',
u'repoType': u'REPO',
u'uid': 79226,
u'usedAlloc': 0.0,
u'userAlloc': 3600000000.0,
u'userAllocPct': 10.0
}

Currently by accessing https://newt.nersc.gov/newt/account/iris,
we can obtain this usage info for individual users (fife,dunepro,gm2,uscms).
And of this data structure, we are mostly interested in "currentAlloc and usedAlloc".

So, our first set of questions should be

Which SuperFacility API (https://api.nersc.gov/api/v1.2/) would provide the same information
(again we are mostly interested in "currentAlloc and usedAlloc")?

If the current version of SuperFacility API does not provide this type of information yet,
do you have future plan to do so?

Or if you do not plan to provide this information in SF API, can we continue
to access https://newt.nersc.gov/newt/account/iris?

Another question would be this:
When I access
https://api.nersc.gov/api/v1.2/account/projects
I am getting
<Response [403]>
{'message': "You don't have the permission to access the requested resource. "
'It is either read-protected or not readable by the server.'}

and when I access
https://api.nersc.gov/api/v1.2/account/projects/m2612(or m3249)/jobs
I am getting
<Response [200]>

Is this because I am using my personal Nersc Client credential?
What action is needed for me to get some meaningful information from
https://api.nersc.gov/api/v1.2/account/projects
https://api.nersc.gov/api/v1.2/account/projects/{repo_name}/jobs
?
Thanks very much!
HyunWoo

hyunwoo18 commented 3 years ago

This is working Python code that I have written based on my tests so far. It assumes that we have downloaded a Superfacility API client ID and private key (pem). With this information, it first generates a JWT and contacts a NERSC URL to retrieve an access token. It then uses this access token to access the regular Superfacility APIs.

import requests
import json
from pprint import pprint

from authlib.integrations.requests_client import OAuth2Session
from authlib.oauth2.rfc7523               import PrivateKeyJWT
import pem

token_url = "https://oidc.nersc.gov/c2id/token"
p_url = 'https://api.nersc.gov/api/v1.2/account/projects'
client_id = "xxxxx"

def get_access_token():
    certs = pem.parse_file( "uscms-key.pem" )
    private_key = str(certs[0])

    client = OAuth2Session(  client_id=client_id,  client_secret=private_key,   token_endpoint_auth_method="private_key_jwt"  )
    client.register_client_auth_method(  PrivateKeyJWT( token_url )  )
    resp = client.fetch_token( token_url, grant_type="client_credentials" )
    token = resp[ "access_token" ]
    return token

def get_headers2( access_token ):
    headers = {}
    headers['content-type'] = 'application/x-www-form-urlencoded'
    headers['Authorization'] = access_token
    return headers

my_access_token = get_access_token()
my_header = get_headers2( my_access_token )

r = requests.request(method = 'GET', url=p_url, headers=my_header)

returndict = json.loads( r.text )
pprint( returndict )

I will identify an appropriate place in the decision engine code to put this code.

Some restrictions:

  1. We need to repeat this process for each of the users in a project. The above code retrieves accounting info for a pseudo user named uscms. For another user such as fife, we will need its own private key and Client ID. Acquiring each user's private key can be done by any user in a project.

  2. For now, the lifetime of a private key is 30 days. We have asked for it to be extended from 30 days to several months so that we don't need to repeat the process of generating private keys every month.

  3. We also requested that they provide a way to get the total accounting info for a project.
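
Restriction 1 suggests keeping one client ID and key file per pseudo user and looping over them. A minimal sketch of that loop follows. The credential table, the file names and the fetch_token callable are all hypothetical; in practice fetch_token would wrap something like get_access_token above, parameterized by client ID and key path:

```python
def tokens_for_users(credentials, fetch_token):
    """Map each user to an access token.

    credentials: dict of user -> (client_id, key_path)
    fetch_token: callable(client_id, key_path) -> access token string
    """
    return {user: fetch_token(client_id, key_path)
            for user, (client_id, key_path) in credentials.items()}


# Hypothetical client IDs and key files
credentials = {
    "uscms": ("client-id-uscms", "uscms-key.pem"),
    "fife": ("client-id-fife", "fife-key.pem"),
}


def fake_fetch(client_id, key_path):
    # Stand-in for the real token request, so the sketch runs offline
    return f"token-for-{client_id}"


print(tokens_for_users(credentials, fake_fetch))
```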

mambelli commented 2 years ago

@hyunwoo18 Any follow-up on this?

hyunwoo18 commented 2 years ago

I have begun the actual coding on fermicloud571 (my new DE dev machine). I am doing some testing and will post an update in a couple of days.

hyunwoo18 commented 2 years ago

Here is my current testing version of the code:

[root@fermicloud571 sources]# pwd
/usr/lib/python3.6/site-packages/decisionengine_modules/NERSC/sources
[root@fermicloud571 sources]# cat NerscAllocationInfo.py
...........
# new imports
import requests
import json
from prometheus_client import Gauge

my_access_token = ..............

sourcenop_test_values = Gauge("sourcenop_test_values", "Test metric", labelnames=["key1"], multiprocess_mode="liveall")

def get_headers2( access_token ):
    headers = {}
    headers['content-type'] = 'application/x-www-form-urlencoded'
    headers['Authorization'] = access_token
    return headers

my_header = get_headers2( my_access_token )
p_url = 'https://api.nersc.gov/api/v1.2/account/projects'

@Source.produces(foo=pd.DataFrame)
class NerscAllocationInfo(Source.Source):

    def __init__(self, config):
        super().__init__(config)

        self.constraints = config.get('constraints')
........
        self.max_retries          = config.get("max_retries", _MAX_RETRIES)
        self.retry_backoff_factor = config.get("retry_backoff_factor",  _RETRY_BACKOFF_FACTOR)
#        self.newt = newt.Newt( config.get("passwd_file").....
        self.logger = self.logger.bind(class_module=__name__.split(".")[-1], )

    def get_projects(self, p_url):
        r = requests.request(method = 'GET', url=p_url, headers=my_header)
        returndict = json.loads(r.text)
        return returndict

    def send_query(self):
        results = []
        for username in self.constraints.get("usernames", []):
#            values = self.newt.get_usage(username)
            values = self.get_projects( 'https://api.nersc.gov/api/v1.2/account/projects' )  # self.logger.debug(f"= {values}" )

            for eachX in values:
                newEntry = {}
                newEntry['currentAlloc'] = eachX['project_hours_given']
                newEntry['usedAlloc']    = eachX['project_hours_used']
                newEntry['userAlloc']    = eachX['hours_given']
                newEntry['userAllocPct'] = (eachX['hours_given'] / eachX['project_hours_given']) * 100
                newEntry['firstname'] = 'uscms'
                newEntry['lastname'] = 'Pseudo User'
                newEntry['middlename'] = ''
                newEntry['name'] ='uscms'
                newEntry['projectId'] = 54807
                newEntry['repoName'] = eachX['repo_name']
                newEntry['repoType'] = 'REPO'
                newEntry['uid'] = 76521
                results.append( newEntry )

        newt_keys = self.constraints.get("newt_keys", {})

        for key, values in newt_keys.items():
            k = key
            if key == 'rname':
                k = 'repoName'
            if key == 'repo_type':
                k = 'repoType'
            if values:
                results = [x for x in results if x[k] in values]

        return results

    def acquire(self):

        self.send_query()

## temp begins: 
## HK> This section was introduced only for testing purpose
## because I am only copying the source from Nersc.jsonnet. The rest is from test_channel.jsonnet
        result = {
            "foo": pd.DataFrame(
                [
                    {"key1": "value1", "key2": 0.1},
                    {"key1": "value2", "key2": 2},
                    {"key1": "value3", "key2": "Test"},
                ]
            )
        }
        for _, row in result["foo"].iterrows():
            if isinstance(row["key2"], (float, int)):
                sourcenop_test_values.labels(row["key1"]).set(row["key2"])

        return result
## temp ends

Source.describe(NerscAllocationInfo)
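
The record mapping and the constraint filtering in send_query() can be pulled out into two small functions, which makes them testable without contacting the API. This is only a sketch: the function names `to_newt_entry` and `filter_entries` are hypothetical, the zero-allocation guard is an addition, and only the fields that the listing above reads from the API response are mapped (the hardcoded name, uid, and projectId fields are left out).

```python
# Aliases accepted in the "newt_keys" constraint, as in send_query() above.
KEY_ALIASES = {"rname": "repoName", "repo_type": "repoType"}


def to_newt_entry(sf_record):
    """Map one SF API project record onto the NEWT-style keys used above."""
    project_hours = sf_record["project_hours_given"]
    user_hours = sf_record["hours_given"]
    # Guard against a zero project allocation to avoid ZeroDivisionError.
    pct = (user_hours / project_hours) * 100 if project_hours else 0.0
    return {
        "currentAlloc": project_hours,
        "usedAlloc": sf_record["project_hours_used"],
        "userAlloc": user_hours,
        "userAllocPct": pct,
        "repoName": sf_record["repo_name"],
        "repoType": "REPO",
    }


def filter_entries(entries, newt_keys):
    """Keep entries whose value for each constrained key is in the allowed list."""
    for key, allowed in newt_keys.items():
        field = KEY_ALIASES.get(key, key)
        if allowed:  # an empty list means "no constraint on this key"
            entries = [e for e in entries if e.get(field) in allowed]
    return entries


# Record values taken from the cms_resource_request data block quoted later in this thread.
record = {
    "hours_given": 600000,
    "hours_used": 468013,
    "id": 54807,
    "project_hours_given": 600000,
    "project_hours_used": 468469,
    "repo_name": "m2612",
}
entries = filter_entries([to_newt_entry(record)], {"rname": ["m2612"]})
print(entries)
```

Unlike the original list comprehension, `filter_entries` uses `dict.get`, so an entry that lacks a constrained key is dropped instead of raising KeyError.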

StevenCTimm commented 2 years ago

Looks like you are going in the right direction.. have you been able to get it started up and see what de-client --print-product Nersc_Allocation_Info looks like?

hyunwoo18 commented 2 years ago

Steve, okay, I will see what I get from that command soon.

hyunwoo18 commented 2 years ago

Added a new section of code to sources/NerscAllocationInfo.py that loads a private key (from NERSC Iris), contacts their auth server, and finally downloads a fresh access token. I can test this access token against the SF API, but the API server seems to have been down all day today.

For this new code to work, I had to install the following new Python modules:

python3 -m pip install authlib
python3 -m pip install pem
python3 -m pip install PyJWT

hyunwoo18 commented 2 years ago

I submitted a new ticket with NERSC (INC0181426) to ask how to set the expiration for access tokens.

hyunwoo18 commented 2 years ago

Okay, there was some progress today.

I put all of SF API token related code in a new file called check_and_get.py (can be renamed later) and put it in NERSC/util/ directory. The main file NerscAllocationInfo.py loads it like

from decisionengine_modules.NERSC.util import check_and_get

and uses it as follows:

class NerscAllocationInfo(Source.Source):
    def __init__(self, config):
............
    # HK> New method
    def get_projects(self, p_url, atoken):
        my_header = check_and_get.get_headers2( atoken )
        r = requests.request(method = 'GET', url=p_url, headers=my_header)
        returndict = json.loads(r.text)
        return returndict

    def send_query(self):
        results = []
        for username in self.constraints.get("usernames", []):
            atoken = check_and_get.check_accesstoken( username )
            values = self.get_projects( check_and_get.p_url, atoken ) 

The new file, check_and_get.py is:

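import os
import time

import jwt
import pem
from authlib.integrations.requests_client import OAuth2Session
from authlib.oauth2.rfc7523 import PrivateKeyJWT

def check_accesstoken( nersc_user ):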

    currenttime = time.time()
    renew_bool = False

    rawfile_ucms = '/tmp/ucms_access.token'
    rawfile_fife = '/tmp/fife_access.token'

    pemfile_ucms = '/tmp/ucms-fnal.pem'
    pemfile_fife = '/tmp/fife-fnal.pem'

    token_url = "https://oidc.nersc.gov/c2id/token"

    client_id_ucms = "5o.........."
    client_id_fife = "i4........."

    rawfile = None
    pemfile = None
    client_id = None

    if nersc_user == 'uscms':
        rawfile = rawfile_ucms
        pemfile = pemfile_ucms
        client_id = client_id_ucms
    elif nersc_user == 'fife':
        rawfile = rawfile_fife
        pemfile = pemfile_fife
        client_id = client_id_fife
    else:
        print( "Unknown user, exiting" )
        return None

# HK> If an access token does not exist, we have to generate anyway.
    if not os.path.exists( rawfile ):
        print(f"{rawfile} does not exist. Need to generate")
        renew_bool = True
# HK>  If one does exist, we check the expiration
    else:
        atoken = None
        with open( rawfile, 'r' ) as afile:
            atoken = afile.read( )
            atoken = atoken.rstrip()#        print( atoken )

# Check the expiration
        try:
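            # verify_exp is set explicitly so the expiry check still runs with signature verification off
            result = jwt.decode(atoken, options={"verify_signature": False, "verify_exp": True}) #         pprint( result )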
#HK> If the access token is expired, the flow goes directly to except jwt.ExpiredSignatureError
            return atoken # This means the existing access token is not expired.
        except jwt.ExpiredSignatureError:
            print( "expired" )
            renew_bool = True

    if renew_bool:
        certs = pem.parse_file( pemfile )
        private_key = str( certs[0] )

        client = OAuth2Session( client_id=client_id, client_secret=private_key,
                     token_endpoint_auth_method="private_key_jwt" )
        client.register_client_auth_method(  PrivateKeyJWT( token_url )  )
        resp = client.fetch_token( token_url, grant_type="client_credentials" )

        newtoken = resp[ "access_token" ]
        print( newtoken )

        with open( rawfile, 'w' ) as myfile:
            myfile.write( newtoken)

        return newtoken
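
The expiry check in check_accesstoken() relies on PyJWT raising ExpiredSignatureError, which only happens when the `verify_exp` option is in effect. As far as I can tell from the PyJWT 2.x documentation, `verify_exp` defaults to the value of `verify_signature`, so the behavior can differ between PyJWT versions. A version-independent alternative is to read the `exp` claim directly. The sketch below uses only the standard library; the helper names `token_expiry` and `is_expired` and the 60-second leeway are hypothetical, and it does not verify the signature, so it is only suitable for deciding whether to renew a token we obtained ourselves.

```python
import base64
import json
import time


def token_expiry(token):
    """Return the 'exp' claim (seconds since the epoch) of a JWT, or None."""
    try:
        payload_b64 = token.split(".")[1]
        # JWT segments are base64url without padding; restore it before decoding.
        payload_b64 += "=" * (-len(payload_b64) % 4)
        payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    except (IndexError, ValueError):
        return None
    if not isinstance(payload, dict):
        return None
    return payload.get("exp")


def is_expired(token, now=None, leeway=60):
    """True if the token is unreadable, has no 'exp', or expires within leeway seconds."""
    exp = token_expiry(token)
    if exp is None:
        return True
    now = time.time() if now is None else now
    return now >= exp - leeway
```

With this in place, check_accesstoken() could return the cached token when `is_expired(atoken)` is false and fall through to the renewal branch otherwise.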

I will need to sit with Steve now to validate the results from this new code and compare with results from using current code.

mambelli commented 11 months ago

The code was merged to master (2.0) in PR #468 and to 1.7 in PR #470 (1.7.5).

StevenCTimm commented 11 months ago

Deployed in 1.7.5 and it works

StevenCTimm commented 11 months ago

For version 1.7.5 it was also necessary to make a SourceProxy. This is it:

cat NerscSFApiSourceProxy.py
from decisionengine.framework.modules import Source, SourceProxy

NerscSFApiSourceProxy = SourceProxy.SourceProxy
Source.describe(NerscSFApiSourceProxy)

StevenCTimm commented 11 months ago

Suggest this be committed back to git.

StevenCTimm commented 11 months ago

This also required the "pem" and "Authlib" pip libraries to be added to the machine, so we have to modify our Puppet configuration accordingly.

StevenCTimm commented 11 months ago

So only one more thing we need.

Found in channel cms_resource_request
+----+---------------+--------------+-------+-----------------------+----------------------+-------------+
|    |   hours_given |   hours_used |    id |   project_hours_given |   project_hours_used | repo_name   |
|----+---------------+--------------+-------+-----------------------+----------------------+-------------|
|  0 |        600000 |       468013 | 54807 |                600000 |               468469 | m2612       |

The data block doesn't include the user name, although it does include the user id. It would be very helpful if it could include the user name as well.

StevenCTimm commented 7 months ago

Fixed. Closing this.