culturecreates / artsdata-planet-iwts

IWTS stands for "I want to showcase"

2425-W-004 IWTS Artist data ETL #1

Open saumier opened 3 months ago

saumier commented 3 months ago

Work Order https://docs.google.com/document/d/1-agbLAyTtHIt2W-fTWsEWe--q6g_HRGnmAJd1yiaOws/edit#heading=h.phzh70cof3ne

### Tasks
- [ ] https://github.com/culturecreates/artsdata-planet-iwts/issues/2
- [ ] https://github.com/culturecreates/artsdata-planet-iwts/issues/3
- [ ] https://github.com/culturecreates/artsdata-planet-iwts/issues/4
- [ ] https://github.com/culturecreates/artsdata-planet-iwts/issues/5
- [ ] https://github.com/culturecreates/artsdata-planet-iwts/issues/6
saumier commented 2 months ago

Waiting for Frederic to communicate IWTS API so we can evaluate work.

SkipTay commented 2 months ago

Hi there. I have built a prototype for the API.

Here are the key notes. As of right now I am waiting on Ryan and Stephen to implement the new fields in the database.

I can provide the URL for API access and the command-line access. I am not sure whether this should be done here or by email. If this is a private project, please advise whether it is secure and I can provide these items here.

Here are my current notes from the file. /*

saumier commented 2 months ago

Need help from Skip to call API.

See email:

Hi Skip,

Thanks for sending me access to your API in development.

I tried a call from my local PC but I get the following error (I replaced the key you sent me with stars)

https://iwanttoshowcase.ca/apiiwts.php?api_key=** —> {"error": "Unauthorized access. Invalid API key.", "received_api_key": "****" }

I also get the same error message with curl:

curl -H "X-API-Key: ***" https://iwanttoshowcase.ca/apiiwts.php
—> {"error":"Unauthorized access. Invalid API key.","received_api_key":""}

How do you suggest I proceed?

Regarding restricting access: I don’t have a permanent IP address.

Doc from GitHub: "For scripted calling of IP addresses we are using GitHub workflows hosted in Azure and subsequently have the same IP address ranges as the Azure datacenters. Since there are so many IP address ranges for GitHub-hosted runners, we do not recommend that you use these as allowlists for your internal resources."

From what I can research, the domain github.com should allow you to restrict access. Perhaps we can test this after I figure out how to call your API from my local PC.

Regards, Gregory
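For reference, the two call styles tried in the email above (key as an `api_key` query parameter, key in an `X-API-Key` header) can be sketched in Python as below. This is only a sketch: the endpoint and the parameter and header names are taken from the email, while the function names and the `mask` helper are hypothetical, and nothing here confirms which style the API actually expects. The requests are built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

API_URL = "https://iwanttoshowcase.ca/apiiwts.php"  # endpoint quoted in the email


def request_with_query_key(api_key: str) -> Request:
    """Build a GET request that passes the key as the api_key query parameter."""
    return Request(f"{API_URL}?{urlencode({'api_key': api_key})}")


def request_with_header_key(api_key: str) -> Request:
    """Build a GET request that passes the key in the X-API-Key header."""
    return Request(API_URL, headers={"X-API-Key": api_key})


def mask(api_key: str, visible: int = 4) -> str:
    """Replace all but the last few characters with stars before quoting a key."""
    keep = min(max(visible, 0), len(api_key))
    return "*" * (len(api_key) - keep) + api_key[len(api_key) - keep:]
```

Either request can then be passed to `urllib.request.urlopen`. Comparing `len(api_key)` with the length of the `received_api_key` echoed in the error response is a quick way to spot a key that was truncated when it was pasted.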

SkipTay commented 2 months ago

Sorry, I just sent the full key. For some reason only part of the key was pasted in my first text message.

Now that you have the full key, let me know if you get the result.

Skip Taylor, Performing Arts Program Manager, Organization of Saskatchewan Arts Councils (OSAC), Phone: (306) 586-1253

In reply to the email from Gregory Saumier-Finch sent Tuesday, August 27, 2024 8:13 AM (quoted in full in the previous comment).

saumier commented 2 months ago

@SkipTay Data received from API

[
  {
    "performerid": 4,
    "performer_name": "Skip Taylor",
    "performer_country": "Canada",
    "performer_website1": "http://www.osac.ca/",
    "performance_category": "Music",
    "performance_subgenre1": "Cabaret",
    "performer_pronouns": "He/Him"
  }
]

@SkipTay Is this a complete set of properties or are you planning to add/remove properties? I am wondering when I should begin working on mapping these to schema.org and/or wikidata.org?

SkipTay commented 2 months ago

That is correct. Currently the API only sends 1 test artist. I am waiting for Ryan to complete his work and add the additional fields. Once the fields are added, I will update the API to send all opt-in artist profiles and all the requested fields.

SkipTay commented 2 months ago

Hi Gregory,

The new fields were added to the IWTS production tables. I have updated the API to include the new fields. I have sent the output to @fjjulien to confirm that it is as he expected, but if you want to review it now, you can use the same URL provided earlier.

Currently it is still just the one test performer, but the field and JSON structure should be there.

fjjulien commented 2 months ago

I reviewed the output against the IWTS Open Data Model document. Everything is good except:

  1. For performer_country, the stored value should be an ISO 3166-1 alpha-2 code. There are detailed instructions and links in the document on how it should be implemented. According to a comment by Ryan in the doc, this should have been implemented and older values should have been replaced by Stephen.
  2. For the performer_type, for interoperability purposes, the stored values should preferably be as per the Schema labels: Person, PerformingGroup, Organization. This said, this mapping can easily be done in Artsdata. Either way is fine. It's not as big a deal as the country two-letter codes.

Keep up the good work!
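The two review points above can be turned into a quick check on each record. The sketch below is illustrative only: the field names `performer_country` and `performer_type` and the three Schema labels come from this thread, the function name is hypothetical, and the country test only checks the two-uppercase-letter shape of an alpha-2 code, not whether the code is actually assigned in ISO 3166-1.

```python
import re

# Schema labels named in the review comment.
SCHEMA_TYPES = {"Person", "PerformingGroup", "Organization"}

# Shape of an ISO 3166-1 alpha-2 code: exactly two uppercase ASCII letters.
ALPHA2_SHAPE = re.compile(r"[A-Z]{2}")


def check_performer(record: dict) -> list[str]:
    """Return a list of human-readable problems found in one API record."""
    problems = []

    country = str(record.get("performer_country") or "")
    if not ALPHA2_SHAPE.fullmatch(country):
        problems.append(f"performer_country {country!r} is not a two-letter code")

    performer_type = record.get("performer_type")
    if performer_type not in SCHEMA_TYPES:
        problems.append(f"performer_type {performer_type!r} is not a Schema label")

    return problems
```

With the test record posted earlier in the thread, `"performer_country": "Canada"` would be reported, while `"CA"` would pass.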

SkipTay commented 2 months ago

I haven’t officially got word from Ryan that the work is complete. I suspect they may run a query to replace the existing country values this weekend. I should have known the schema bit. Again, I’ll wait until Ryan's work is complete before coding that. Stephen may already be coding those values for performer_type. I simply put a value in that field to be sure it was returned.

SkipTay commented 1 month ago

Hi all.

The fields are all working as expected now, and I believe the output is in the format you expect. I have sent Frederic an image of the output. Gregory, could you call the API and confirm that the output suits your needs? The 3 profiles currently included are test profiles, so they should not be loaded into OpenData, but they are marked with consent. Once you approve the output I will remove consent from these 3 profiles, and we will have to wait for official consent from real profiles. I have attached a JPG of the JSON output for reference.

[Image: API JSON output]

fjjulien commented 1 month ago

As I wrote over email, the output looks as agreed upon in the open data model. 👍

saumier commented 1 month ago

@fjjulien @SkipTay I'll review the API output with my team and complete the WO estimate.

saumier commented 1 month ago

@dev Please take a look at the output from the IWTS API and estimate an ETL into Artsdata. Here is a reference doc https://docs.google.com/document/d/14SHgIWuItkp8lnmeculEctfyLc0YV6CVuZRX4e8h0q0/edit but please estimate only the first step which is to load the data as-is using the same property names as the API with prefix https://iwanttoshowcase.ca/vocabulary#
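To make the first step concrete, here is one possible way to load the records as-is: wrap the API output in JSON-LD with `@vocab` set to the prefix above, so every API property name expands to `https://iwanttoshowcase.ca/vocabulary#<name>` without renaming anything. This is a sketch, not the agreed design. The subject URI pattern (`SUBJECT_BASE`) and the function name are hypothetical; only the vocabulary prefix and the `performerid` field come from this thread.

```python
import json

VOCAB = "https://iwanttoshowcase.ca/vocabulary#"  # prefix given in the comment above

# Hypothetical URI pattern for performers; the real identifier scheme is not
# defined in this thread.
SUBJECT_BASE = "https://iwanttoshowcase.ca/performer/"


def to_jsonld(records: list[dict]) -> dict:
    """Wrap API records in JSON-LD, keeping the API property names unchanged."""
    return {
        "@context": {"@vocab": VOCAB},
        "@graph": [
            {"@id": f"{SUBJECT_BASE}{record['performerid']}", **record}
            for record in records
        ],
    }


if __name__ == "__main__":
    # Test record posted earlier in the thread (abridged).
    sample = [{"performerid": 4, "performer_name": "Skip Taylor"}]
    print(json.dumps(to_jsonld(sample), indent=2))
```

Because of `@vocab`, a JSON-LD processor reads `performer_name` as `https://iwanttoshowcase.ca/vocabulary#performer_name`, which matches the as-is loading described above. Mapping to schema.org can then be done later as a separate step.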

SkipTay commented 1 month ago

Once we reconcile, the API will also include the Open Data unique identifier and the Wikidata unique identifier. But that will still take a bit of development on our side.

fjjulien commented 4 weeks ago

Note: I already mapped IWTS's Performance Category vocabulary to Artsdata and Wikidata. The mapping is available in this spreadsheet.