Open amy-silcock opened 3 years ago
Just bumping this -- it would be great to make this mapping publicly available, e.g. for tracking the recent change from ec-devco to ec-intpa.
I understand this work is starting up soon, which is great.
@jodiegardiner Can I confirm that the plan is to create two public mappings – one for registry identifiers (e.g. ec-echo) and one for org IDs (e.g. XI-IATI-EC_ECHO)? Thanks
Hi @andylolz,
That's the understanding right now yes. We'll be on this one early next week.
Right now I'm thinking to just have a straightforward mappings.json available on a public url. Is that sufficient for the requirements?
That sounds fine, yes. A couple of questions:
{
  "registry IDs": [
    {
      "before": "agriprofocus",
      "after": "nfp",
      "date": "2021-09-03"
    },
    {
      "before": "lftwnl",
      "after": "syf",
      "date": "2021-06-09"
    },
    {
      "before": "ec-devco",
      "after": "ec-intpa",
      "date": "2021-04-30"
    }
  ],
  "organisation IDs": [
    {
      "before": "47111",
      "after": "XM-DAC-47111",
      "date": "2021-09-02"
    },
    {
      "before": "BE-BCE_KBO-0453475391",
      "after": "BE-BCE_KBO-0453975341",
      "date": "2021-06-26"
    },
    {
      "before": "XI-IATI-EC_DEVCO",
      "after": "XI-IATI-EC_INTPA",
      "date": "2021-04-30"
    },
    {
      "before": "46004",
      "after": "XM-DAC-46004",
      "date": "2021-04-29"
    },
    {
      "before": "30001",
      "after": "CH-FDJP-CHE-110347351",
      "date": "2021-02-20"
    }
  ]
}
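For anyone consuming a file shaped like the one above, here is a minimal sketch of how a tool might resolve an old identifier to its current value. It assumes the two top-level keys and the before/after fields exactly as shown; the function name `resolve_current_id` and the trimmed `MAPPINGS` sample are illustrative only, not part of any registry API.

```python
import json

# Trimmed copy of the mapping structure proposed above (three entries only).
MAPPINGS = json.loads("""
{
  "registry IDs": [
    {"before": "ec-devco", "after": "ec-intpa", "date": "2021-04-30"}
  ],
  "organisation IDs": [
    {"before": "XI-IATI-EC_DEVCO", "after": "XI-IATI-EC_INTPA", "date": "2021-04-30"},
    {"before": "47111", "after": "XM-DAC-47111", "date": "2021-09-02"}
  ]
}
""")


def resolve_current_id(mappings, kind, identifier):
    """Follow before -> after entries until no further change is recorded.

    `kind` is one of the top-level keys of the mapping file. Identifiers
    that never changed are returned untouched.
    """
    renames = {entry["before"]: entry["after"] for entry in mappings[kind]}
    seen = set()  # guards against an accidental rename loop in the data
    while identifier in renames and identifier not in seen:
        seen.add(identifier)
        identifier = renames[identifier]
    return identifier


current_org = resolve_current_id(MAPPINGS, "organisation IDs", "XI-IATI-EC_DEVCO")
current_slug = resolve_current_id(MAPPINGS, "registry IDs", "ec-devco")
```

Following the chain rather than doing a single lookup means an identifier that changed more than once would still resolve to its latest value.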
> Right now I'm thinking to just have a straightforward mappings.json available on a public url. Is that sufficient for the requirements?
As mentioned in https://github.com/devinit/D-Portal/issues/503#issuecomment-448370692, it would be great for a publisher to have a list of previously used identifiers included in their registry data. This seems like something important that the registry should be used to keep track of centrally for the good of everyone using IATI data.
@andreaszenasidi @jodiegardiner @andylolz @amy-silcock @notshi
We can include the details in the existing endpoint.
Only for publisher members:
/api/action/organization_show?id=<your publisher id>
However, I don't think it's possible to get the changes in organization IDs, because the new version of CKAN activity doesn't store the organization extra changes (it tracks only package extra changes). New ideas are always welcome :)
Please let me know if you want me to go ahead with this.
Adding the list of old publisher ids to the existing endpoint sounds good to me.
@andylolz @notshi please let Swaroop know if there are any concerns with this approach
Great to hear this is progressing. If done correctly, this will be extremely useful.
Can I reiterate the questions I asked above:
@gtkChop could you please address the above questions? Many thanks!
@andreaszenasidi @andylolz
Ideally, it should. However, we need to apply a limit of 20 to avoid load on the database. I doubt there are instances where a publisher id has changed more than 20 times.
Theoretically, yes. It's a select query, so it should be real-time.
I will work on staging and will let you know.
@andylolz @andreaszenasidi
This is the sample:
Request URL:
/api/action/organization_show?id=cavwoc&show_historical_publisher_ids=true
Great - thank you. Sample output and request looks good to me.
What do you think, @notshi?
Thanks, @andylolz for being on the ball on this!
This looks like changes of slug names, i.e. CKAN internal ids, and not publisher_iati_ids, which is what we need.
Theoretically, not sure how useful this is as the slug names are not used in IATI data. Am I missing something?
There needs to be some distinction between CKAN and IATI terms.
@gtkChop could you confirm please if the _oldname field in the response is indeed the Publisher Id field from the registry account? I believe @notshi was referring to this in the above comment.
If it is, then please go ahead and implement this. Thanks!
@notshi is correct – publisher_iati_id here refers to the publisher’s organisation identifier, or "org ID". (On the registry frontend, this is referred to as the “Identifier”.) I think it’s also what @gtkChop was referring to as the organization ID here:
> However, I don't think it's possible to get the changes in organization IDs, because the new version of CKAN activity doesn't store the organization extra changes (it tracks only package extra changes). New ideas are always welcome :)
In the API response, name, old_name and historical_publisher_id all refer to the registry slug for the publisher (sometimes called the publisher ID).
@andreaszenasidi @andylolz @notshi
Yes, the old_name is confusing, old_name => previous publisher_iati_id's
To be clearer, please see the changes below:
@andreaszenasidi I am pushing this to prod and we can improve iteratively if needed. Thanks
> old_name => previous publisher_iati_id's
Apologies, I think that is incorrect. The publisher_iati_id is the org ID. In the example you posted, it’s MW-CNM-C057/1998. That’s an org ID using the MW-CNM prefix.
name and old_name refer to the registry slug (sometimes confusingly called the publisher ID). So I think renaming old_name to old_publisher_iati_id is actually more confusing than before.
@andylolz Yes you are right. Changing back to old_name.
@andreaszenasidi
Please see: https://iatiregistry.org/api/action/organization_show?id=pstc_1102&show_historical_publisher_names=true
To see historical publisher names - user should be a member of the given publisher or sysadmin
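As a rough illustration of how a client might use this, the sketch below builds the request URL above and pulls the old names out of a decoded response. The `success`/`result` envelope is standard for CKAN action API responses; the assumption that `result` carries a `historical_publisher_names` list whose items each have an `old_name` key is inferred from this thread and may not match the real payload. The function names and `sample_response` are hypothetical.

```python
from urllib.parse import urlencode

REGISTRY_BASE = "https://iatiregistry.org"


def historical_names_url(publisher, base=REGISTRY_BASE):
    """Build the organization_show URL with the historical-names flag set."""
    query = urlencode({"id": publisher, "show_historical_publisher_names": "true"})
    return f"{base}/api/action/organization_show?{query}"


def old_names(response):
    """Extract old_name values from a decoded organization_show response.

    ASSUMPTION: result["historical_publisher_names"] is a list of objects
    that each carry an "old_name" key. Adjust to the real payload.
    """
    if not response.get("success"):
        return []
    records = response.get("result", {}).get("historical_publisher_names", [])
    return [record["old_name"] for record in records if "old_name" in record]


# Hypothetical response, shaped as assumed above (not real registry output).
sample_response = {
    "success": True,
    "result": {
        "name": "new-slug",
        "historical_publisher_names": [{"old_name": "old-slug"}],
    },
}

url = historical_names_url("pstc_1102")
names = old_names(sample_response)
```

Fetching the URL (for example with `urllib.request.urlopen`) is left out so the sketch stays runnable offline.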
Is the plan to make this publicly available? I.e. is it temporarily private while @andreaszenasidi reviews?
@gtkChop I can now see the historical_publisher_names. Looks good to me. The only issue is that this information is only returned when I am logged in as a sysadmin to the Registry. This data should be available to the public, not only to the registry users.
@andreaszenasidi
I have made historical_publisher_names public in staging. However, there are a few things I would like to bring to your notice -
If all is good, I will deploy this to production.
@gtkChop looks good on staging. You can deploy it to production. Thank you!
@andreaszenasidi Done, deployed.
@gtkChop thank you!
@andylolz @notshi this was deployed. If you run into any issues please let us know.
Many thanks for the update and making this live. I can confirm that we have access to that page.
Just wondering - as per your example url, it looks like there are 4 old_name records for this publisher. However, 2 are repeated so technically there should only be 2 old_name records. Is this expected?
Also, will this url be updated and maintained (forever)?
Hi @andreaszenasidi @gtkChop,
Upon closer inspection, it looks like the old_names are CKAN slug names so this is not useful at all.
We need the publisher_iati_id, as was mentioned in this previous comment and the comments thereafter.
As such, please re-open this issue.
@notshi thanks for testing this! Seems like the name change activity was recorded at a different time, hence they appear multiple times. If this is not useful, we could return the earliest name change. For ex. for pstc_bgd_24493 we can just return the first occurrence?
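Until the endpoint itself filters repeats, a consumer could collapse them on its side. This is a minimal sketch that keeps only the first occurrence of each name, assuming the names arrive as a plain list ordered oldest first; the helper name and the sample values are made up for illustration.

```python
def first_occurrences(names):
    """Drop repeated names, keeping the first time each one appears."""
    # dict preserves insertion order, so fromkeys() keeps the earliest entry.
    return list(dict.fromkeys(names))


# Hypothetical input: four records, two of them repeats.
deduplicated = first_occurrences(["slug-a", "slug-b", "slug-a", "slug-b"])
```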
Could you please clarify where you see the slug names? These were indeed the old publisher id's for this publisher. If you check another publisher, for example https://iatiregistry.org/api/action/organization_show?id=ec-intpa&show_historical_publisher_names=true you can see that the old_names include ec_devco, ec-devco etc.
@andreaszenasidi
As per the example you've provided, you will get the following.
publisher_iati_id: "XI-IATI-EC_INTPA"
name: "ec-intpa"
old_name: "ec_devco"
old_name: "ec-devco"
name and old_name are both CKAN ids, aka the slug names of the publisher.
They also form the url of the dataset, i.e. https://iatiregistry.org/publisher/ec-intpa
As such, the historical publisher old names here only reflect changes when publishers change their Registry (CKAN id) url and not their IATI identifier.
We are only interested in the publisher_iati_id, as these are organisation identifiers used in IATI data outside of CKAN.
The original issue was to list the historical changes of IATI publisher (org) identifiers so that we can redirect old pages to their new ones on d-portal.
So for our example https://github.com/devinit/D-Portal/issues/503, there are no old_names found for this publisher as although they have changed their IATI identifier, they have not changed their CKAN id.
https://iatiregistry.org/api/action/organization_show?id=wvi&show_historical_publisher_names=true
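To make the distinction concrete, here is a small sketch of the two identifier types side by side, using the ec-intpa values quoted above. The `Publisher` class, the `find_by_org_id` helper and the `old_publisher_iati_ids` field are hypothetical: the last one stands in for the history that is being requested and does not exist in the current API response.

```python
from dataclasses import dataclass, field


@dataclass
class Publisher:
    name: str               # registry slug (CKAN id), forms the registry URL
    publisher_iati_id: str  # organisation identifier used in IATI data
    old_names: list = field(default_factory=list)  # previous slugs
    # Hypothetical field: previous organisation identifiers.
    old_publisher_iati_ids: list = field(default_factory=list)


def find_by_org_id(publishers, org_id):
    """Return the publisher whose current or previous org ID matches."""
    for publisher in publishers:
        if org_id == publisher.publisher_iati_id:
            return publisher
        if org_id in publisher.old_publisher_iati_ids:
            return publisher
    return None


# What the endpoint exposes today: slug history only.
slug_history_only = Publisher(
    name="ec-intpa",
    publisher_iati_id="XI-IATI-EC_INTPA",
    old_names=["ec_devco", "ec-devco"],
)

# What a redirect needs: org ID history as well.
with_org_history = Publisher(
    name="ec-intpa",
    publisher_iati_id="XI-IATI-EC_INTPA",
    old_names=["ec_devco", "ec-devco"],
    old_publisher_iati_ids=["XI-IATI-EC_DEVCO"],
)

miss = find_by_org_id([slug_history_only], "XI-IATI-EC_DEVCO")
hit = find_by_org_id([with_org_history], "XI-IATI-EC_DEVCO")
```

With slug history alone the old org ID cannot be matched (`miss` is `None`), which is why a page keyed on the old organisation identifier cannot be redirected.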
@notshi thanks for clarifying!
This ticket (and its predecessor, #218) is called:
Create list of changed org and publisher ids
(emphasis mine). I.e. two lists for the two types of IDs. Both of these lists would be useful.
The lack of consistent naming for these two different identifiers is a cause of confusion. It would be great to also clear that up.
@andreaszenasidi Please describe the actions that you expect to be addressed in this issue
@adrianoamaral the ticket had 2 requirements.
Currently, only the 1st requirement was implemented. Similar to the 1st list that is now available under the organization_show endpoint we need an additional section under this endpoint that will list all the historic IATI Organisation Identifier changes.
Work was done last year to create a mapping of publisher ids on the IATI Registry. Found here: https://iatiregistry.org/ckan-admin/iati-redirects Issue: https://github.com/IATI/ckanext-iati/issues/218
However, this is only available for Registry Sysadmin accounts. It's not available to various IATI tool providers like the d-portal devs.
We need two public mappings
This is what the current sysadmin mapping for publisher ids looks like:
See the discussion on user need here: https://github.com/devinit/D-Portal/issues/503