cern-sis / issues-scoap3

Need to adapt the script for Russian affiliations #168

Closed agentilb closed 1 year ago

agentilb commented 1 year ago

Russian affiliations are now presented in one of these ways:

"Affiliated with an institute or an international laboratory covered by a cooperation agreement with CERN", or
"Affiliated with an institute covered by a cooperation agreement with CERN", or
"Affiliated with an international laboratory covered by a cooperation agreement with CERN"

The script then wrongly translates the country to "CERN". This needs to be corrected. Can we create a country "UNLISTED" instead, if it doesn't already exist?

https://repo.scoap3.org/search?page=1&size=20&q=%22covered%20by%20a%20cooperation%20agreement%20with%20CERN%22

It seems that in some cases, the affiliation field is not even created: https://repo.scoap3.org/records/77862

ErnestaP commented 1 year ago

Hi Anne, I don't see which author is missing affiliations. I see 2877 authors and the same number of affiliation fields. If the affiliation is not found, we have a HUMAN CHECK. As for changing the value to UNLISTED, I believe we need to discuss it with @drjova or @pamfilos.

Speaking of affiliations, how should this one be translated? To Russia?

agentilb commented 1 year ago

Hi Ernesta,
No, the affiliation shouldn't be Russia!! (This is exactly what we need to avoid...) This is different from HUMAN CHECK, because from now on all Russian authors from CERN collaborations will be presented this way. So we need to find a stable way to translate this "country", and using HUMAN CHECK mixes them with the records that need to be checked. But yes, it is probably better to discuss this question by voice.

Anne

ErnestaP commented 1 year ago

Do you prefer to translate this exact affiliation to UNLISTED?

agentilb commented 1 year ago

My proposal was not to translate the AFFILIATION, but the COUNTRY. The affiliation can stay as it is.

If the affiliation contains: "Affiliated with an institute or an international laboratory covered by a cooperation agreement with CERN", or "Affiliated with an institute covered by a cooperation agreement with CERN", or "Affiliated with an international laboratory covered by a cooperation agreement with CERN" -> the country should be UNLISTED.
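
The rule proposed above could be sketched roughly as follows. This is only an illustration, not the actual SCOAP3 code: the names `CERN_AGREEMENT_PHRASES` and `country_for_affiliation` are hypothetical, and the already-detected country is assumed to be passed in as a plain string.

```python
# Hypothetical sketch of the proposed rule; names are assumptions,
# not taken from the SCOAP3 code base.
CERN_AGREEMENT_PHRASES = (
    "Affiliated with an institute or an international laboratory "
    "covered by a cooperation agreement with CERN",
    "Affiliated with an institute covered by a cooperation agreement with CERN",
    "Affiliated with an international laboratory "
    "covered by a cooperation agreement with CERN",
)


def country_for_affiliation(affiliation: str, detected_country: str) -> str:
    """Return "UNLISTED" when the affiliation contains one of the phrases.

    The affiliation text itself is left untouched; only the country changes.
    """
    if any(phrase in affiliation for phrase in CERN_AGREEMENT_PHRASES):
        return "UNLISTED"
    return detected_country
```

Note that this check is case-sensitive, so a differently capitalised variant of the same sentence would not match.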

ErnestaP commented 1 year ago

Yes, I understood. Maybe I put it the wrong way: change the affiliation.country to UNLISTED. Harris should be back from holidays next week. If it's urgent, we can ask @pamfilos for input on the decision as well :)

agentilb commented 1 year ago

Let's wait till Harris is back.

agentilb commented 1 year ago

Suggestion from Kamran: use N/A (NOT APPLICABLE) instead of UNLISTED.
This is more neutral.

drjova commented 1 year ago

We can just drop the country in authors if the affiliation matches exactly:

"Affiliated with an institute or an international laboratory covered by a cooperation agreement with CERN", or
"Affiliated with an institute covered by a cooperation agreement with CERN", or
"Affiliated with an international laboratory covered by a cooperation agreement with CERN"

drjova commented 1 year ago

@agentilb these are not enough: we have many corner cases, and since some of them contain "CERN", we assign CERN as the country. Is there any other systematic way to distinguish these cases?

agentilb commented 1 year ago

Can you give me an example of those corner cases? In principle, they should all have this string: "cooperation agreement with CERN". Does it help?

drjova commented 1 year ago

Example:

Affiliated with an Institute Covered by a Cooperation Agreement with CERN, Geneva, Switzerland

drjova commented 1 year ago

> In principle, they should all have this string: "cooperation agreement with CERN". Does it help?

I can do something with this, but if for some reason we use this string for something else, the country will not be populated there either, because this check will override all other cases.
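
A minimal sketch of the substring approach and the trade-off mentioned above. Everything here is an assumption for illustration: `guess_country` is a toy stand-in for the real country lookup, and `MARKER` and `country_for` are hypothetical names. The comparison is done in lower case so that the capitalised corner case is also caught.

```python
from typing import Optional

# Lower-cased marker, so the capitalised variant of the sentence also matches.
MARKER = "cooperation agreement with cern"


def guess_country(affiliation: str) -> Optional[str]:
    """Toy stand-in for the real country lookup (an assumption)."""
    if "CERN" in affiliation:
        return "CERN"
    if "Switzerland" in affiliation:
        return "Switzerland"
    return None


def country_for(affiliation: str) -> Optional[str]:
    """Return no country at all when the marker string is present."""
    # The marker check runs first, so it overrides every other rule:
    # any affiliation containing the marker ends up without a country.
    if MARKER in affiliation.lower():
        return None
    return guess_country(affiliation)
```

Because the marker check comes before the normal lookup, any future affiliation that happens to contain the same string would also lose its country, which is the risk being discussed here.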

agentilb commented 1 year ago

Checking the values in the repo, this is not the case for the time being. I believe we can take the risk.

drjova commented 1 year ago

@agentilb the fix has been deployed to prod.

agentilb commented 1 year ago

Hi Harris, Thanks a lot! Is it possible to correct all the articles that are already in the repo?

drjova commented 1 year ago

@agentilb I created a new task for that: "Fix author's with wrong affiliations".
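
For the records already in the repo, the correction might look something like the sketch below. It assumes a record is a plain dict with an `authors` list, each author carrying an `affiliations` list of dicts with `value` and `country` keys. That layout and the name `fix_record` are assumptions for illustration, not the real SCOAP3 schema or task.

```python
# Lower-cased marker string, compared against the lower-cased affiliation.
MARKER = "cooperation agreement with cern"


def fix_record(record: dict) -> int:
    """Drop "country" from matching affiliations in place.

    Returns the number of affiliations that were changed, so the caller
    can decide whether the record needs to be saved again.
    """
    changed = 0
    for author in record.get("authors", []):
        for affiliation in author.get("affiliations", []):
            matches = MARKER in affiliation.get("value", "").lower()
            if matches and "country" in affiliation:
                del affiliation["country"]
                changed += 1
    return changed
```

Returning a count keeps the function easy to use in a batch job: records where it returns 0 can be skipped without being rewritten.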