Open restrepo opened 1 year ago
The Google Scholar ids and the bibtex (and authors) are stored in separate fields of the database produced by moai's process (shown below):
{
_id: ObjectId("629115506f6a44dc7d69ac40"),
author: 'Westermann, Olaf and Förch, Wiebke and Thornton, Philip and Körner, Jana and Cramer, Laura and Campbell, Bruce',
profiles: { 'W Förch': 'HwyJZC0AAAAJ', 'P Thornton': 'Wx_me7EAAAAJ' },
bibtex: '@article{westermann2018scaling,\n' +
' title={Scaling up agricultural interventions: Case studies of climate-smart agriculture},\n' +
' author={Westermann, Olaf and F{\\"o}rch, Wiebke and Thornton, Philip and K{\\"o}rner, Jana and Cramer, Laura and Campbell, Bruce},\n' +
' journal={Agricultural Systems},\n' +
' volume={165},\n' +
' pages={283--293},\n' +
' year={2018},\n' +
' publisher={Elsevier}\n' +
'}\n'
},
{
_id: ObjectId("629115506f6a44dc7d69a764"),
author: 'Restrepo, Héctor F and Rondón, Martı́n and Rojas, Mar\\á X and Torres, Yolanda and Aschner, Pablo and Dennis, Rodolfo J',
profiles: {
'HF Restrepo': 'k1YkH44AAAAJ',
'M Rondón': 'cDXnenAAAAAJ',
'MX Rojas': 'hunwMEsAAAAJ'
},
bibtex: '@article{restrepo2010comparacion,\n' +
" title={Comparaci{\\'o}n de la funci{\\'o}n pulmonar de pacientes con diabetes mellitus tipo 2 sometidos a tratamiento de insulina inyectada versus tratamiento con hipoglucemiantes orales},\n" +
" author={Restrepo, H{\\'e}ctor F and Rond{\\'o}n, Mart{\\'\\i}n and Rojas, Mar{\\'\\i}a X and Torres, Yolanda and Aschner, Pablo and Dennis, Rodolfo J},\n" +
" journal={ActA M{\\'e}dicA coloMbiAnA},\n" +
' volume={35},\n' +
' number={3},\n' +
' pages={113--118},\n' +
' year={2010},\n' +
" publisher={Acta M{\\'e}dica Colombiana}\n" +
'}\n'
}
In the profiles variable, the names used as keys are always shortened to initials plus last name.
Right now I'm using thefuzz to relate the author field to the profiles keys (see https://github.com/colav/Kahi_plugins/blob/main/Kahi_works/kahi_works/Kahi_works.py#L607C1-L641C40).
To implement this in the new code for the independent kahi_scholar_works plugin, I would need a more precise definition of the mechanism used to relate the two and to rate the quality of the assigned id, since we cannot use the first names.
Possible methodology:
author={Restrepo, H{\\'e}ctor F and Rond{\\'o}n, Mart{\\'\\i}n and Rojas, Mar{\\'\\i}a X and Torres, Yolanda and Aschner, Pablo and Dennis, Rodolfo J},
to
author={Restrepo, Héctor F and Rondón, Martín and Rojas, María X and Torres, Yolanda and Aschner, Pablo and Dennis, Rodolfo J},
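The conversion from the LaTeX-escaped author field to plain Unicode could be sketched as below. This is only a minimal illustration, not the plugin's code: `latex_to_unicode` is a hypothetical helper, it handles only the braced accent forms seen in the records above (`{\'e}`, `{\"o}`, `{\'\i}`), and a full BibTeX/LaTeX decoder would need to cover many more commands.

```python
import re
import unicodedata

# LaTeX accent command -> Unicode combining mark (a few common ones only).
COMBINING = {
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    '"': "\u0308",  # diaeresis
    "~": "\u0303",  # tilde
    "^": "\u0302",  # circumflex
}

# Matches braced accents such as {\'e} and {\"o}; the optional backslash
# before the letter covers the dotless-i form {\'\i}.
ACCENT = re.compile(r"""\{\\(['`"~^])\\?([A-Za-z])\}""")


def latex_to_unicode(text: str) -> str:
    """Replace braced LaTeX accents with precomposed Unicode characters."""
    def repl(match: re.Match) -> str:
        mark, letter = match.group(1), match.group(2)
        # Compose letter + combining mark into a single code point (NFC).
        return unicodedata.normalize("NFC", letter + COMBINING[mark])
    return ACCENT.sub(repl, text)


raw = r"Restrepo, H{\'e}ctor F and Rond{\'o}n, Mart{\'\i}n and Rojas, Mar{\'\i}a X"
print(latex_to_unicode(raw))
```

Normalizing to NFC matters here because the profiles keys would be compared against these strings, and a decomposed `í` does not compare equal to a precomposed one.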
keys = {"HF Restrepo": "Restrepo, Héctor F", "M Rondón": "Rondón, Martín", "MX Rojas": "Rojas, María X", "Y Torres": "Torres, Yolanda", "P Aschner": "Aschner, Pablo", "RJ Dennis": "Dennis, Rodolfo J"}
and use the full names as the keys of the profiles dictionary:
new_profiles = {keys[k]: profiles[k] for k in keys if profiles.get(k)}
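The re-keying idea above could be sketched end to end as follows. This is an illustrative sketch under stated assumptions: `short_key` and `rekey_profiles` are hypothetical names, the author string is assumed to be already decoded to Unicode in "Last, Given" form and joined by " and ", and the short form is assumed to be the given-name initials followed by the last name, as in the profiles keys shown earlier.

```python
def short_key(full_name: str) -> str:
    """Build the initials + last name form, e.g. 'Rojas, María X' -> 'MX Rojas'."""
    last, _, given = full_name.partition(",")
    initials = "".join(part[0].upper() for part in given.split())
    return f"{initials} {last.strip()}"


def rekey_profiles(authors: str, profiles: dict) -> dict:
    """Map full author names to scholar ids, keeping only authors that have a profile."""
    full_names = [name.strip() for name in authors.split(" and ")]
    keys = {short_key(name): name for name in full_names}
    return {keys[k]: profiles[k] for k in keys if k in profiles}


authors = ("Restrepo, Héctor F and Rondón, Martín and Rojas, María X "
           "and Torres, Yolanda and Aschner, Pablo and Dennis, Rodolfo J")
profiles = {
    "HF Restrepo": "k1YkH44AAAAJ",
    "M Rondón": "cDXnenAAAAAJ",
    "MX Rojas": "hunwMEsAAAAJ",
}
new_profiles = rekey_profiles(authors, profiles)
print(new_profiles)
```

An exact lookup like this would miss cases where Google Scholar abbreviates differently (dropped middle initials, compound last names), and two co-authors sharing initials and last name would collide on the same key, so a fuzzy fallback would probably still be needed for those.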
Whenever comparing a work with the corresponding record in Google Scholar, extract the google_id for all the authors in the work.
Obtain the full names in the Google Scholar record from the more comprehensive bibtex info.
Assign a quality to the type of normalized match according to the following hierarchy: