colav / impactu

Colav Impactu Issues and Documentation
BSD 3-Clause "New" or "Revised" License

[moai] bad quality names, implement bibtexparser #164

Open omazapa opened 3 months ago

omazapa commented 3 months ago

bibtexparser has not been implemented to parse author names, and the captured names are of poor quality.

omazapa commented 3 months ago

@restrepo bibtexparser has a bug that has not been fixed for years; it produces errors in accents on the letter i.

ex: author={Rueda-Ram{\\'\\i}rez, Diana and Rios-Malaver, Diana and Varela-Ram{\\'\\i}rez, Amanda and De Moraes, Gilberto J},

produces:

author: 'Rueda-Ramı́rez, Diana and Rios-Malaver, Diana and Varela-Ram\\ŕez, Amanda and De Moraes, Gilberto J',
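
The first garbled name looks like a dotless ı (U+0131) followed by a combining acute accent, a pair that Unicode normalization cannot compose into í. That reading is an assumption based on the output above, not something confirmed in bibtexparser's source. A minimal standard-library sketch of the effect, with a hypothetical repair helper (repair_dotless_i is not part of any library):

```python
import re
import unicodedata

# Assumption: the broken output holds dotless i (U+0131) + combining acute (U+0301).
dotless_pair = "\u0131\u0301"
dotted_pair = "i\u0301"

# NFC composes i + acute into a single code point, but leaves the dotless pair alone.
print(len(unicodedata.normalize("NFC", dotless_pair)))      # 2
print(unicodedata.normalize("NFC", dotted_pair) == "\u00ed")  # True


def repair_dotless_i(text):
    """Hypothetical helper: repair dotless-i + combining-mark sequences.

    A dotless i is swapped for a plain i only when a combining mark follows,
    so a legitimate standalone dotless i (e.g. in Turkish) is left untouched.
    """
    swapped = re.sub("\u0131(?=[\u0300-\u036f])", "i", text)
    return unicodedata.normalize("NFC", swapped)


print(repair_dotless_i("Rueda-Ram\u0131\u0301rez"))
```

This only covers the first failure mode shown above; the second one (Ram\\ŕez, where the accent ends up on the r) loses the i entirely and cannot be repaired after the fact.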

I found a workaround using pylatexenc.

ex:

import bibtexparser
from bibtexparser.bparser import BibTexParser
from bibtexparser.customization import convert_to_unicode
from pylatexenc.latex2text import LatexNodes2Text

bibtex= '@article{rueda2018colombian,\n'
bibtex+='  title={Colombian population of the mite Gaeolaelaps aculeifer as a predator of the thrips Frankliniella occidentalis and the possible use of an astigmatid mite as its factitious prey},\n'
bibtex+="  author={Rueda-Ram{\\'\\i}rez, Diana and Rios-Malaver, Diana and Varela-Ram{\\'\\i}rez, Amanda and De Moraes, Gilberto J},\n"
bibtex+='  journal={Systematic and Applied Acarology},\n' 
bibtex+='  volume={23},\n' 
bibtex+='  number={12},\n' 
bibtex+='  pages={2359--2372},\n' 
bibtex+='  year={2018},\n' 
bibtex+='  publisher={BioOne}\n' 
bibtex+='}\n'

bibparser = BibTexParser(add_missing_from_crossref=True)
bibparser.ignore_nonstandard_types = False
bibparser.homogenize_fields = True
bibparser.common_strings = True

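def fix_bibtex(entries):
    # Decode LaTeX escapes (e.g. {\'\i} -> í) in every field of every entry, in place.
    converter = LatexNodes2Text()  # build the converter once and reuse it
    for entry in entries:
        for key in entry.keys():
            entry[key] = converter.latex_to_text(entry[key])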

bparser = bibtexparser.loads(str(bibtex), bibparser)

fix_bibtex(bparser.entries)

print(bparser.entries)

[{'publisher': 'BioOne',
  'year': '2018',
  'pages': '2359–2372',
  'number': '12',
  'volume': '23',
  'journal': 'Systematic and Applied Acarology',
  'author': 'Rueda-Ramírez, Diana and Rios-Malaver, Diana and Varela-Ramírez, Amanda and De Moraes, Gilberto J',
  'title': 'Colombian population of the mite Gaeolaelaps aculeifer as a predator of the thrips Frankliniella occidentalis and the possible use of an astigmatid mite as its factitious prey',
  'ENTRYTYPE': 'article',
  'ID': 'rueda2018colombian'}]
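
With the accents decoded as in the output above, the author field still has to be split into individual people before the names are usable. The following is a rough standard-library sketch of that step, not bibtexparser's own name parser; split_authors is a hypothetical helper, and it assumes names are separated by "and" and written either as "Last, First" or "First Last". It does not handle "von" particles, "Jr" suffixes, or brace-protected names.

```python
import re


def split_authors(author_field):
    """Hypothetical helper: split a BibTeX author field into last/first pairs."""
    people = []
    # BibTeX separates authors with the word "and" surrounded by whitespace.
    for raw in re.split(r"\s+and\s+", author_field.strip()):
        raw = raw.strip()
        if not raw:
            continue
        if "," in raw:
            # "Last, First" form: everything before the first comma is the surname.
            last, first = (part.strip() for part in raw.split(",", 1))
        else:
            # "First Last" form: treat the final token as the surname (a simplification).
            first, _, last = raw.rpartition(" ")
        people.append({"last": last, "first": first})
    return people


authors = split_authors(
    "Rueda-Ramírez, Diana and Rios-Malaver, Diana and "
    "Varela-Ramírez, Amanda and De Moraes, Gilberto J"
)
for person in authors:
    print(person)
```

Keeping the surname and given names as separate fields makes it easier to spot records where the capture step left LaTeX markup or a broken accent in only one part of the name.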

This also has to be implemented in Moai.

omazapa commented 3 months ago

This was implemented as a patch in the db scholar_colombia_2024, but it has to be implemented in Moai, not in scholar person. This is a capture problem, so I am leaving it to Moai.