arademaker / SLattes

Semantic Lattes
Other
40 stars 15 forks source link

+Title: Semantic Lattes

+Author: Alexandre Rademaker

The purpose of this project is to bridging the gap between the Brazilian Curriculum Lattes plataform and [[http://en.wikipedia.org/wiki/Semantic_Web][Semantic Web]] technologies. As part of the project, we have already developed:

  1. A XSLT stylesheet that process XML Brazilian [[http://lattes.cnpq.br/][Curriculum Lattes]] files and generates [[http://en.wikipedia.org/wiki/Resource_Description_Framework][RDF]] data.

  2. An updated version of the Lattes XML DTD specification. The original DTD provided by CNPq was last updated in 2004.

  3. A XSLT stylesheet that process XML Latttes and generate an [[http://www.loc.gov/standards/mods/][MODS XML]] file.

Usage:

+BEGIN_EXAMPLE

xsltproc --stringparam ID LATTESID lattes.xsl CURRICULIUM.xml > CURRICULIUM.rdf

+END_EXAMPLE

where =LATTESID= is an unique identifier that will be used to create the =xml:base= URL if the XML doesn't have the =NUMERO-IDENTIFICADOR= attribute in the tag =CURRICULO-LATTES=. We also provide another stylesheet called =lattes-vivo.xsl= that generates a RDF [[http://vivoweb.org][VIVO]] compliance, that is, readdy to be ingest into a VIVO instalation.

In Unix/MacOS systems you can also run:

+BEGIN_EXAMPLE

xmllint --schema CurriculoLattes.xsd --noout CV.xml

+END_EXAMPLE

If one wants to compare what we changed from the original DTD provided by CNPQ, just use the github diff of commits.

We have also developed a XSLT stylesheet that transforms a XML Lattes into a XML in [[http://www.loc.gov/standards/mods/][MODS]] format.

You will need the following tools to actually obtain a BibTeX file from your XML Lattes:

To transform Lattes to BibTeX:

+BEGIN_EXAMPLE

xsltproc lattes2mods.xsl LATTES.xml > LATTES.mods xmllint --schema mods.xsd LATTTES.mods xml2bib -b -w LATTES.mods > LATTES.bib

+END_EXAMPLE

The second command above is not necessary. It is how one can validate the MODS XML produced by the transformation. To run this command, you will need to download the [[http://www.loc.gov/standards/mods/mods-schemas.html][mods.xsd]] first.

Using bibtool command, you can also fix the citekeys, sort the entries and merge BiBTeX files. For instance, the command below generate the NEW.bib file with keys like "2012:Rademaker.Hermann" and the entries will be sorted in the inverse order by these keys.

+BEGIN_EXAMPLE

bibtool -f "%4d(year):%n(author)" -s LATTES.bib > NEW.bib

+END_EXAMPLE