wikipathways / GPML2RDF

GPML2RDF converter
Apache License 2.0

include author information #33

Open egonw opened 7 years ago

egonw commented 7 years ago

Which is needed downstream for nanopublications, but actually for any system that uses the citation expectations for WikiPathways outlined at

DeniseSl22 commented 6 years ago

But this is not stored in the GPML, since it is only registered on the Website, right?

egonw commented 6 years ago


AlexanderPico commented 4 years ago

This may or may not be helpful, but here is how I've extracted relevant author information per pathway directly from the database:

SELECT page_title AS WPID, rev_user AS userID, user_name AS userName, user_real_name AS realName, COUNT(rev_user) AS editCount, MIN(rev_timestamp) AS firstEdit 
FROM revision LEFT JOIN page ON revision.rev_page = page.page_id 
LEFT JOIN user ON revision.rev_user = user.user_id 
WHERE page_namespace = 102 AND page_title LIKE 'WP%' 
GROUP BY page_title, rev_user;

For RDF purposes, however, it may be more prudent to wait for the author information to be added to the GPML and then pull it from there.
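To make the shape of the query result concrete, here is a minimal sketch that runs the same query against an in-memory SQLite database. SQLite is swapped in only so the example is self-contained. The three tables are cut-down stand-ins for the ones named in the query, every row is invented, and the `ORDER BY` is added purely to make the output order deterministic.

```python
import sqlite3

# Toy stand-ins for the tables used in the query; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER, page_namespace INTEGER, page_title TEXT);
CREATE TABLE user (user_id INTEGER, user_name TEXT, user_real_name TEXT);
CREATE TABLE revision (rev_id INTEGER, rev_page INTEGER, rev_user INTEGER,
                       rev_timestamp TEXT);
INSERT INTO page VALUES (1, 102, 'WP1'), (2, 0, 'Main_Page');
INSERT INTO user VALUES (10, 'ExampleUserA', 'Example Author A'),
                        (11, 'ExampleUserB', 'Example Author B');
INSERT INTO revision VALUES
  (100, 1, 10, '20080101000000'),
  (101, 1, 11, '20090101000000'),
  (102, 1, 10, '20100101000000'),
  (103, 2, 10, '20110101000000');
""")

# Same query as above, plus an ORDER BY (an addition for stable output).
QUERY = """
SELECT page_title AS WPID, rev_user AS userID, user_name AS userName,
       user_real_name AS realName, COUNT(rev_user) AS editCount,
       MIN(rev_timestamp) AS firstEdit
FROM revision LEFT JOIN page ON revision.rev_page = page.page_id
LEFT JOIN user ON revision.rev_user = user.user_id
WHERE page_namespace = 102 AND page_title LIKE 'WP%'
GROUP BY page_title, rev_user
ORDER BY firstEdit;
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Each result row is one (pathway, contributor) pair with that contributor's edit count and first edit timestamp, so ordering by `firstEdit` puts the original author first.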

AlexanderPico commented 4 years ago

Here are prototype ttl files with author strings and as wikidata entries (where available):

egonw commented 4 years ago

Ah, thanks for starting this. Some comments:


@prefix xsd:   <> .
@prefix gpml:  <> .
@prefix dc:    <> .
@prefix foaf:  <> .

        dc:creator            <> .

        a             foaf:Person ;
        foaf:name     "Nathan Salomonis" ;
        foaf:homepage <> .

And similarly for the ORCID part, I propose:

        owl:sameAs    <> ;
        foaf:name     "Alexander R. Pico" ;
        dc:identifier <> .
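As an illustration of the proposed layout, here is a minimal Python sketch that renders one `foaf:Person` block in this shape. It is not the converter's actual code: `person_block` and `escape_literal` are hypothetical helpers, the `example.org` IRIs are placeholders, and the prefix declarations are assumed to be emitted elsewhere.

```python
def escape_literal(text):
    """Escape backslashes and double quotes for a Turtle short string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def person_block(person_iri, name, homepage=None, same_as=None, identifier=None):
    """Render a foaf:Person description; optional properties are skipped if absent."""
    pairs = [("a", "foaf:Person"), ("foaf:name", f'"{escape_literal(name)}"')]
    if same_as:
        pairs.append(("owl:sameAs", f"<{same_as}>"))
    if identifier:
        pairs.append(("dc:identifier", f"<{identifier}>"))
    if homepage:
        pairs.append(("foaf:homepage", f"<{homepage}>"))
    # Join predicate/object pairs with " ;" and close the block with " ."
    body = " ;\n".join(f"        {pred:<20} {obj}" for pred, obj in pairs)
    return f"<{person_iri}>\n{body} .\n"


# Placeholder IRIs, for illustration only.
print(person_block("https://example.org/person/1", "Example Author",
                   homepage="https://example.org/home"))
```

Routing every name through one helper means the opening and closing quotes are always written together.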

AlexanderPico commented 4 years ago

Right. When this gets added to the wp ttl files, the pathway revision information will already be in hand (as it is currently). I can update the rest of the data in the prototype files to follow your proposal.

Question: should we not use gpml:author anywhere? Perhaps it's just not needed given the dc and foaf definitions already in use?

AlexanderPico commented 4 years ago

@egonw, @ariutta Changes made per your suggestions, combining all info into a single file per pathway. See new zip of ttls at and a sample here:

@prefix xsd:   <> .
@prefix gpml:  <> .
@prefix dc:    <> .
@prefix owl: <>.
@prefix foaf:  <> .

        dc:creator            <> , <> , <> , <> , <> , <> .

        a                    foaf:Person ;
        foaf:name            "Andra Waagmeester ;
        owl:sameAs           <> ;
        dc:identifier        <> ;
        foaf:homepage        <> .

        a                    foaf:Person ;
        foaf:name            "Thomas Kelder ;
        foaf:homepage        <> .

        a                    foaf:Person ;
        foaf:name            "Alexander Pico ;
        owl:sameAs           <> ;
        dc:identifier        <> ;
        foaf:homepage        <> .

        a                    foaf:Person ;
        foaf:name            "Kristina Hanspers ;
        foaf:homepage        <> .

        a                    foaf:Person ;
        foaf:name            "Egon Willighagen ;
        owl:sameAs           <> ;
        dc:identifier        <> ;
        foaf:homepage        <> .

        a                    foaf:Person ;
        foaf:name            "Denise Slenter ;
        owl:sameAs           <> ;
        dc:identifier        <> ;
        foaf:homepage        <> .

egonw commented 4 years ago

Nice! I do spot a missing quote at the end of the name:

foaf:name            "Andra Waagmeester ;

AlexanderPico commented 4 years ago

Good catch! Fixed and reuploaded to
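
A missing closing quote like the one spotted above can be caught before uploading with a quick line-based check. The following is a rough heuristic sketch, not a Turtle parser: `lines_with_unbalanced_quotes` is a hypothetical helper, and it assumes short string literals that start and end on the same line, so triple-quoted multi-line literals would be flagged incorrectly.

```python
import re


def lines_with_unbalanced_quotes(turtle_text):
    """Return (line_number, line) pairs with an odd number of unescaped double quotes."""
    bad = []
    for number, line in enumerate(turtle_text.splitlines(), start=1):
        # Count double quotes that are not preceded by a backslash.
        unescaped = re.findall(r'(?<!\\)"', line)
        if len(unescaped) % 2 == 1:
            bad.append((number, line.strip()))
    return bad


sample = '''
        a                    foaf:Person ;
        foaf:name            "Andra Waagmeester ;
        foaf:name            "Nathan Salomonis" ;
'''

for number, line in lines_with_unbalanced_quotes(sample):
    print(f"line {number}: {line}")
```

For a real validation step, loading the file with a proper Turtle parser would be more reliable than this heuristic.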