ceurws / lod

Anything we need to maintain the Linked Open Data (LOD) publication of CEUR-WS.org

CEUR-WS metadata graph/tree procurement #22

Open WolfgangFahl opened 3 years ago

WolfgangFahl commented 3 years ago

Input:

see e.g. https://github.com/WolfgangFahl/ProceedingsTitleParser/blob/master/ptp/ceurws.py

WolfgangFahl commented 3 years ago

see

WolfgangFahl commented 3 years ago

see https://github.com/WolfgangFahl/ProceedingsTitleParser/blob/4284bc33a29479eba6332e02c7108176425dadc1/ptp/ceurws.py#L92 and the RDFa parser code in https://github.com/WolfgangFahl/ProceedingsTitleParser/blob/4284bc33a29479eba6332e02c7108176425dadc1/ptp/webscrape.py#L103
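For orientation, here is a minimal sketch of what RDFa-style extraction could look like using only the Python standard library. It is not the code from `webscrape.py`; the class `PropertyTextParser`, the function `extract_properties`, and the sample markup (including the placeholder `Vol-0000` and the property names) are hypothetical and only illustrate collecting the text of elements that carry a `property` attribute.

```python
from html.parser import HTMLParser

# Elements that never get a closing tag; they must not be pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class PropertyTextParser(HTMLParser):
    """Collect the text of every element that has an RDFa `property` attribute."""

    def __init__(self):
        super().__init__()
        self.records = {}  # property name -> list of text values
        self._stack = []   # one entry per open element: property name or None

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self._stack.append(dict(attrs).get("property"))

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._stack:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Assign the text to the innermost enclosing element with a property.
        for prop in reversed(self._stack):
            if prop:
                self.records.setdefault(prop, []).append(text)
                break


def extract_properties(html):
    """Return a dict mapping property names to the list of texts found."""
    parser = PropertyTextParser()
    parser.feed(html)
    parser.close()
    return parser.records


# Hypothetical sample markup, NOT copied from a real CEUR-WS volume page.
SAMPLE = """
<div about="Vol-0000">
  <span property="bibo:shortTitle">EX 2021</span><br>
  <span property="dcterms:title">Example Workshop 2021</span>
  <span property="dcterms:issued">2021-01-01</span>
</div>
"""

if __name__ == "__main__":
    for name, values in extract_properties(SAMPLE).items():
        print(name, values)
```

The resulting dictionary could serve as the input record for the page generation discussed below; a real volume page will likely need additional handling (nested properties, `content` attributes, malformed HTML).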

WolfgangFahl commented 3 years ago

Using the py-3rdparty-mediawiki library code, it should be possible to create pages for each volume in a systematic way. The template for the pages needs to be specified!

For copyright reasons we'll have to start the trial from the most current volumes and work our way backwards. We'll start with a few dozen pages.

CC0 is available from around Volume 15xx onward.

WolfgangFahl commented 3 years ago

A Jinja2 template could describe the page.
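A minimal sketch of that idea, assuming the `jinja2` package is installed. The page layout, the field names (`number`, `title`, `editors`), and the sample values are placeholders made up for illustration; the actual template still has to be specified.

```python
from jinja2 import Template

# Hypothetical wikitext layout for one volume page.
PAGE_TEMPLATE = Template(
    "= {{ volume.title }} =\n"
    "* Volume: Vol-{{ volume.number }}\n"
    "{% for editor in volume.editors %}"
    "* Editor: {{ editor }}\n"
    "{% endfor %}"
)


def render_volume_page(volume):
    """Render the wikitext of a single volume page from a metadata dict."""
    return PAGE_TEMPLATE.render(volume=volume)


if __name__ == "__main__":
    # Placeholder record, not real CEUR-WS metadata.
    example = {
        "number": "0000",
        "title": "Example Workshop 2021",
        "editors": ["Editor One", "Editor Two"],
    }
    print(render_volume_page(example))
```

One thing to watch: MediaWiki template calls also use `{{ ... }}`, which collides with Jinja2's default variable delimiters. If the wiki pages are supposed to call MediaWiki templates, the Jinja2 delimiters can be changed via `jinja2.Environment(variable_start_string=..., variable_end_string=...)`, or the literal braces can be escaped in the template.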

WolfgangFahl commented 3 years ago

Output should look like https://confident.dbis.rwth-aachen.de/ceur-ws/index.php?title=Vol-2801, taken from the HTML as shown in the talk page ...

WolfgangFahl commented 3 years ago

https://github.com/ailabitmo/sempubchallenge2014-task1 shows a solution from a few years ago, which is IMHO way too complex but might contain useful bits and pieces of code and background information.