gbv / jskos-data

Collection of knowledge organization systems encoded in JSKOS format
Creative Commons Zero v1.0 Universal

Add SfB and ASB #38

Open nichtich opened 2 years ago

nichtich commented 2 years ago

See the bachelor thesis https://nbn-resolving.org/urn:nbn:de:bsz:14-qucosa2-211701 as a starting point.

guitarster commented 2 years ago

Both SfB and ASB use a wiki to maintain their classification:

So converting them to JSKOS would mean a lot of copy-and-paste work, which is what the author of the bachelor thesis did (see page 50, last paragraph). Or is there a feasible way to scrape the wiki content with a script (e.g. written in Python)?

guitarster commented 2 years ago

https://www.crummy.com/software/BeautifulSoup/bs4/doc/ seems to be an option. I will try it out with ASB some time.

nichtich commented 2 years ago

MediaWiki has an API, e.g. https://www.sfb-online.de/wiki/api.php. In particular, the parse method returns the content of a page: https://www.sfb-online.de/wiki/api.php?action=help&modules=parse

example query:

Either use Wikitext, e.g.

curl -s 'https://www.sfb-online.de/wiki/api.php?action=parse&page=BID&prop=wikitext&format=json' | jq -r '.parse.wikitext[]'
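The same request can be made from Python using only the standard library. This is a minimal sketch, not a tested converter: the function names `build_parse_url`, `extract_wikitext` and `fetch_wikitext` are made up for illustration, and the response handling assumes the usual MediaWiki JSON layout, where the wikitext sits under `parse.wikitext` either as `{"*": "..."}` or, with `formatversion=2`, as a plain string.

```python
import json
import urllib.parse
import urllib.request

# Endpoint quoted in this thread; other wikis would need their own api.php URL.
API_URL = "https://www.sfb-online.de/wiki/api.php"


def build_parse_url(page, prop="wikitext", api_url=API_URL):
    """Build an action=parse URL with the same parameters as the curl call."""
    query = urllib.parse.urlencode(
        {"action": "parse", "page": page, "prop": prop, "format": "json"}
    )
    return f"{api_url}?{query}"


def extract_wikitext(response):
    """Return the wikitext string from a decoded action=parse response."""
    wikitext = response["parse"]["wikitext"]
    # Assumption: the default JSON format wraps the text as {"*": "..."},
    # while formatversion=2 returns a plain string. Handle both.
    if isinstance(wikitext, dict):
        return wikitext["*"]
    return wikitext


def fetch_wikitext(page, api_url=API_URL):
    """Download one wiki page and return its wikitext (needs network access)."""
    with urllib.request.urlopen(build_parse_url(page, api_url=api_url)) as resp:
        return extract_wikitext(json.load(resp))


if __name__ == "__main__":
    # "BID" is the page name used in the curl example above.
    print(fetch_wikitext("BID"))
```

Keeping the URL building and the response parsing separate from the download makes it possible to check both without hitting the wiki.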

Or fetch the HTML and process it with BeautifulSoup.
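For the HTML route, here is a rough sketch of pulling links out of a rendered page. To stay dependency-free it uses Python's built-in `html.parser` in place of BeautifulSoup, and it assumes, without having checked the wikis, that classification entries show up as ordinary `<a href="...">` links in the HTML returned by `action=parse` with `prop=text`. `LinkCollector` and `collect_links` are hypothetical names.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href="..."> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the link currently being read, if any
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def collect_links(html):
    """Return all (href, link text) pairs found in an HTML fragment."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

Which links actually represent classes, and how notation and label are separated in the link text, would still have to be worked out by looking at the real pages.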