hitontology / ontology

The Health IT Ontology.
https://hitontology.eu/
Creative Commons Zero v1.0 Universal

Add programming library/toolkit property #19

Closed KonradHoeffner closed 3 years ago

KonradHoeffner commented 4 years ago

On https://www.medfloss.org/, the "programming language" values aren't strictly programming languages; many of them are libraries or toolkits, like Swing. To accommodate those, find and add a suitable library/toolkit property.

KonradHoeffner commented 3 years ago

Commit a04f4bbb128ea2ddee12d273fae4d734531e4524 adds the property hito:programmingLibrary with the range rdfs:Resource.

A more specific range would be better, however, because with this range we cannot select any values for the property in the database frontend.

KonradHoeffner commented 3 years ago

As an example, in DBpedia there is no specific type for https://dbpedia.org/resource/React_(web_framework). https://dbpedia.org/page/Spring_Framework has type yago:WikicatWebApplicationFrameworks.

Options

  1. remove the property
  2. use string values
  3. create instances in HITO based on the ones in medfloss.org

I prefer option 3 because that would be the easiest to transfer from medfloss.org and also easier for the frontend users. For that we need access to the medfloss database or another representation of it. We have an account from them, so let's investigate whether we have some kind of mass data access option there.

KonradHoeffner commented 3 years ago

The medfloss account does not seem to provide database or other forms of mass access, so we need to create our own. We could also ask the site admins, but let's just try real quick whether a basic HTML scraper does the trick.

https://www.medfloss.org/node/62?items_per_page=All lists all projects.

With that page saved to the file medfloss, the following command lists all the project hrefs:

xmllint --recover --xpath '//div[@class="view-content"]//article/h2/a/@href' medfloss
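
As a rough sketch of the next scraping step: xmllint prints attribute nodes in the form href="...", so a small filter can turn that output into one URL per line. The function name hrefs_to_urls is made up for illustration, and it assumes that hrefs starting with "/" are relative to https://www.medfloss.org, which was not checked against the live page.

```shell
# Hypothetical helper: read xmllint attribute output on stdin
# (e.g. ' href="/node/1" href="/node/2"') and print one URL per line.
# Assumption: hrefs beginning with "/" are relative to https://www.medfloss.org.
hrefs_to_urls() {
  grep -o 'href="[^"]*"' \
    | sed -e 's/^href="//' -e 's/"$//' \
          -e 's|^/|https://www.medfloss.org/|'
}

# Possible usage, with "medfloss" being the saved project list page:
# xmllint --recover --xpath '//div[@class="view-content"]//article/h2/a/@href' medfloss | hrefs_to_urls
```

The grep -o step makes the filter independent of whether xmllint prints all attributes on one line or one per line.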

KonradHoeffner commented 3 years ago

OK that is actually not necessary because all of the values for programming languages and libraries are listed in the project wizard at https://www.medfloss.org/project-wizard:

curl https://www.medfloss.org/project-wizard > taxonomy
xmllint --recover --xpath '//ul[@id="facetapi-facet-search-apinodesfacettedsearch-block-taxonomy-vocabulary-12"]/li/a/text()' taxonomy | tee /tmp/t
Java (99)
C++ (38)
Python (25)
PHP (22)
JavaScript (20)
Qt (13)
C (12)
C# (9)
Swing (9)
.NET (6)
Perl (6)
Ruby on Rails (6)
OpenGL (5)
Ruby (5)
wxWidgets (5)
Angular JS (4)
Cocoa (4)
Django (4)
Grails (4)
Object Pascal (4)
Pascal (4)
Android SDK (3)
CSS (3)
HTML (3)
jQuery (3)
Maven (3)
Mono (3)
Objective C (3)
SWT (3)
Eclipse (2)
Groovy (2)
GTK (2)
GTK+ (2)
IDL (2)
JSP (2)
MATLAB (2)
Objective-C (2)
Phyton (2)
R (2)
wxPython (2)
XML (2)
XSLT (2)
#Gtk (1)
ActionScript 3 (1)
Ajax (1)
AndroMDA (1)
ASP.NET (1)
BOOST (1)
Bootstrap (1)
Camel (1)
CMGUI (1)
DCMTK (1)
Dojo (1)
Drupal (1)
Ember (1)
FileMaker Pro (1)
Flex (1)
GD Library (1)
HTML5 (1)
ITK (1)
JAI (1)
Java AWT (1)
Java Swing (1)
JBoss (1)
JBoss Drools (1)
Matplotlib (1)
MITK (1)
Mule ESB (1)
MUMPS (1)
Netgen (1)
Octave (1)
Opal (1)
Perl DBI (1)
PHP; JQuery (1)
PHP5 (1)
PIL (1)
Prototype.js (1)
pydicom (1)
PyQt4 (1)
Quartz (1)
Sencha (1)
Smarty (1)
Tcl (1)
Tryton (1)
Vala (1)
Visual Basic (1)
VTK (1)
Web2Py (1)
XACML (1)
Zope (1)

grep -o "^[^(]*" /tmp/t > languagelibrary gives us the entries without the occurrence counts; we still have to remove the programming languages.
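
One detail of the grep pattern: ^[^(]* matches everything before the opening parenthesis, so each entry keeps the trailing space that precedes the count. A possible alternative, sketched here with sed, removes the count and the surrounding whitespace in one step. The function name strip_counts is only illustrative.

```shell
# Illustrative helper: strip a trailing " (N)" occurrence count and any
# surrounding whitespace from each line read on stdin.
strip_counts() {
  sed -E 's/[[:space:]]*\([0-9]+\)[[:space:]]*$//'
}

# Possible usage with the file produced above:
# strip_counts < /tmp/t > languagelibrary
```

Entries containing spaces or symbols, such as "Ruby on Rails" or "#Gtk", pass through unchanged apart from the removed count.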

KonradHoeffner commented 3 years ago

Manually removing the programming languages and markup languages from the list leaves the libraries. Their labels and URIs are in the table https://docs.google.com/spreadsheets/d/1W0IgNZ2_lXc01pDYr601UspN2opafcsN9WpDToXdlSs.
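
The removal was done by hand, but part of it could be scripted. The following is a minimal sketch, assuming a hand-maintained file with one language name per line. Both that file and the function name remove_languages are hypothetical and not part of the repository.

```shell
# Hypothetical helper: print the lines of the list file ($1) that do not
# exactly match any line of the languages file ($2).
# -F: fixed strings (so "C++" is not read as a regex),
# -x: whole-line match (so "C" does not remove "C++" or "Cocoa"),
# -v: invert the match, -f: read patterns from a file.
remove_languages() {
  grep -vxFf "$2" "$1"
}

# Possible usage, with "languages" being the hand-maintained file:
# remove_languages languagelibrary languages > libraries
```

Near-duplicates and misspellings in the source data, such as "GTK" and "GTK+" or "Phyton", would still need a manual pass.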

KonradHoeffner commented 3 years ago

It's in the repository and on the SPARQL endpoint now; you can see the class and its instances at https://hitontology.eu/ontology/ProgrammingLibrary.
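
To list the instances from the command line, a query along the following lines might work. This is only a sketch: the function name instances_query is made up, the class IRI is passed as an argument because the exact IRI used in the data (for example http versus https) should be checked against the ontology, and the endpoint URL is not stated in this thread, so it appears only as a placeholder variable.

```shell
# Illustrative helper: print a SPARQL query that selects all instances of
# the class IRI given as $1, together with their rdfs:label if present.
instances_query() {
  printf 'SELECT ?instance ?label WHERE {\n'
  printf '  ?instance a <%s> .\n' "$1"
  printf '  OPTIONAL { ?instance <http://www.w3.org/2000/01/rdf-schema#label> ?label }\n'
  printf '}\n'
}

# Possible usage; SPARQL_ENDPOINT is a placeholder that must be set to the
# actual endpoint URL, which is an assumption here:
# curl -G -H 'Accept: text/csv' \
#   --data-urlencode "query=$(instances_query 'https://hitontology.eu/ontology/ProgrammingLibrary')" \
#   "$SPARQL_ENDPOINT"
```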