mintproject / ModelCatalog

A repository containing the resources needed to create a catalog of software models and link them together

Model catalog: Population and linking

This repository contains the resources necessary to populate and curate the model catalog. The repository is organized as follows (see the readme file in each subfolder for details on its contents):

Process to export the model catalog.

Execute exportModelCatalog.py. The config.yaml file indicates which graphs to export (each user has a graph; it is currently configured to extract the mint and texas graphs). As a result, the script writes a series of CSV files in which the first column identifies an instance, the header of each column names a property, and the cells in each row hold the corresponding values.
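
To illustrate the CSV layout described above, here is a minimal sketch that loads one exported file into a dictionary keyed by instance. The sample data and property names are hypothetical, and so is the assumption that an instance with several values for a property appears on several rows; check the actual output of exportModelCatalog.py before relying on it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample mimicking the described layout: the first column holds
# the instance and every other column header is a property name.
SAMPLE_CSV = """instance,hasLabel,usesUnit
model1,Model one,m
model1,Model one,kg
model2,Model two,
"""


def load_instances(handle):
    """Return {instance: {property: [values]}} from an exported CSV file object."""
    reader = csv.reader(handle)
    header = next(reader)
    properties = header[1:]  # header[0] labels the instance column
    instances = defaultdict(lambda: defaultdict(list))
    for row in reader:
        if not row:
            continue
        instance, cells = row[0], row[1:]
        for prop, value in zip(properties, cells):
            # Skip empty cells and values already recorded for this property.
            if value and value not in instances[instance][prop]:
                instances[instance][prop].append(value)
    return {name: dict(props) for name, props in instances.items()}


catalog = load_instances(io.StringIO(SAMPLE_CSV))
print(catalog["model1"]["usesUnit"])
```

To read a real export, pass an open file instead of the in-memory sample, for example `open(path, newline="")`.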

Process to populate the model catalog.

1) Execute CSVToRDF. Compile and run the Java project, which creates an initial version of the Turtle file with all contents from the Data folder integrated and linked. Point it to the folder produced by the export Python script.

2) Extract units from labels and connect to WikiData:

You need access to the CCUT repository, which contains the docker file: https://github.com/usc-isi-i2/mint-data-catalog/tree/ccut-dev

1. clone the mint-data-catalog github repository 
2. (move to ccut_docker branch)
3. cd to ccut_docker
4. build image
    docker build -t ccut_docker .
5. run image (a flask server)
    docker run -d -p 5000:5000 ccut_docker run -h 0.0.0.0 -p 5000
6. cd UnitToRDF (folder in this repository)
7. run the following query on https://endpoint.mint.isi.edu/ (selecting the model catalog) and download the results as JSON in the /run folder, replacing "Units.ttl":
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX mc: <https://w3id.org/mint/modelCatalog#>
    PREFIX dc: <http://purl.org/dc/terms/>

    SELECT distinct ?units where {
        ?a mc:usesUnit ?u.
        ?u rdfs:label ?units
    }
8. run python UnitToRDF. This is an interactive process: the program asks the user to decide when a term is ambiguous. It works by matching the Wikidata symbol with the symbol of the unit (exact match).
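
The downloaded file follows the standard SPARQL JSON results format, in which every row under `results.bindings` maps a variable name to an object with a `value` field. The sketch below reads the `?units` labels from such a file and applies an exact symbol match. It is not the UnitToRDF code: the sample results, the symbol table, and its placeholder identifiers (which are not real Wikidata IDs) are all assumptions made for illustration.

```python
import json

# Hypothetical excerpt of a SPARQL JSON result for the query above.
SAMPLE_RESULTS = json.dumps({
    "head": {"vars": ["units"]},
    "results": {"bindings": [
        {"units": {"type": "literal", "value": "m"}},
        {"units": {"type": "literal", "value": "kg"}},
        {"units": {"type": "literal", "value": "widgets"}},
    ]},
})

# Hypothetical symbol table: unit symbol -> candidate identifiers.
# The identifiers are placeholders, not real Wikidata IDs.
SYMBOL_TABLE = {
    "m": ["EXAMPLE_UNIT_A", "EXAMPLE_UNIT_B"],
    "kg": ["EXAMPLE_UNIT_C"],
}


def extract_unit_labels(results_json):
    """Return the ?units values from a SPARQL JSON results document."""
    data = json.loads(results_json)
    return [
        binding["units"]["value"]
        for binding in data["results"]["bindings"]
        if "units" in binding
    ]


def match_units(labels, symbol_table):
    """Split labels into matched, ambiguous, and unmatched by exact symbol match."""
    matched, ambiguous, unmatched = {}, {}, []
    for label in labels:
        candidates = symbol_table.get(label, [])  # exact, case-sensitive lookup
        if len(candidates) == 1:
            matched[label] = candidates[0]
        elif candidates:
            ambiguous[label] = candidates  # an interactive tool would ask here
        else:
            unmatched.append(label)
    return matched, ambiguous, unmatched


labels = extract_unit_labels(SAMPLE_RESULTS)
matched, ambiguous, unmatched = match_units(labels, SYMBOL_TABLE)
print(matched, ambiguous, unmatched)
```

Labels that land in `ambiguous` correspond to the cases where an interactive process would need the user to pick one candidate.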

3) Generate Scientific Variable Links.

  1. cd GSNVariableImport
  2. python gsnvariableimport.py i
     The "i" option makes the process interactive. If "a" is entered instead, the system will always pick the first definition found.
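
The difference between the two options can be sketched as follows. This is a hypothetical illustration, not the code in gsnvariableimport.py: the function name `pick_definition`, its parameters, and the sample definitions are invented, and skipping the prompt when only one definition exists is an assumption.

```python
def pick_definition(term, definitions, mode, ask=input):
    """Choose one definition for a term.

    mode "a": always return the first definition found.
    mode "i": ask the user which definition to use (assumed to be skipped
    when there is only one candidate).
    """
    if mode not in ("i", "a"):
        raise ValueError(f"unknown mode: {mode!r}")
    if not definitions:
        return None
    if mode == "a" or len(definitions) == 1:
        return definitions[0]
    for index, definition in enumerate(definitions):
        print(f"[{index}] {definition}")
    while True:
        answer = ask(f"Pick a definition for '{term}': ")
        try:
            choice = int(answer)
        except ValueError:
            choice = -1
        if 0 <= choice < len(definitions):
            return definitions[choice]
        print("Please enter one of the listed numbers.")


# Automatic mode needs no user input.
print(pick_definition("example term", ["definition A", "definition B"], "a"))
```

Passing the prompt function as `ask` keeps the interactive branch testable without a terminal.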