Lightweight ontology/DOSDP-aware CSV reader/writer

Example YAML: https://github.com/cmungall/environmental-conditions/blob/master/src/patterns/exposure_to_change_in_levels.yaml

Example CSV: https://github.com/cmungall/environmental-conditions/blob/master/src/ontology/modules/exposure_to_change_in_levels.csv

Together these generate this OWL (which is later reasoned over using robot)

The current workflow is for a person to edit the CSV and add new rows, cutting and pasting IDs. This may be done locally and committed, or drive-by using GitHub's plain-text web editor. This is not ideal for less experienced users.

We have two use cases:

1. 'TermGenie': the user needs to add a new row
2. Editor: the user needs to be able to edit, delete or add any row

The idea is to have a lightweight JS application (e.g. Angular), with no server-side component (beyond serving up static files), that takes 3 URLs:

github URL for csv
github URL for yaml
autocomplete server
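
As a rough sketch of how a static client might receive these three URLs, they could be passed as query-string parameters on the launch link. The parameter names `csv`, `yaml` and `autocomplete`, and the `LaunchConfig` type, are assumptions made for illustration and are not specified in this issue.

```typescript
// Hypothetical launch configuration; field names are illustrative only.
interface LaunchConfig {
  csvUrl: string;
  yamlUrl: string;
  autocompleteUrl: string;
}

// Parse a query string such as "?csv=...&yaml=...&autocomplete=..." into a config.
// Assumption: values are percent-encoded http(s) URLs.
function parseLaunchParams(query: string): LaunchConfig {
  const params: { [key: string]: string } = {};
  const trimmed = query.charAt(0) === "?" ? query.slice(1) : query;
  for (const pair of trimmed.split("&")) {
    if (pair === "") continue;
    const eq = pair.indexOf("=");
    const key = eq < 0 ? pair : pair.slice(0, eq);
    const value = eq < 0 ? "" : pair.slice(eq + 1);
    params[decodeURIComponent(key)] = decodeURIComponent(value);
  }
  // All three URLs are required; fail early with a readable message.
  for (const name of ["csv", "yaml", "autocomplete"]) {
    if (!/^https?:\/\//.test(params[name] || "")) {
      throw new Error("missing or invalid URL parameter: " + name);
    }
  }
  return {
    csvUrl: params["csv"],
    yamlUrl: params["yaml"],
    autocompleteUrl: params["autocomplete"],
  };
}
```

Keeping the configuration in the link means an ontology can advertise a ready-made launch URL without any server-side state.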

The user edits the CSV, with ontology IRIs filled in by autocomplete. The application would be highly similar to the old deskphenote, i.e. the CSV has no semantics as far as the client is concerned. Ideally the next step is to save to GitHub (master or PR depending on permissions). This requires OAuth2. As a first pass for v1, it is acceptable to simply let the user copy and paste the CSV into the GitHub file editor screen (use case 2) or paste new rows into a structured ticket (use case 1).

Most of the semantics in the YAML would be ignored (in the first version). It would primarily be used to drive autocomplete constraints. We also need to be flexible with autocomplete services, e.g. allow connecting to an arbitrary SPARQL server (i.e. regex-based, as @balhoff implemented for phenoscape) such as ontobee, or to an arbitrary scigraph or golr instance.
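
To make the regex-based SPARQL option concrete, here is a minimal sketch of a label-prefix query builder. It is not the phenoscape implementation; the function names and the query shape (matching on `rdfs:label` only) are assumptions, and a real endpoint may need graph restrictions or a text index for acceptable speed.

```typescript
// Escape regex metacharacters, then escape for a SPARQL double-quoted string literal.
function escapeForSparqlRegex(input: string): string {
  const regexSafe = input.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return regexSafe.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
}

// Build a query returning terms whose label starts with the typed text (case-insensitive).
// Hypothetical helper: the query shape is an illustration, not a fixed contract.
function buildLabelRegexQuery(typed: string, limit: number): string {
  const safeLimit = Math.max(1, Math.floor(limit));
  return [
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
    "SELECT DISTINCT ?term ?label WHERE {",
    "  ?term rdfs:label ?label .",
    '  FILTER regex(str(?label), "^' + escapeForSparqlRegex(typed) + '", "i")',
    "}",
    "LIMIT " + safeLimit,
  ].join("\n");
}
```

A scigraph or golr backend would sit behind the same kind of function boundary: text in, a list of CURIE and label pairs out, so the client does not need to know which service is in use.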

Any logic would be decoupled and server-side, e.g. triggered by travis or jenkins (we can imagine future iterations with more direct feedback). The server would also take care of things like replacing UUIDs with lastIRI+1 (TG use case).
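
The UUID replacement step could look roughly like the sketch below. It assumes minted IDs are CURIEs of the form `PREFIX:0000123` with a zero-padded numeric part, and that temporary client IDs end in a standard UUID; both are assumptions, as is the example prefix `ECTO` in the usage note.

```typescript
// Matches IDs that end in a canonical UUID (bare, or prefixed e.g. "urn:uuid:").
const UUID_RE = /[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Left-pad a non-negative integer with zeros to the given width.
function zeroPad(n: number, width: number): string {
  let s = String(n);
  while (s.length < width) s = "0" + s;
  return s;
}

// Replace each UUID-style ID with the next free numeric ID after the highest existing one.
// Assumption: `prefix` is purely alphanumeric, so it is safe inside a RegExp.
function assignSequentialIds(ids: string[], prefix: string, width: number = 7): string[] {
  const minted = new RegExp("^" + prefix + ":(\\d+)$");
  let last = 0;
  for (const id of ids) {
    const m = minted.exec(id);
    if (m) last = Math.max(last, parseInt(m[1], 10));
  }
  return ids.map((id) => {
    if (!UUID_RE.test(id)) return id; // already minted, or not a temporary ID
    last += 1;
    return prefix + ":" + zeroPad(last, width);
  });
}
```

For example, with existing IDs `ECTO:0000005` and `ECTO:0000009` plus two UUID rows, the UUID rows would become `ECTO:0000010` and `ECTO:0000011` in file order. Doing this server-side avoids two clients minting the same ID concurrently.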

CSV spec

arbitrary number of comments starting #

one header row

n data rows

column 1 (iri) is the IRI of the entry (typically a class). The client can generate UUIDs here.
column 2 (iri label) is the label of the entry. The user can leave it blank (it will be filled in using the default pattern from the YAML).
columns k, for 3 <= k < 3+2*|c|, are alternating pairs of VAR and VAR_label, where |c| is the number of vars. Each VAR must correspond to a var declared in the YAML. These will be filled in by autocomplete. CURIEs will be used.
subsequent columns (not for v1): any annotation property, e.g. synonym, definition.
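
The layout above can be sketched as a small reader that skips leading comments, checks the header against the vars declared in the YAML, and returns the data rows. Everything here is illustrative: the type and function names are made up, the label column is assumed to be named either `VAR label` or `VAR_label`, and fields are assumed not to contain embedded newlines.

```typescript
// Hypothetical result type; names are illustrative, not taken from the issue.
interface PatternTable {
  comments: string[]; // leading lines that start with '#'
  header: string[];   // the single header row
  vars: string[];     // VAR names recovered from the header
  rows: string[][];   // data rows, one string per column
}

// Split one CSV line into fields, honouring double-quoted fields ("" escapes a quote).
function splitCsvLine(line: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line.charAt(i);
    if (inQuotes) {
      if (ch === '"' && line.charAt(i + 1) === '"') { current += '"'; i++; }
      else if (ch === '"') { inQuotes = false; }
      else { current += ch; }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

function parsePatternCsv(text: string, declaredVars: string[]): PatternTable {
  const lines = text.split(/\r?\n/).filter((l) => l.trim() !== "");
  const comments: string[] = [];
  let i = 0;
  while (i < lines.length && lines[i].charAt(0) === "#") {
    comments.push(lines[i]);
    i++;
  }
  if (i >= lines.length) throw new Error("missing header row");
  const header = splitCsvLine(lines[i]).map((h) => h.trim());
  i++;
  if (header[0] !== "iri" || header[1] !== "iri label") {
    throw new Error("first two columns must be 'iri' and 'iri label'");
  }
  if ((header.length - 2) % 2 !== 0) throw new Error("VAR columns must come in pairs");
  const vars: string[] = [];
  for (let c = 2; c < header.length; c += 2) {
    const v = header[c];
    if (declaredVars.indexOf(v) < 0) throw new Error("undeclared var: " + v);
    const label = header[c + 1];
    if (label !== v + " label" && label !== v + "_label") {
      throw new Error("expected label column after " + v);
    }
    vars.push(v);
  }
  const rows: string[][] = [];
  for (; i < lines.length; i++) {
    const row = splitCsvLine(lines[i]);
    if (row.length !== header.length) throw new Error("wrong column count: " + lines[i]);
    rows.push(row);
  }
  return { comments, header, vars, rows };
}
```

This covers only the v1 columns; annotation-property columns would need the header check relaxed.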

Saving

The entire CSV can be saved and used as a PR (editor route)

Alternatively, if the operation is append only, then saving will generate a ticket to the appropriate tracker, with the CSV values entered in a structured block. A decoupled downstream agent will read these and act on them, perhaps based on whether an editor +1s them or labels them.
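
One way the structured block might be produced is sketched below. The `BEGIN-CSV` and `END-CSV` markers, the introductory line, and the function names are all hypothetical; the only requirement taken from the text above is that a downstream agent can locate and read the appended rows.

```typescript
// Quote a field only when it contains a comma, a double quote, or a newline.
function csvField(value: string): string {
  return /[",\n]/.test(value) ? '"' + value.replace(/"/g, '""') + '"' : value;
}

// Build the ticket body for an append-only save.
// Assumption: the marker lines are a made-up convention for the downstream agent.
function buildAppendTicketBody(csvUrl: string, header: string[], newRows: string[][]): string {
  const lines: string[] = [];
  lines.push("New rows for " + csvUrl);
  lines.push("");
  lines.push("BEGIN-CSV");
  lines.push(header.map(csvField).join(","));
  for (const row of newRows) {
    lines.push(row.map(csvField).join(","));
  }
  lines.push("END-CSV");
  return lines.join("\n");
}
```

Including the header row in the block lets the agent check that the ticket was generated against the current column layout before appending anything.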

Ideally we would have different launch buttons for the different routes. We would advertise the append button for TG users. Of course anyone can use the full edit and PR route, but it's entirely up to ontology owners whether to accept non-append PRs.

INCATools / intelligent-concept-assistant

Lightweight ontology/DOSDP-aware CSV reader/writer #2
