Open nichtich opened 1 year ago
@nichtich this is great, thank you. Would you be able to provide the converter yourself? We will review and integrate it of course, and help you if you have issues figuring out what goes where.
I started implementing the writer at https://github.com/gbv/sssom-py/commit/e3a217d2241674b2b368108929c36b446c1712ce, but how do I convert subject_id and object_id to full URIs? The same goes for creator_id (though I would recommend using full URIs for creator IDs anyway, because that is also easier when the data is created).
Look at the other converters: for external -> sssom, we have a utility, see for example Alignment API converter: https://github.com/mapping-commons/sssom-py/blob/d8992f42aabe78a8df5465b193f9d0fc63680eba/sssom/parsers.py#L801
For the other direction we mostly use LinkML, which itself uses the curies package (in case you want to do it manually).
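To illustrate what "doing it manually" could look like, here is a minimal stdlib-only sketch of CURIE expansion against a prefix map. The function name `expand_curie` and the example prefix map are assumptions made for illustration; they are not part of sssom-py, and the curies package mentioned above covers this more completely.

```python
from typing import Dict

# Hypothetical prefix map; in SSSOM this would come from the mapping set's curie_map.
PREFIX_MAP: Dict[str, str] = {
    "skos": "http://www.w3.org/2004/02/skos/core#",
    "orcid": "https://orcid.org/",
}


def expand_curie(value: str, prefix_map: Dict[str, str]) -> str:
    """Return the full URI for a CURIE, or the value unchanged if it cannot be expanded."""
    if value.startswith(("http://", "https://")):
        return value  # already a full URI
    prefix, sep, local_id = value.partition(":")
    if not sep or prefix not in prefix_map:
        return value  # no colon or unknown prefix: leave as-is instead of raising
    return prefix_map[prefix] + local_id


print(expand_curie("skos:exactMatch", PREFIX_MAP))
```

Returning the input unchanged for unknown prefixes is a design choice of this sketch; a stricter converter might prefer to raise an error so that missing prefixes are noticed early.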
How do I compare the output of a test conversion against the expected output? The unit tests seem to just convert, without checking the result.
@hrshdhgd will tell you details, but this is done through round tripping and "expected" serialisations in the test data folder.
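As a rough idea of how a check against an "expected" serialisation could work for a line-based JSON output, the sketch below compares two newline-delimited JSON documents by parsed content. The helper names `load_ndjson` and `same_mappings` are hypothetical and are not the test utilities used in sssom-py.

```python
import json
from typing import List


def load_ndjson(text: str) -> List[dict]:
    """Parse newline-delimited JSON, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def same_mappings(actual: str, expected: str) -> bool:
    """Compare two NDJSON documents by content, ignoring key order and blank lines."""
    return load_ndjson(actual) == load_ndjson(expected)
```

Comparing parsed objects rather than raw strings keeps the test stable when key order or whitespace changes between versions of the writer.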
BTW: Where can I find real-world SSSOM data, beyond the test files, as examples?
If you need more let me know:
I've finished the implementation of the JSKOS output format, tested via `sssom convert $tsvfile -O jskos` with all of https://github.com/mapping-commons/mh_mapping_initiative/tree/master/mappings (works fine) and all of `tests/data/*.tsv`:
- `basic.tsv`, `basic2.tsv`, `test_filter_sssom.tsv` and `cob-to-external.tsv` are ok.
- `basic{3,4,5,6,7}.tsv` and `test_annotate_sssom.tsv` raise `KeyError: 'owl:subClassOf'`. It looks like the examples are outdated; there is no "owl:subClassOf".
- `bosch-wd-matches.tsv` raises `KeyError: ''`. I need to investigate.
- `basic-cliquesummary-stats.tsv`, `basic-cliquesummary.tsv`, `test_validation1.sssom.tsv` and `basic.ptable.tsv` give a stack trace.
- `basic-meta-external.tsv` raises `WARNING:root:No prefix map provided (not recommended), trying to use defaults. KeyError: 'a'`.
- `bad_basic.tsv` and `basic6.tsv`, `basic-small.tsv` and `basic_subset.tsv` raise `ValueError: not enough values to unpack (expected 2, got 1)`. At least the first seems to be intended; the others may be a missing edge case in my code.
- The file at https://github.com/monarch-initiative/mondo/tree/master/src/mappings does not work because it lacks `mapping_justification` (which I would make optional).
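Some of the errors above (an unknown predicate, a missing value) are the kind a tolerant row converter could absorb. The following is only a sketch of that idea, not the code in the linked commit: the function `row_to_jskos`, the lookup table, and the fallback to `skos:mappingRelation` for unknown predicates are all assumptions, and the row is assumed to already contain full URIs for `subject_id` and `object_id`.

```python
from typing import Dict, Optional

SKOS = "http://www.w3.org/2004/02/skos/core#"

# Assumed lookup table: SKOS mapping predicates that have a direct JSKOS mapping type.
PREDICATE_TO_TYPE: Dict[str, str] = {
    "skos:exactMatch": SKOS + "exactMatch",
    "skos:closeMatch": SKOS + "closeMatch",
    "skos:broadMatch": SKOS + "broadMatch",
    "skos:narrowMatch": SKOS + "narrowMatch",
    "skos:relatedMatch": SKOS + "relatedMatch",
}


def row_to_jskos(row: Dict[str, str]) -> Optional[dict]:
    """Convert one SSSOM row to a JSKOS mapping, or return None if it is incomplete."""
    subject, obj = row.get("subject_id"), row.get("object_id")
    if not subject or not obj:
        return None  # skip rows without both ends instead of raising
    # Unknown predicates (e.g. owl:subClassOf) fall back to the generic mapping relation.
    predicate = row.get("predicate_id", "")
    mapping_type = PREDICATE_TO_TYPE.get(predicate, SKOS + "mappingRelation")
    return {
        "from": {"memberSet": [{"uri": subject}]},
        "to": {"memberSet": [{"uri": obj}]},
        "type": [mapping_type],
    }
```

Whether a silent fallback is appropriate depends on the use case; logging a warning for each unknown predicate may be the safer choice when the input files are expected to be valid.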
Before finishing, I need some help with running the tests (they fail locally and in the current CI on GitHub) and with checking which TSV files are meant to be valid SSSOM TSV and which are outdated or require some special handling.
This is awesome, thank you! @hrshdhgd will help you when he gets a chance!
Will you also provide a jskos reader? This would be the direction that is most valuable for us, at least :) 🙏
As discussed at https://github.com/mapping-commons/sssom/discussions/250#discussioncomment-4918548 the JSKOS output could be extended by mapping identifiers.
As discussed here, JSKOS is another format also used to encode mappings. The format is based on JSON. In practice, newline delimited JSON with the extension `ndjson` is used most for files with multiple mappings in JSKOS, to facilitate processing with command line tools. I suggest:
- `--output-format jskos` or `-o file.ndjson` to emit JSKOS as newline delimited JSON. Output as a JSON array is not needed.
- `INPUT` file with extension `.ndjson` to parse JSKOS mappings as newline delimited JSON.
- `--input-format jskos` to detect whether the first character is `[`: if so, parse as a JSON array of JSKOS mappings, otherwise parse as newline delimited JSON with JSKOS mappings. If this is too complex, just limit it to newline delimited JSON.

Example data (DDC-BK mappings):
By the way, there is also a TSV/CSV format for JSKOS, but it may be better to align the tabular formats of SSSOM and JSKOS to a common format.
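The input detection suggested above (a leading `[` means a JSON array, anything else is treated as newline delimited JSON) and the matching output can be sketched in a few lines. The function names `parse_jskos` and `to_ndjson` are hypothetical and only illustrate the proposal; they are not existing sssom-py functions.

```python
import json
from typing import Iterable, List


def parse_jskos(text: str) -> List[dict]:
    """Parse JSKOS mappings from either a JSON array or newline delimited JSON."""
    stripped = text.lstrip()
    if stripped.startswith("["):
        return json.loads(stripped)  # whole document is one JSON array
    # Otherwise: one JSON object per non-empty line.
    return [json.loads(line) for line in stripped.splitlines() if line.strip()]


def to_ndjson(mappings: Iterable[dict]) -> str:
    """Serialise mappings as newline delimited JSON, one compact object per line."""
    return "".join(json.dumps(m, ensure_ascii=False) + "\n" for m in mappings)
```

Because `json.dumps` without `indent` never emits line breaks, each mapping stays on a single line, which is what keeps the output usable with line-oriented command line tools.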