mapping-commons / sssom-py

Python toolkit for SSSOM mapping format
https://mapping-commons.github.io/sssom-py/index.html#
MIT License

Better handle URI fields when getting prefixes from metadata #420

Closed cthoyt closed 1 year ago

cthoyt commented 1 year ago

Closes #419, cc @jmillanacosta

This PR adds a list of fields whose values should be given as URIs. These fields are skipped when scanning the metadata associated with a given mapping set dataframe for prefixes, so that http and https aren't interpreted as CURIE prefixes.
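To illustrate the idea, here is a minimal sketch, not the code in this PR. The names `URI_ONLY_FIELDS` and `prefixes_from_metadata` are hypothetical, and the choice of `mapping_set_id` and `license` as URI-valued fields is an assumption made for the example.

```python
from typing import Any, Mapping, Set

# Hypothetical skip list: metadata fields assumed to hold full URIs, never CURIEs.
URI_ONLY_FIELDS = {"mapping_set_id", "license"}


def prefixes_from_metadata(metadata: Mapping[str, Any]) -> Set[str]:
    """Collect the part before the first colon of each string value as a candidate prefix."""
    prefixes: Set[str] = set()
    for key, value in metadata.items():
        if key in URI_ONLY_FIELDS:
            # Without this check, "https://..." would yield the bogus prefix "https".
            continue
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, str) and ":" in item:
                prefixes.add(item.split(":", 1)[0])
    return prefixes


# Illustrative metadata only; the values are made up for this sketch.
metadata = {
    "mapping_set_id": "https://example.org/mappings/demo.sssom.tsv",
    "license": "https://creativecommons.org/publicdomain/zero/1.0/",
    "creator_id": ["orcid:0000-0002-1825-0097"],
    "subject_source": "obo:mondo",
}

found = prefixes_from_metadata(metadata)
```

With the skip list in place, `found` contains only `orcid` and `obo`; neither `http` nor `https` shows up as a prefix.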

@hrshdhgd is there a way to introspect over the SSSOM schema and identify fields that are only supposed to be URIs (and therefore not CURIEs)?

hrshdhgd commented 1 year ago

From util.py -> inject_metadata_into_df()

https://github.com/mapping-commons/sssom-py/blob/035bbd447322e867b5f0800ae23fc43031c32b4c/src/sssom/util.py#L845-L846

You can get the type of a slot using schema['slots'][**name-of-slot**]['range'].

For example, schema['slots']['subject_id']['range'] is 'EntityReference', and the definition of EntityReference shows that it is a uriorcurie.
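Building on that lookup, one way to find the URI-only slots might look like the sketch below. It assumes the schema is loaded as a plain dict with `slots` and `types` sections, and that a custom type points at its parent through a `typeof` key. The toy schema and the helper names `resolve_base_type` and `uri_only_slots` are hypothetical and do not reproduce the real SSSOM schema.

```python
from typing import Any, Dict, List

# Toy schema for illustration only; the real SSSOM schema has many more slots.
schema: Dict[str, Any] = {
    "types": {
        "EntityReference": {"typeof": "uriorcurie"},
    },
    "slots": {
        "subject_id": {"range": "EntityReference"},
        "mapping_justification": {"range": "EntityReference"},
        "license": {"range": "uri"},
        "comment": {"range": "string"},
    },
}


def resolve_base_type(schema: Dict[str, Any], range_name: str) -> str:
    """Follow 'typeof' links until reaching a name not defined under 'types'."""
    types = schema.get("types", {})
    seen = set()
    while range_name in types and range_name not in seen:
        seen.add(range_name)  # guard against accidental cycles
        parent = types[range_name].get("typeof")
        if parent is None:
            break
        range_name = parent
    return range_name


def uri_only_slots(schema: Dict[str, Any]) -> List[str]:
    """Return slots whose range resolves to 'uri' (and therefore not 'uriorcurie')."""
    return sorted(
        name
        for name, slot in schema.get("slots", {}).items()
        if resolve_base_type(schema, slot.get("range", "string")) == "uri"
    )


uri_slots = uri_only_slots(schema)
```

In this toy schema, `subject_id` resolves to `uriorcurie` and is therefore excluded, while `license` is reported as URI-only. A list derived this way could replace a hand-maintained list of fields to skip.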