biopragmatics / bioregistry

📮 An integrative registry of biological databases, ontologies, and nomenclatures.
https://bioregistry.io
MIT License

Add all prefixes listed in `json-ld-rc`, a recommended context for json-ld? #332

Open matentzn opened 2 years ago

matentzn commented 2 years ago

There are only a few prefixes defined by https://github.com/w3c/json-ld-rc, which we could align with.

matentzn commented 2 years ago

Discussion of partial interest: https://github.com/w3c/json-ld-bp/issues/9

cthoyt commented 2 years ago

You mean this specifically https://github.com/w3c/json-ld-rc/blob/main/prefixes.ttl?

matentzn commented 2 years ago

I think so, yes.

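cthoyt commented 1 year ago

import pandas as pd

import bioregistry

url = "https://raw.githubusercontent.com/w3c/json-ld-rc/main/prefixes.ttl"


def _main():
    # Assumes each line has the form `@prefix <prefix>: <uri_prefix> .`, so the
    # space-separated columns 1 and 2 hold the prefix and the URI prefix.
    # Turtle has no header row, hence header=None. The `squeeze` argument was
    # removed in pandas 2.0 and had no effect with two columns, so it is dropped.
    df = pd.read_csv(url, sep=" ", header=None, usecols=[1, 2])
    for prefix, uri_prefix in df.values:
        prefix = prefix.removesuffix(":")
        uri_prefix = uri_prefix.removeprefix("<").removesuffix(">")
        norm_prefix = bioregistry.normalize_prefix(prefix)
        if not norm_prefix:
            print("MISSING", prefix, uri_prefix)
            continue
        # Compare without the scheme so http/https differences are ignored.
        # A registered prefix without a URI prefix is reported as a mismatch.
        registry_uri_prefix = bioregistry.get_uri_prefix(norm_prefix)
        if registry_uri_prefix is None or registry_uri_prefix.split("://")[1] != uri_prefix.split("://")[1]:
            print("MISMATCH", prefix, uri_prefix)


if __name__ == "__main__":
    _main()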
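
The parsing and comparison steps in the script above can also be sketched without pandas or a network call, which makes them easier to test in isolation. This is only a rough sketch: the helper names `parse_prefixes`, `strip_scheme`, and `same_uri_prefix` are hypothetical, and it assumes one `@prefix name: <uri> .` declaration per line rather than full Turtle syntax.

```python
import re

# Assumed line shape: @prefix name: <uri> .
# Anything else (comments, blank lines, other Turtle statements) is skipped.
PREFIX_LINE = re.compile(r"^\s*@prefix\s+([^\s:]*):\s*<([^>]*)>\s*\.\s*$")


def parse_prefixes(text: str) -> dict[str, str]:
    """Map each declared prefix to its URI prefix (hypothetical helper)."""
    result = {}
    for line in text.splitlines():
        match = PREFIX_LINE.match(line)
        if match:
            result[match.group(1)] = match.group(2)
    return result


def strip_scheme(uri: str) -> str:
    """Drop the scheme so http:// and https:// variants compare equal."""
    return uri.split("://", 1)[-1]


def same_uri_prefix(a: str, b: str) -> bool:
    """Compare two URI prefixes while ignoring the scheme."""
    return strip_scheme(a) == strip_scheme(b)


if __name__ == "__main__":
    # Illustrative input only, not copied from prefixes.ttl.
    sample = "@prefix ex: <http://example.org/ns#> .\n"
    for prefix, uri_prefix in parse_prefixes(sample).items():
        print(prefix, uri_prefix)
```

Unlike `split("://")[1]`, `strip_scheme` does not raise when a value has no scheme, so a malformed entry would surface as a mismatch instead of a crash.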