Open balhoff opened 3 years ago
@balhoff - confirming I understand, this bug is related to: https://github.com/biolink/biolink-model/issues/652 (Review and curate Biolink Model prefix/URI namespaces for internet resolvability)? :)
@sierra-moxon unfortunately not. :-) It's a very twisted problem related to the JSON-LD context limitations described farther down in #301, and caused by some surprise incompatible changes in JSON-LD 1.0. One fix would be to use new features in JSON-LD 1.1 contexts to force allowing prefixes ending in an underscore. But linkml doesn't have a JSON-LD 1.1 processor right now.
So I wanted to open this to make sure the severity of this problem wasn't lost. Maybe there is something else that can be done without waiting for JSON-LD 1.1 support.
very sorry about this.
can someone make a minimal example that replicates the problem and make a linkml ticket?
from #397 it seems you had someone working on this @balhoff ... but I don't see any open PRs?
@hsolbrig merged https://github.com/biolink/biolinkml/pull/262 after a couple of fixes. That PR updated jsonldcontextgen to include the new "@prefix": "true" entries (which are required for JSON-LD 1.1 to handle prefixes ending with underscores). It also adds a new command, prefixmapgen, which generates a simple YAML file containing a prefix dictionary (it doesn't seem like biolink-model is using this yet).
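For anyone following along, here is a rough sketch of what such context entries could look like. It is not the jsonldcontextgen code; `make_context` is a hypothetical helper, and the RO namespace is inferred from the expected IRI quoted later in this thread. Note that in the JSON-LD 1.1 specification the `@prefix` value is the boolean `true`.

```python
import json


def make_context(prefix_map):
    """Build a JSON-LD 1.1 style context from a prefix -> namespace dict.

    Hypothetical helper for illustration only. Each prefix becomes an
    expanded term definition with "@prefix": true, which is what lets a
    JSON-LD 1.1 processor use a namespace ending in "_" as a prefix.
    """
    context = {}
    for prefix, namespace in prefix_map.items():
        context[prefix] = {"@id": namespace, "@prefix": True}
    return {"@context": context}


# Assumed namespace, inferred from <http://purl.obolibrary.org/obo/RO_0002432>.
context = make_context({"RO": "http://purl.obolibrary.org/obo/RO_"})
print(json.dumps(context, indent=2))
```

A JSON-LD 1.0 processor would not understand these entries, which is why this only helps once a 1.1 processor is in place.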
I'm not sure how close linkml is to having a JSON-LD 1.1-powered prefix expansion system. @hsolbrig what do you think? I'm wondering if in the interim some more hacky approach should be taken, because the current RDF just contains incorrect IRIs.
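As one possible shape for an interim approach, CURIEs could be expanded directly against a flat prefix dictionary, bypassing JSON-LD entirely. This is only a sketch under assumptions: `PREFIX_MAP` and `expand_curie` are hypothetical names, and the two namespaces are the OBO PURL patterns seen in the correct IRIs elsewhere in this thread.

```python
# Assumed prefix dictionary, similar in spirit to a flat prefix map file.
PREFIX_MAP = {
    "RO": "http://purl.obolibrary.org/obo/RO_",
    "CHEBI": "http://purl.obolibrary.org/obo/CHEBI_",
}


def expand_curie(curie, prefix_map=PREFIX_MAP):
    """Expand 'PREFIX:local' to a full IRI, failing loudly on unknown prefixes."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or prefix not in prefix_map:
        # Refuse to guess: an undeclared prefix is an error, not an IRI scheme.
        raise ValueError(f"Cannot expand CURIE with unknown prefix: {curie!r}")
    return prefix_map[prefix] + local_id


print(expand_curie("RO:0002432"))
# -> http://purl.obolibrary.org/obo/RO_0002432
```

Raising on an unknown prefix is a deliberate choice here, so an undeclared prefix cannot silently end up in the output as a malformed IRI.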
https://github.com/biolink/biolink-model/issues/394 is another issue that will be aided by work done on this ticket, closing #394 as a duplicate.
Apparently the upstream issue is fixed, so does this now work?
One indicator will be that this line says:
skos:exactMatch <http://purl.obolibrary.org/obo/RO_0002432> ;
I would like to keep this open until it's fixed in Biolink.
That was bizarre. It looks like @deepakunni3 accidentally closed this via his fork, which included my change that prematurely closed this.
and apologies this is taking so long!!!
any update on this?
Many improvements, but still a few issues, e.g., term IRIs in here:
<https://w3id.org/biolink/vocab/SequenceVariant> a linkml:ClassDefinition ;
OIO:inSubset <https://w3id.org/biolink/vocab/model_organism_database> ;
skos:altLabel "allele" ;
skos:broadMatch <https://w3id.org/biolink/vocab/SO:0001060> ;
skos:definition "An allele that varies in its sequence from what is considered the reference allele at that locus." ;
skos:exactMatch <https://w3id.org/biolink/vocab/GENO:0000002>,
<https://w3id.org/biolink/vocab/SIO:010277>,
<https://w3id.org/biolink/vocab/SO:0001059>,
<vmc:Allele>,
<wikidata:Q15304597> ;
One thing I wanted to note is that there are possibly two different kinds of prefix expansion issues here:
1. CURIEs appended to the default Biolink namespace instead of being expanded (e.g. SO, as in <https://w3id.org/biolink/vocab/SO:0001060>).
2. CURIEs left unexpanded, e.g. <wikidata:Q15304597>. Here wikidata becomes a protocol, not a prefix. In my opinion these usages should result in a LinkML parsing failure.
This has gotten worse in several cases:
and many more. These prefixes have their case changed, are not expanded, and are then turned into protocols for malformed IRIs.
Interestingly, the prefixes are correctly expanded in biolink-model.owl.ttl (that file didn't use to have mappings in it).
Compare:
biolink-model.ttl:
<https://w3id.org/biolink/vocab/EnvironmentalFoodContaminant> a linkml:ClassDefinition ;
skos:inScheme <https://w3id.org/biolink/biolink-model> ;
skos:relatedMatch <chebi:78299> ;
biolink-model.owl.ttl:
biolink:EnvironmentalFoodContaminant a owl:Class ;
rdfs:label "environmental food contaminant" ;
rdfs:subClassOf biolink:ChemicalEntity ;
skos:relatedMatch <http://purl.obolibrary.org/obo/CHEBI_78299> .
I took a look at the 3.6.0 version of biolink-model.ttl, and it looks like this problem has gotten a lot worse. I can't find this file for the 4.0.0 release.
Edit—maybe this is not exactly the same problem as before, but related in that CURIEs from biolink-model.yaml seem to have a lot of trouble being correctly expanded. Rather than being expanded incorrectly, the ones in the link above are just not expanded at all, and turned into invalid IRIs.
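To get a quick count of how widespread this is, something like the following could scan a Turtle file for both symptoms. It is a regex-based sketch, not a real Turtle parser: `find_suspect_iris`, `KNOWN_SCHEMES`, and `DEFAULT_NS` are hypothetical names, legitimate schemes such as `urn:` or `mailto:` would be flagged unless added to the allowed set, and angle brackets inside string literals could produce false positives.

```python
import re

# Anything between angle brackets without whitespace is treated as an IRI.
IRI_PATTERN = re.compile(r"<([^<>\s]+)>")
# Assumption: only these schemes are expected in the generated RDF.
KNOWN_SCHEMES = {"http", "https"}
DEFAULT_NS = "https://w3id.org/biolink/vocab/"


def find_suspect_iris(turtle_text):
    """Return (iri, reason) pairs for IRIs that look like mangled CURIEs."""
    suspects = []
    for iri in IRI_PATTERN.findall(turtle_text):
        scheme, sep, _rest = iri.partition(":")
        if sep and scheme.lower() not in KNOWN_SCHEMES:
            # e.g. <wikidata:Q15304597>: the prefix has become the scheme.
            suspects.append((iri, "unexpanded CURIE"))
        elif iri.startswith(DEFAULT_NS) and ":" in iri[len(DEFAULT_NS):]:
            # e.g. <https://w3id.org/biolink/vocab/SO:0001060>
            suspects.append((iri, "CURIE appended to default namespace"))
    return suspects


sample = """
<https://w3id.org/biolink/vocab/SequenceVariant> a linkml:ClassDefinition ;
    skos:broadMatch <https://w3id.org/biolink/vocab/SO:0001060> ;
    skos:exactMatch <vmc:Allele>, <wikidata:Q15304597> ;
    skos:relatedMatch <http://purl.obolibrary.org/obo/CHEBI_78299> .
"""

for iri, reason in find_suspect_iris(sample):
    print(f"{reason}: {iri}")
```

On the sample above, which reuses IRIs quoted in this thread, the SO, vmc, and wikidata terms are reported, while the class IRI and the correctly expanded CHEBI IRI are not.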
Hi,
I opened issues related to this before (e.g. #301 more than a year ago) but I think I made a mistake in closing that when opening #397. The simplified prefix.yaml proposal is still desired, but it doesn't fix the big problem that biolink-model.ttl is filled with incorrect prefix expansions. Currently I use this file in a hacky way by manually changing a bunch of identifiers using sed, but it isn't a 100% fix, and it's really inconvenient.
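For reference, the kind of workaround described above could be written in Python rather than sed along these lines. This is a sketch under assumptions, not the actual commands in use: `repair_iris` is a hypothetical name, the prefix map covers only two prefixes with their assumed OBO PURL namespaces, and prefix lookup is case-insensitive because some prefixes appear in changed case in the generated file.

```python
import re

# Assumed namespaces (standard OBO PURL pattern); extend as needed.
PREFIX_MAP = {
    "CHEBI": "http://purl.obolibrary.org/obo/CHEBI_",
    "SO": "http://purl.obolibrary.org/obo/SO_",
}
DEFAULT_NS = "https://w3id.org/biolink/vocab/"

_LOOKUP = {prefix.lower(): ns for prefix, ns in PREFIX_MAP.items()}
# Matches <prefix:local> and <DEFAULT_NS + prefix:local>. The local part may
# not start with "/", so ordinary http(s):// IRIs are left alone.
_PATTERN = re.compile(
    r"<(?:" + re.escape(DEFAULT_NS) + r")?([A-Za-z][\w.-]*):([^<>\s/][^<>\s]*)>"
)


def repair_iris(turtle_text):
    """Rewrite mangled CURIE-like IRIs whose prefix is in PREFIX_MAP."""
    def fix(match):
        namespace = _LOOKUP.get(match.group(1).lower())
        if namespace is None:
            return match.group(0)  # unknown prefix: leave untouched
        return f"<{namespace}{match.group(2)}>"
    return _PATTERN.sub(fix, turtle_text)


print(repair_iris("skos:relatedMatch <chebi:78299> ;"))
# -> skos:relatedMatch <http://purl.obolibrary.org/obo/CHEBI_78299> ;
```

Like the sed approach, this cannot be a complete fix: any prefix missing from the map is passed through unchanged.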