Closed: chipmasters closed this issue 4 years ago
Never mind. After all that, I see the issue is in the first column of the SDD Codebook. It has missing entries, which caused the problem. I will close this tomorrow once I confirm it all works as expected after making those fixes.
@jimmccusker It seems that the sdd2owl process is generating the same URI for unrelated missing classes. Here is what was generated for the SDD-2016-1449-5-Outcome.xlsx file. In the Dictionary Mapping we have this row:
and in the Codebook we have these mappings for Dx_alg (after running sdd2owl)
Now in SDD-2016-1449-2-Covars.xlsx we have these rows in the Dictionary Mapping:
and these mappings in the Codebook (after running sdd2owl):
Note that the same URI (http://purl.org/twc/ctxid/cbbaaa6673fc79651ef21fe792de2a8f4a01b1d154e4608bd8dbc4423ec29d2ad6) is not only generated for each of the distinct cases in the Covars file above, but is also the one generated in the Outcome file.
It seems that this is always the URI generated when the Attribute in the Dictionary Mapping is sio:SIO_010056 (Phenotype). However, this is clearly wrong: for the case where GDM=0, the URI is supposed to represent all the instances of sio:SIO_010056 that are not instances of doid:11714, right? So how can the same URI be used for the absence of Type-2 diabetes, Type-1 diabetes, etc.? There should be a distinct URI generated for each of these cases, right?
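To make the suspicion concrete, here is a minimal sketch of what I think is happening versus what I would expect. This is not the sdd2owl code: the `ctxid` helper, the choice of SHA-256, the `|` separator, and the `doid:OTHER_CLASS` placeholder are all assumptions for illustration. Only the `http://purl.org/twc/ctxid/` prefix, `sio:SIO_010056`, and `doid:11714` come from the files above.

```python
import hashlib

# Prefix seen in the generated URIs above; everything else here is assumed.
CTXID_BASE = "http://purl.org/twc/ctxid/"


def ctxid(*parts: str) -> str:
    """Hypothetical helper: hash the given parts into a ctxid-style URI."""
    key = "|".join(parts)
    return CTXID_BASE + hashlib.sha256(key.encode("utf-8")).hexdigest()


attribute = "sio:SIO_010056"  # Phenotype

# Suspected behaviour: only the Attribute feeds the hash, so every
# "missing class" under Phenotype ends up with the same URI.
suspected_not_gdm = ctxid(attribute)
suspected_not_other = ctxid(attribute)

# Expected behaviour: the excluded class is part of the key, so each
# "Phenotype but not X" class gets its own URI.
expected_not_gdm = ctxid(attribute, "not", "doid:11714")
expected_not_other = ctxid(attribute, "not", "doid:OTHER_CLASS")  # placeholder

print(suspected_not_gdm == suspected_not_other)  # same URI for both cases
print(expected_not_gdm == expected_not_other)    # distinct URIs
```

If the real key is built some other way, the point still stands: whatever goes into the hash has to include the class being excluded, otherwise the unrelated cases collapse into one URI.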
I can send you the relevant SDDs for testing via email if you agree this is a bug. If you don't think this is a bug, we still need to discuss how to address the resulting behavior in HADatAc, because now the presence of this URL in multiple mappings is causing problems in the facet search.
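In the meantime, something like the following could flag the affected mappings before they reach the facet search. It is only a sketch: `find_shared_uris` is a hypothetical helper, and the rows in the example are made-up placeholders, not the contents of the actual SDD files. It assumes the Codebook mappings have already been read into `(file, column, code, uri)` tuples.

```python
from collections import defaultdict


def find_shared_uris(mappings):
    """Return URIs that are used by more than one (file, column, code) mapping.

    `mappings` is an iterable of (file, column, code, uri) tuples.
    """
    uses_by_uri = defaultdict(set)
    for file, column, code, uri in mappings:
        uses_by_uri[uri].add((file, column, code))
    return {
        uri: sorted(uses)
        for uri, uses in uses_by_uri.items()
        if len(uses) > 1
    }


# Made-up rows for illustration only.
rows = [
    ("A.xlsx", "col1", "0", "http://example.org/ctxid/shared"),
    ("B.xlsx", "col2", "0", "http://example.org/ctxid/shared"),
    ("B.xlsx", "col3", "1", "http://example.org/ctxid/unique"),
]

for uri, uses in find_shared_uris(rows).items():
    print(uri, "is used by", len(uses), "mappings:", uses)
```

Running this over the Outcome and Covars Codebooks together should list every URI that appears under more than one code, which is the situation that breaks the facet search.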