fkleedorfer closed this 6 months ago
@fkleedorfer, I have updated the vocabulary files so that they pass all the validation tests in #838, which has been pushed to the main branch. So now we have a good new starting point. Would you mind merging main into this branch and resolving any merge conflicts? Then I will be able to evaluate this and the other PRs more easily.
Thanks.
Rebased onto main
Here's a status update from the validation run - down to about 10 units now!
I'll try to get back to reviewing your suggestions, but I'm a bit slammed at present.
Thanks.
For the record, here is how I run SHACL validation (using a local installation of the Jena CLI tools):
cat vocab/unit/VOCAB_QUDT-UNITS-ALL-v2.1.ttl vocab/quantitykinds/VOCAB_QUDT-QUANTITY-KINDS-ALL-v2.1.ttl schema/SCHEMA_QUDT-v2.1.ttl schema/SCHEMA_QUDT-DATATYPE-v2.1.ttl > tmp.ttl && shacl validate --shapes collections/COLLECTION_QUDT_QA_TESTS_ALL-v2.1.ttl --data tmp.ttl
Would it make sense to document that somewhere, maybe in schema/shacl/README.md?
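If this ends up in a README, a small wrapper script might be easier to keep in sync than a long one-liner. The following is only a sketch of the same two steps (concatenate the data files, then build the `shacl validate` call); the helper names `concatenate` and `shacl_command` are hypothetical, and it assumes the Jena `shacl` tool is on the `PATH`, as in the command above.

```python
from pathlib import Path


def concatenate(sources, target):
    """Write the contents of all source files into one target file.

    A newline is added after each file (unlike plain `cat`) so that the
    last line of one file never runs into the first line of the next.
    """
    with open(target, "w", encoding="utf-8") as out:
        for source in sources:
            out.write(Path(source).read_text(encoding="utf-8"))
            out.write("\n")


def shacl_command(shapes, data):
    """Build the argument list for the Jena CLI call used in this thread."""
    return ["shacl", "validate", "--shapes", str(shapes), "--data", str(data)]
```

The resulting list can be passed to `subprocess.run(..., check=True)` once the concatenated file has been written; whether the exit code reflects validation failures is not something this sketch relies on.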
Fixed all SHACL problems and made the last required decisions. I'd say it's done pending a review and then removing the analysis comments. Note that code comments have been shifted a bit by subsequent commits - most are still roughly in the right spot, but one or two were shifted so far they ended up in the triples of a different unit.
Getting very close now! Here's the validation result. For most of them it seems clear what to do. The last one is because the referenced qk has two colons (::).
I was going to publish the new release today, but if you think you can get these done I can wait until Monday...
Also, don't worry about removing the comments, as they will all be removed by TopBraid Composer when I generate the new release.
What about those qkdv triples - are they auto-generated with the release?
Unfortunately not. (You raise a good point, though, that they could be).
ok. So taking a random dv:
qkdv:A0E2L0I0M-1H0T4D0
  a qudt:QuantityKindDimensionVector_ISO ;
  a qudt:QuantityKindDimensionVector_Imperial ;
  a qudt:QuantityKindDimensionVector_SI ;
  qudt:dimensionExponentForAmountOfSubstance 0 ;
  qudt:dimensionExponentForElectricCurrent 2 ;
  qudt:dimensionExponentForLength 0 ;
  qudt:dimensionExponentForLuminousIntensity 0 ;
  qudt:dimensionExponentForMass -1 ;
  qudt:dimensionExponentForThermodynamicTemperature 0 ;
  qudt:dimensionExponentForTime 4 ;
  qudt:dimensionlessExponent 0 ;
  qudt:latexDefinition "\\(M^-1 T^4 I^2\\)"^^qudt:LatexString ;
  rdfs:isDefinedBy <http://qudt.org/2.1/vocab/dimensionvector> ;
  rdfs:label "A0E2L0I0M-1H0T4D0" ;
.
, how do I automate this?
Yes, they are all always applicable. It's only the weird CGS systems that have different base dimensions.
I probably won't manage the latex def. Is that a problem?
Let's go ahead without, and we can back-fill later. We don't validate against that at present.
ah, got it. Here's the query:
prefix qudt: <http://qudt.org/schema/qudt/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
construct {
  ?dv
    a qudt:QuantityKindDimensionVector_ISO ;
    a qudt:QuantityKindDimensionVector_Imperial ;
    a qudt:QuantityKindDimensionVector_SI ;
    qudt:dimensionExponentForAmountOfSubstance ?A ;
    qudt:dimensionExponentForElectricCurrent ?E ;
    qudt:dimensionExponentForLength ?L ;
    qudt:dimensionExponentForLuminousIntensity ?I ;
    qudt:dimensionExponentForMass ?M ;
    qudt:dimensionExponentForThermodynamicTemperature ?H ;
    qudt:dimensionExponentForTime ?T ;
    qudt:dimensionlessExponent ?D ;
    qudt:latexDefinition ?latexDefinition ;
    rdfs:isDefinedBy <http://qudt.org/2.1/vocab/dimensionvector> ;
    rdfs:label ?dvLocalname ;
}
#select *
where
{
  ?entity qudt:hasDimensionVector ?dv
  optional {
    ?dv ?any ?object
  }
  filter (!(bound(?object)))
  BIND(REPLACE(STR(?dv), "^.+/", "") as ?dvLocalname)
  BIND(REPLACE(?dvLocalname, "A(-?\\d)E-?\\dL-?\\dI-?\\dM-?\\dH-?\\dT-?\\dD-?\\d", "$1") as ?A)
  BIND(REPLACE(?dvLocalname, "A-?\\dE(-?\\d)L-?\\dI-?\\dM-?\\dH-?\\dT-?\\dD-?\\d", "$1") as ?E)
  BIND(REPLACE(?dvLocalname, "A-?\\dE-?\\dL(-?\\d)I-?\\dM-?\\dH-?\\dT-?\\dD-?\\d", "$1") as ?L)
  BIND(REPLACE(?dvLocalname, "A-?\\dE-?\\dL-?\\dI(-?\\d)M-?\\dH-?\\dT-?\\dD-?\\d", "$1") as ?I)
  BIND(REPLACE(?dvLocalname, "A-?\\dE-?\\dL-?\\dI-?\\dM(-?\\d)H-?\\dT-?\\dD-?\\d", "$1") as ?M)
  BIND(REPLACE(?dvLocalname, "A-?\\dE-?\\dL-?\\dI-?\\dM-?\\dH(-?\\d)T-?\\dD-?\\d", "$1") as ?H)
  BIND(REPLACE(?dvLocalname, "A-?\\dE-?\\dL-?\\dI-?\\dM-?\\dH-?\\dT(-?\\d)D-?\\d", "$1") as ?T)
  BIND(REPLACE(?dvLocalname, "A-?\\dE-?\\dL-?\\dI-?\\dM-?\\dH-?\\dT-?\\dD(-?\\d)", "$1") as ?D)
  BIND(CONCAT("http://qudt.org/vocab/dimensionvector/", "A", ?A, "E", ?E, "L", ?L, "I", ?I, "M", ?M, "H", ?H, "T", ?T, "D", ?D) = str(?dv) as ?extractionCorrect)
  BIND(IF(?A != "0", CONCAT(" A^", ?A), "") as ?Ax)
  BIND(IF(?E != "0", CONCAT(" E^", ?E), "") as ?Ex)
  BIND(IF(?L != "0", CONCAT(" L^", ?L), "") as ?Lx)
  BIND(IF(?I != "0", CONCAT(" I^", ?I), "") as ?Ix)
  BIND(IF(?M != "0", CONCAT(" M^", ?M), "") as ?Mx)
  BIND(IF(?H != "0", CONCAT(" H^", ?H), "") as ?Hx)
  BIND(IF(?T != "0", CONCAT(" T^", ?T), "") as ?Tx)
  BIND(IF(?D != "0", CONCAT(" D^", ?D), "") as ?Dx)
  BIND(CONCAT("\\(", REPLACE(CONCAT(?Ax, ?Ex, ?Lx, ?Ix, ?Mx, ?Hx, ?Tx), "\\s+", " "), "\\)") as ?latexString)
  BIND(STRDT(?latexString, qudt:LatexString) as ?latexDefinition)
}
So close! Can you remove the quotes from the numbers?
...or just go to bed!
(done)
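For anyone who wants to cross-check the generated triples outside the triple store, the same local-name parsing can be sketched in Python, with the exponents as plain integers rather than quoted strings. This is an illustration only, not part of the PR: the function names are hypothetical, it handles only single-digit integer exponents (as the regexes in the query do), and it uses the same letters as the local name in the LaTeX string, which differs from the hand-written example earlier in the thread (that one writes `I` for electric current).

```python
import re

# Order of the dimension letters in a dimension-vector local name.
DIMENSIONS = ("A", "E", "L", "I", "M", "H", "T", "D")

# One capture group per letter; single-digit exponents with optional minus sign.
_PATTERN = re.compile("".join(rf"{d}(-?\d)" for d in DIMENSIONS))


def parse_dimension_vector(localname):
    """Return a dict mapping each dimension letter to its integer exponent."""
    match = _PATTERN.fullmatch(localname)
    if match is None:
        raise ValueError(f"unsupported dimension vector name: {localname!r}")
    return {d: int(v) for d, v in zip(DIMENSIONS, match.groups())}


def latex_definition(exponents):
    """Build the LaTeX string from the non-zero exponents.

    Like the query, the dimensionless D term is left out.
    """
    terms = [f"{d}^{e}" for d, e in exponents.items() if d != "D" and e != 0]
    return "\\(" + " ".join(terms) + "\\)"
```

Names that do not fit the pattern raise a `ValueError` instead of silently passing through, which is the role `?extractionCorrect` plays in the query.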
I have another PR coming up once the data is reserialized. Not sure if I can swing it immediately, but I believe it's a lot again, so it might be worth waiting for that before you release.
The PR would be about missing/wrong conversion multipliers in derived units.
@fkleedorfer, tremendous job! Zero validation errors. I have made some notes to go back later to consider some of the issues you raised about the need for some similar/related quantity kinds, but I think that can wait for now. I plan to go ahead with publishing a new release on Monday, because otherwise the holidays will overtake me for the December release. So don't feel pressured to rush through the derived units work.
Again, thanks so much for the considerable work you have put in here!
Work in progress