qudt / qudt-public-repo

QUDT - Quantities, Units, Dimensions and dataTypes - public repository

Draft: Fix dimension vectors #837

Closed fkleedorfer closed 6 months ago

fkleedorfer commented 7 months ago

Work in progress

steveray commented 7 months ago

@fkleedorfer, I have updated the vocabulary files so that they pass all the validation tests in #838, which has been pushed to the main branch. So now we have a good new starting point. Would you mind merging main into this branch and resolving any merge conflicts? Then I will be able to evaluate this and the other PRs more easily.

Thanks.

fkleedorfer commented 6 months ago

Rebased onto main

steveray commented 6 months ago

Here's a status update from the validation run - down to about 10 units now! [image]

I'll try to get back to reviewing your suggestions, but I'm a bit slammed at present.

fkleedorfer commented 6 months ago

Thanks.

For the record, here is how I run SHACL validation (using a local installation of the Jena CLI tools):

cat vocab/unit/VOCAB_QUDT-UNITS-ALL-v2.1.ttl vocab/quantitykinds/VOCAB_QUDT-QUANTITY-KINDS-ALL-v2.1.ttl schema/SCHEMA_QUDT-v2.1.ttl schema/SCHEMA_QUDT-DATATYPE-v2.1.ttl > tmp.ttl && shacl validate --shapes collections/COLLECTION_QUDT_QA_TESTS_ALL-v2.1.ttl --data tmp.ttl

Would it make sense to document that somewhere, maybe in schema/shacl/README.md?
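If that command ends up being documented, a small wrapper script might be easier to keep up to date than a long one-liner. The following is only a sketch: the file paths and the `--shapes`/`--data` flags are copied from the command above, while the function names (`concatenate`, `shacl_command`, `run_validation`) are made up for illustration, and `run_validation` assumes the Jena `shacl` tool is on the PATH.

```python
import subprocess
from pathlib import Path

# File lists copied from the command above; paths are relative to the repo root.
DATA_FILES = [
    "vocab/unit/VOCAB_QUDT-UNITS-ALL-v2.1.ttl",
    "vocab/quantitykinds/VOCAB_QUDT-QUANTITY-KINDS-ALL-v2.1.ttl",
    "schema/SCHEMA_QUDT-v2.1.ttl",
    "schema/SCHEMA_QUDT-DATATYPE-v2.1.ttl",
]
SHAPES_FILE = "collections/COLLECTION_QUDT_QA_TESTS_ALL-v2.1.ttl"


def concatenate(sources, target):
    """Append the source files byte for byte into target, like `cat ... > tmp.ttl`."""
    with open(target, "wb") as out:
        for source in sources:
            out.write(Path(source).read_bytes())
    return target


def shacl_command(data_file, shapes_file=SHAPES_FILE):
    """Build the argument list for the `shacl validate` call shown above."""
    return ["shacl", "validate", "--shapes", shapes_file, "--data", str(data_file)]


def run_validation(repo_root="."):
    """Hypothetical entry point: merge the data files, then call the Jena CLI."""
    root = Path(repo_root)
    concatenate([root / name for name in DATA_FILES], root / "tmp.ttl")
    # Run from the repo root so the relative shapes path resolves.
    return subprocess.run(shacl_command("tmp.ttl"), cwd=root, check=False).returncode
```

Calling `run_validation()` from the repository root would then be roughly equivalent to the one-liner.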

fkleedorfer commented 6 months ago

Fixed all SHACL problems and made the last required decisions. I'd say it's done pending a review and then removing the analysis comments. Note that code-comments have been shifted a bit by subsequent commits - most are still roughly in the right spot, but one or two were shifted so far they ended up in the triples of a different unit.

steveray commented 6 months ago

Getting very close now! Here's the validation result. For most of them it seems clear what to do. The last one is because the referenced qk has two :: [image]

I was going to publish the new release today, but if you think you can get these done I can wait until Monday...

steveray commented 6 months ago

Also, don't worry about removing the comments, as they will all be removed by TopBraid Composer when I generate the new release.

fkleedorfer commented 6 months ago

What about those qkdv triples - are they auto-generated with the release?

steveray commented 6 months ago

Unfortunately not. (You raise a good point, though, that they could be).

fkleedorfer commented 6 months ago

ok. So taking a random dv:

qkdv:A0E2L0I0M-1H0T4D0
  a qudt:QuantityKindDimensionVector_ISO ;
  a qudt:QuantityKindDimensionVector_Imperial ;
  a qudt:QuantityKindDimensionVector_SI ;
  qudt:dimensionExponentForAmountOfSubstance 0 ;
  qudt:dimensionExponentForElectricCurrent 2 ;
  qudt:dimensionExponentForLength 0 ;
  qudt:dimensionExponentForLuminousIntensity 0 ;
  qudt:dimensionExponentForMass -1 ;
  qudt:dimensionExponentForThermodynamicTemperature 0 ;
  qudt:dimensionExponentForTime 4 ;
  qudt:dimensionlessExponent 0 ;
  qudt:latexDefinition "\\(M^-1 T^4 I^2\\)"^^qudt:LatexString ;
  rdfs:isDefinedBy <http://qudt.org/2.1/vocab/dimensionvector> ;
  rdfs:label "A0E2L0I0M-1H0T4D0" ;
.

How do I automate this?

  1. I have a pretty good idea of how to get to the exponent triples (throwing regex at the problem)
  2. I can generate the label
  3. I can generate isDefinedBy
  4. but how do I know the classes? Are they all always applicable?

steveray commented 6 months ago

Yes, they are all always applicable. It's only the weird CGS systems that have different base dimensions.
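With the classes settled, the four steps listed above could be sketched outside SPARQL roughly as follows. This is an illustration, not the code used in the PR: the names `PROPERTIES`, `parse_exponents` and `to_turtle` are invented here, the property names are taken from the example vector above, and the regex assumes every exponent is a plain integer (a fractional exponent would be rejected).

```python
import re

# Letters in the order they appear in a dimension vector local name, mapped
# to the exponent properties used in the example vector above.
PROPERTIES = [
    ("A", "qudt:dimensionExponentForAmountOfSubstance"),
    ("E", "qudt:dimensionExponentForElectricCurrent"),
    ("L", "qudt:dimensionExponentForLength"),
    ("I", "qudt:dimensionExponentForLuminousIntensity"),
    ("M", "qudt:dimensionExponentForMass"),
    ("H", "qudt:dimensionExponentForThermodynamicTemperature"),
    ("T", "qudt:dimensionExponentForTime"),
    ("D", "qudt:dimensionlessExponent"),
]
# Assumption: each exponent is an optionally negative integer.
PATTERN = re.compile("".join(rf"{letter}(-?\d+)" for letter, _ in PROPERTIES))
# All three classes are applied to every vector, as confirmed above.
CLASSES = ["ISO", "Imperial", "SI"]


def parse_exponents(local_name):
    """Return a dict such as {"A": 0, "E": 2, ...} with integer exponents."""
    match = PATTERN.fullmatch(local_name)
    if match is None:
        raise ValueError(f"not a dimension vector local name: {local_name!r}")
    return {letter: int(value)
            for (letter, _), value in zip(PROPERTIES, match.groups())}


def to_turtle(local_name):
    """Render the vector as Turtle; the qkdv:, qudt: and rdfs: prefixes are assumed declared."""
    exponents = parse_exponents(local_name)
    lines = [f"qkdv:{local_name}"]
    lines += [f"  a qudt:QuantityKindDimensionVector_{name} ;" for name in CLASSES]
    # Integers are written without quotes so they stay numeric literals.
    lines += [f"  {prop} {exponents[letter]} ;" for letter, prop in PROPERTIES]
    lines.append("  rdfs:isDefinedBy <http://qudt.org/2.1/vocab/dimensionvector> ;")
    lines.append(f'  rdfs:label "{local_name}" ;')
    lines.append(".")
    return "\n".join(lines)
```

Under those assumptions, `to_turtle("A0E2L0I0M-1H0T4D0")` should reproduce the example vector above except for the `qudt:latexDefinition` line.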

fkleedorfer commented 6 months ago

I probably won't manage the latex def. Is that a problem?

steveray commented 6 months ago

Let's go ahead without, and we can back-fill later. We don't validate against that at present.

fkleedorfer commented 6 months ago

ah, got it. Here's the query:

prefix qudt:<http://qudt.org/schema/qudt/>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
construct {
    ?dv
      a qudt:QuantityKindDimensionVector_ISO ;
      a qudt:QuantityKindDimensionVector_Imperial ;
      a qudt:QuantityKindDimensionVector_SI ;
      qudt:dimensionExponentForAmountOfSubstance ?A ;
      qudt:dimensionExponentForElectricCurrent ?E ;
      qudt:dimensionExponentForLength ?L ;
      qudt:dimensionExponentForLuminousIntensity ?I ;
      qudt:dimensionExponentForMass ?M ;
      qudt:dimensionExponentForThermodynamicTemperature ?H ;
      qudt:dimensionExponentForTime ?T ;
      qudt:dimensionlessExponent ?D ;
      qudt:latexDefinition ?latexDefinition;
      rdfs:isDefinedBy <http://qudt.org/2.1/vocab/dimensionvector> ;
      rdfs:label ?dvLocalname ;
}
#select *
where
{
     ?entity qudt:hasDimensionVector ?dv
     optional {
        ?dv ?any ?object
     }
     filter (!(bound(?object)))
     BIND(REPLACE(STR(?dv), "^.+/", "") as ?dvLocalname)
     BIND(REPLACE(?dvLocalname, "A(-?\\d)E-?\\dL-?\\dI-?\\dM-?\\dH-?\\dT-?\\dD-?\\d", "$1") as ?A)
     BIND(REPLACE(?dvLocalname, "A-?\\dE(-?\\d)L-?\\dI-?\\dM-?\\dH-?\\dT-?\\dD-?\\d", "$1") as ?E)
     BIND(REPLACE(?dvLocalname, "A-?\\dE-?\\dL(-?\\d)I-?\\dM-?\\dH-?\\dT-?\\dD-?\\d", "$1") as ?L)
     BIND(REPLACE(?dvLocalname, "A-?\\dE-?\\dL-?\\dI(-?\\d)M-?\\dH-?\\dT-?\\dD-?\\d", "$1") as ?I)
     BIND(REPLACE(?dvLocalname, "A-?\\dE-?\\dL-?\\dI-?\\dM(-?\\d)H-?\\dT-?\\dD-?\\d", "$1") as ?M)
     BIND(REPLACE(?dvLocalname, "A-?\\dE-?\\dL-?\\dI-?\\dM-?\\dH(-?\\d)T-?\\dD-?\\d", "$1") as ?H)
     BIND(REPLACE(?dvLocalname, "A-?\\dE-?\\dL-?\\dI-?\\dM-?\\dH-?\\dT(-?\\d)D-?\\d", "$1") as ?T)
     BIND(REPLACE(?dvLocalname, "A-?\\dE-?\\dL-?\\dI-?\\dM-?\\dH-?\\dT-?\\dD(-?\\d)", "$1") as ?D)
     BIND(CONCAT("http://qudt.org/vocab/dimensionvector/", "A",?A, "E", ?E, "L",?L, "I", ?I, "M",?M, "H",?H,"T",?T,"D",?D) = str(?dv) as ?extractionCorrect)
     BIND(IF(?A != "0", CONCAT(" A^",?A), "") as ?Ax)
     BIND(IF(?E != "0", CONCAT(" E^",?E), "") as ?Ex)
     BIND(IF(?L != "0", CONCAT(" L^",?L), "") as ?Lx)
     BIND(IF(?I != "0", CONCAT(" I^",?I), "") as ?Ix)
     BIND(IF(?M != "0", CONCAT(" M^",?M), "") as ?Mx)
     BIND(IF(?H != "0", CONCAT(" H^",?H), "") as ?Hx)
     BIND(IF(?T != "0", CONCAT(" T^",?T), "") as ?Tx)
     BIND(IF(?D != "0", CONCAT(" D^",?D), "") as ?Dx)
     BIND(CONCAT("\\(", REPLACE(CONCAT( ?Ax, ?Ex, ?Lx, ?Ix, ?Mx, ?Hx, ?Tx), "\\s+", " "), "\\)") as ?latexString)
     BIND(STRDT(?latexString, qudt:LatexString) as ?latexDefinition)
 }
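For readers who find the LaTeX part of the query hard to follow, its logic boils down to roughly the following. This is a sketch with a made-up function name (`latex_definition`); it mirrors the query's choices (letters A, E, L, I, M, H, T in that order, D left out, zero exponents skipped). Note that the hand-written example earlier in the thread uses a different order and the letter I for electric current, so the output will not necessarily match existing definitions.

```python
# Letters in the order the query concatenates them; D is left out,
# just as ?Dx is not used in the query's CONCAT.
SYMBOLS = ["A", "E", "L", "I", "M", "H", "T"]


def latex_definition(exponents):
    """Build a string like \\(E^2 M^-1 T^4\\) from a letter -> integer mapping."""
    terms = [f"{symbol}^{exponents[symbol]}"
             for symbol in SYMBOLS
             if exponents.get(symbol, 0) != 0]
    # Unlike the query, joining with single spaces leaves no leading space.
    return "\\(" + " ".join(terms) + "\\)"
```

For example, a mapping with E = 2, M = -1 and T = 4 (all other exponents zero) gives `\(E^2 M^-1 T^4\)`.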

steveray commented 6 months ago

So close! Can you remove the quotes from the numbers?

[image]

...or just go to bed!

fkleedorfer commented 6 months ago

(done)

I have another PR coming up once the data is reserialized. I'm not sure I can swing it immediately, but I believe it's a lot again, so it might be worth waiting for that before you release.

The PR would be about missing/wrong conversion multipliers in derived units.

steveray commented 6 months ago

@fkleedorfer, tremendous job! Zero validation errors. I have made some notes to go back later to consider some of the issues you raised about the need for some similar/related quantity kinds, but I think that can wait for now. I plan to go ahead with publishing a new release on Monday, because otherwise the holidays will overtake me for the December release. So don't feel pressured to rush through the derived units work.

Again, thanks so much for the considerable work you have put in here!