Closed lewismc closed 4 years ago
Due to existing prefixes such as pstate, sstate and state I did not find a way to automate this without introducing more errors. This was done manually!
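For reference, the collision described above (a rename of state: also hitting pstate: and sstate:) comes from plain substring replacement. A minimal sketch of a boundary-aware rename follows. It is not what was used in this PR; the function name `rename_prefix` and the new prefix `sostate` are hypothetical, and the regex does not protect IRIs or string literals that happen to contain the old prefix followed by a colon.

```python
import re


def rename_prefix(turtle_text: str, old: str, new: str) -> str:
    """Rename the prefix `old:` to `new:` without touching longer
    prefixes that merely end in `old` (e.g. `pstate:` when old is `state`).

    Hypothetical helper for illustration only.
    """
    # The negative lookbehind rejects a match that is preceded by a
    # character that could belong to a prefix name, so `state:` is not
    # matched inside `pstate:` or `sstate:`.
    pattern = re.compile(r"(?<![\w.-])" + re.escape(old) + r":")
    return pattern.sub(new + ":", turtle_text)


# Made-up sample; the class names are placeholders, not SWEET terms.
sample = (
    "@prefix state: <http://sweetontology.net/state/> .\n"
    "@prefix pstate: <http://sweetontology.net/statePhysical/> .\n"
    "state:ExampleA rdfs:subClassOf pstate:ExampleB .\n"
)

naive = sample.replace("state:", "sostate:")    # also rewrites pstate:
careful = rename_prefix(sample, "state", "sostate")

print(naive)
print(careful)
```

The naive version turns pstate: into psostate:, which is the kind of error a grep-style replacement introduces; the lookbehind version leaves it alone. The result would still need the same isomorphism check and peer review as a manual edit.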
I'll update this PR soon. I'm about 50% through! yikes.
This PR is a beast but it is not complex. It just requires thorough peer review. It's not yet finished. I will indicate here once it is ready for review.
I'm adding this here so I don't forget.
I have noticed the past several files have the NS declaration for /rela/ changed, but the actual NS in the file remains 'rela:'.
Also, /relaTime/ has the NS declaration changed to 'sorelt:', but the namespace in the file is 'tsorel:'. I will need to re-check the 50 or so files I already 'verified', as I didn't notice that until now.
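A mismatch like this (prefix declared under one name, used under another) can be spotted mechanically. The following is a rough sketch, assuming simple Turtle files: it compares declared prefixes against prefixes that appear in the body. The function name `check_prefixes` is hypothetical, and the regex-based stripping of literals, IRIs and comments is a simplification, not a real Turtle parser.

```python
import re

# A prefix declaration line: `@prefix foo: <iri> .` or `PREFIX foo: <iri>`
DECL_RE = re.compile(
    r"^\s*@?prefix\s+([\w.-]*):\s*<[^>]*>\s*\.?\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def check_prefixes(turtle_text):
    """Return (used_but_undeclared, declared_but_unused) as two sets.

    Hypothetical helper for illustration only.
    """
    declared = set(DECL_RE.findall(turtle_text))
    body = DECL_RE.sub("", turtle_text)
    # Drop content that may contain colons but is not a prefixed name.
    body = re.sub(r'"""[\s\S]*?"""', " ", body)        # long string literals
    body = re.sub(r'"(?:[^"\\\n]|\\.)*"', " ", body)   # short string literals
    body = re.sub(r"<[^<>\s]*>", " ", body)            # full IRIs
    body = re.sub(r"#.*", " ", body)                   # comments
    # A prefix is a name immediately followed by a colon; blank node
    # labels (`_:b0`) are skipped because they do not start with a letter.
    used = set(re.findall(r"(?<![\w.:-])([A-Za-z][\w.-]*):", body))
    return used - declared, declared - used


# Made-up sample mirroring the relaTime case; `exampleProperty` is a placeholder.
sample = (
    "@prefix sorelt: <http://sweetontology.net/relaTime/> .\n"
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .\n"
    "\n"
    "tsorel:exampleProperty a owl:ObjectProperty .\n"
)

undeclared, unused = check_prefixes(sample)
print("used but not declared:", sorted(undeclared))
print("declared but not used:", sorted(unused))
```

A prefix in the first set makes the file unparseable; one in the second set is only clutter, but in this PR it usually signals a declaration that was renamed without updating the body.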
@brandonnodnarb what is the current status here? It looks like there are still quite a few files to be addressed... is this a shared understanding?
Thanks for all of the contributions. Really appreciated.
@lewismc I believe I've checked all /matr and /state files. These should be good to go.
I've been working in /phen* and should finish tomorrow (late).
I checked /human*, but I'll need to go through those again (hopefully quickly) to make sure I accounted for a few random bits I hadn't checked for initially.
I should be able to get through at least another section/topic by end of week.
I think I've sorted automating the remainder. I'll test on a local copy in the morning.
Hi @brandonnodnarb I've taken a pass through the entire ontology suite twice. My most recent commit fixes a few bugs I found, removes unused prefixes from some files and updates every prefix to reflect what we have in the SHACL file. What a task... thank you very much for tackling a huge portion of this as well. That took just over one week to do. Please let me know if and when you are happy with this. Honestly, I think we would be better off inviting @ESIPFed/semtech to review it as well.
The latest push resolved the following:
- anim in matrAnimal.ttl
- biol in humanKnowledgeDomain.ttl
- biol in matrAnimal.ttl
- biol in matrBiomass.ttl
- com in phenHydro.ttl
- com in humanEnvirConservation.ttl
- com in humanEnvirControl.ttl
- con in realmGeolConstituent.ttl
- con in humanEnvirControl.ttl
- dir in reprSpaceDirection.ttl
- dir in reprSpaceCoordinate.ttl
- graph in humanDecision.ttl
- hum in matrEquipment.ttl
- hum in humanJurisdiction.ttl
- human in matrWater.ttl
- human in humanEnvirConservation.ttl
- human in humanDecision.ttl
- jur in humanEnvirStandards.ttl
- jur in humanJurisdiction.ttl
- land in humanEnvirStandards.ttl
- land in realmLandOrographic.ttl
- land in realmLandGlacial.ttl
- land in realmLandTectonic.ttl
- land in realmLandCoastal.ttl
- land in realmLandAeolian.ttl
- land in realmLandFluvial.ttl
- land in realmLandform.ttl
- matr in realmGeolContinental.ttl
- oper in humanDecision.ttl
- phen in phenFluidTransport.ttl
- realm in matrWater.ttl
- realm in phenEnvirImpact.ttl
- realm in matrNaturalResource.ttl
- realm in phenHydro.ttl
- realm in matrEquipment.ttl
- realm in humanEnvirConservation.ttl
- realm in realm.ttl
- rela: (23 ttl files)
- repr in repr.ttl
- repr in humanResearch.ttl
- repr in humanTechReadiness.ttl
- res in humanEnvirAssessment.ttl
- res in humanResearch.ttl
- res in reprDataProduct.ttl
- srela2 in reprSpaceCoordinate.ttl
- time in realmClimateZone.ttl
- trela: in humanKnowledgeDomain.ttl
- xten2 in matrWater.ttl
I’m also +1 on merging this. Let’s wait another 72 hrs minimum before moving ahead. Thanks Brandon
@carueda this file may help: SWEET-filename_oldns_uri_newns.txt
Re sweetAll.ttl (just starting with the easiest one for now ;)
Nothing critical, but:
- soall is not used
- xml, xsd
xml: is used in only two places: in catalog-v001.xml, and for a language tag in sweet.owl. Not needed as a general prefix.
xsd: is used in a bunch of places for datatypes. Leave it be.
soall: prefix is not needed.
@dr-shorthair let's address this in a separate ticket. xml: is used in hundreds of places.
(re xml: all except two are merely the PREFIX: declaration)
With the help of check_isomorphic.sc (which I just added to sweet-tools), I ran a comparison between the files under the master branch (under src/ below, on my local machine) and the corresponding ones from lewismc:ISSUE-163 (under src_branch/ below).
The comparison is based on Jena's isIsomorphicWith method.
As you can see in the report below, four files fail the isIsomorphicWith check, and six others cannot be loaded due to errors. I haven't actually reviewed the affected files per se yet, but hope this helps in the meantime.
$ ./check_isomorphic.sc ../../sweet/src ../../sweet/src_branch
- human.ttl √
- humanAgriculture.ttl √
- humanCommerce.ttl √
- humanDecision.ttl
ERROR: src_branch/humanDecision.ttl: [line: 227, col: 1 ] Undefined prefix: prop
- humanEnvirAssessment.ttl √
- humanEnvirConservation.ttl
ERROR: src_branch/humanEnvirConservation.ttl: [line: 104, col: 1 ] Undefined prefix: prop
- humanEnvirControl.ttl √
- humanEnvirStandards.ttl √
- humanJurisdiction.ttl √
- humanKnowledgeDomain.ttl √
- humanResearch.ttl √
- humanTechReadiness.ttl √
- humanTransportation.ttl √
- matr.ttl √
- matrAerosol.ttl √
- matrAnimal.ttl √
- matrBiomass.ttl √
- matrCompound.ttl √
- matrElement.ttl √
- matrElementalMolecule.ttl √
- matrEnergy.ttl √
- matrEquipment.ttl √
- matrFacility.ttl √
- matrIndustrial.ttl √
- matrInstrument.ttl √
- matrIon.ttl √
- matrIsotope.ttl √
- matrMicrobiota.ttl √
- matrMineral.ttl √
- matrNaturalResource.ttl √
- matrOrganicCompound.ttl √
- matrParticle.ttl √
- matrPlant.ttl √
- matrRock.ttl √
- matrRockIgneous.ttl √
- matrSediment.ttl √
- matrWater.ttl √
- phen.ttl √
- phenAtmo.ttl √
- phenAtmoCloud.ttl √
- phenAtmoFog.ttl √
- phenAtmoFront.ttl √
- phenAtmoLightning.ttl √
- phenAtmoPrecipitation.ttl √
- phenAtmoPressure.ttl √
- phenAtmoSky.ttl √
- phenAtmoTransport.ttl √
- phenAtmoWind.ttl √
- phenAtmoWindMesoscale.ttl √
- phenBiol.ttl √
- phenCryo.ttl √
- phenCycle.ttl √
- phenCycleMaterial.ttl √
- phenEcology.ttl √
- phenElecMag.ttl √
- phenEnergy.ttl √
- phenEnvirImpact.ttl √
- phenFluidDynamics.ttl √
- phenFluidInstability.ttl √
- phenFluidTransport.ttl √
- phenGeol.ttl √
- phenGeolFault.ttl √
- phenGeolGeomorphology.ttl √
- phenGeolSeismicity.ttl √
- phenGeolTectonic.ttl √
- phenGeolVolcano.ttl √
- phenHelio.ttl √
- phenHydro.ttl NOT ISOMORPHIC
- phenMixing.ttl √
- phenOcean.ttl √
- phenOceanCoastal.ttl √
- phenOceanDynamics.ttl √
- phenPlanetClimate.ttl √
- phenReaction.ttl √
- phenSolid.ttl √
- phenStar.ttl √
- phenSystem.ttl √
- phenSystemComplexity.ttl √
- phenWave.ttl √
- phenWaveNoise.ttl √
- proc.ttl √
- procChemical.ttl √
- procPhysical.ttl √
- procStateChange.ttl √
- procWave.ttl √
- prop.ttl √
- propBinary.ttl √
- propCapacity.ttl √
- propCategorical.ttl √
- propCharge.ttl √
- propChemical.ttl √
- propConductivity.ttl √
- propCount.ttl √
- propDifference.ttl √
- propDiffusivity.ttl √
- propDimensionlessRatio.ttl √
- propEnergy.ttl √
- propEnergyFlux.ttl √
- propFraction.ttl √
- propFunction.ttl √
- propIndex.ttl √
- propMass.ttl √
- propMassFlux.ttl √
- propOrdinal.ttl √
- propPressure.ttl √
- propQuantity.ttl √
- propRotation.ttl
ERROR: src_branch/propRotation.ttl: [line: 8, col: 9 ] @prefix or PREFIX requires a prefix (found '[KEYWORD:soproptf]')
- propSpace.ttl √
- propSpaceDirection.ttl √
- propSpaceDistance.ttl √
- propSpaceHeight.ttl √
- propSpaceLocation.ttl √
- propSpaceMultidimensional.ttl √
- propSpaceThickness.ttl √
- propSpeed.ttl √
- propTemperature.ttl √
- propTemperatureGradient.ttl √
- propTime.ttl √
- propTimeFrequency.ttl √
- realm.ttl √
- realmAstroBody.ttl √
- realmAstroHelio.ttl √
- realmAstroStar.ttl √
- realmAtmo.ttl √
- realmAtmoBoundaryLayer.ttl √
- realmAtmoWeather.ttl √
- realmBiolBiome.ttl √
- realmClimateZone.ttl
ERROR: src_branch/realmClimateZone.ttl: [line: 535, col: 1 ] Undefined prefix: sorept
- realmCryo.ttl √
- realmEarthReference.ttl √
- realmGeol.ttl √
- realmGeolBasin.ttl √
- realmGeolConstituent.ttl
ERROR: src_branch/realmGeolConstituent.ttl: [line: 27, col: 1 ] Undefined prefix: soreagcons
- realmGeolContinental.ttl √
- realmGeolOceanic.ttl √
- realmGeolOrogen.ttl √
- realmHydro.ttl √
- realmHydroBody.ttl √
- realmLandAeolian.ttl √
- realmLandCoastal.ttl √
- realmLandFluvial.ttl √
- realmLandGlacial.ttl √
- realmLandOrographic.ttl √
- realmLandProtected.ttl √
- realmLandTectonic.ttl √
- realmLandVolcanic.ttl √
- realmLandform.ttl √
- realmOcean.ttl √
- realmOceanFeature.ttl √
- realmOceanFloor.ttl √
- realmRegion.ttl √
- realmSoil.ttl NOT ISOMORPHIC
- rela.ttl √
- relaChemical.ttl √
- relaClimate.ttl √
- relaHuman.ttl √
- relaMath.ttl
ERROR: src_branch/relaMath.ttl: [line: 35, col: 1 ] Undefined prefix: sorelm
- relaPhysical.ttl √
- relaProvenance.ttl √
- relaSci.ttl √
- relaSpace.ttl √
- relaTime.ttl √
- repr.ttl NOT ISOMORPHIC
- reprDataFormat.ttl √
- reprDataModel.ttl √
- reprDataProduct.ttl √
- reprDataService.ttl √
- reprDataServiceAnalysis.ttl √
- reprDataServiceGeospatial.ttl √
- reprDataServiceReduction.ttl √
- reprDataServiceValidation.ttl √
- reprMath.ttl √
- reprMathFunction.ttl √
- reprMathFunctionOrthogonal.ttl √
- reprMathGraph.ttl √
- reprMathOperation.ttl √
- reprMathSolution.ttl √
- reprMathStatistics.ttl √
- reprSciComponent.ttl √
- reprSciFunction.ttl √
- reprSciLaw.ttl √
- reprSciMethodology.ttl √
- reprSciModel.ttl √
- reprSciProvenance.ttl √
- reprSciUnits.ttl √
- reprSpace.ttl √
- reprSpaceCoordinate.ttl √
- reprSpaceDirection.ttl NOT ISOMORPHIC
- reprSpaceGeometry.ttl √
- reprSpaceGeometry3D.ttl √
- reprSpaceReferenceSystem.ttl √
- reprTime.ttl √
- reprTimeDay.ttl √
- reprTimeSeason.ttl √
- state.ttl √
- stateBiological.ttl √
- stateChemical.ttl √
- stateDataProcessing.ttl √
- stateEnergyFlux.ttl √
- stateFluid.ttl √
- stateOrdinal.ttl √
- statePhysical.ttl √
- stateRealm.ttl √
- stateRole.ttl √
- stateRoleBiological.ttl √
- stateRoleChemical.ttl √
- stateRoleGeographic.ttl √
- stateRoleImpact.ttl √
- stateRoleRepresentative.ttl √
- stateRoleTrust.ttl √
- stateSolid.ttl √
- stateSpace.ttl √
- stateSpaceConfiguration.ttl √
- stateSpaceScale.ttl √
- stateSpectralBand.ttl √
- stateSpectralLine.ttl √
- stateStorm.ttl √
- stateSystem.ttl √
- stateThermodynamic.ttl √
- stateTime.ttl √
- stateTimeCycle.ttl √
- stateTimeFrequency.ttl √
- stateTimeGeologic.ttl √
- stateVisibility.ttl √
- sweetAll.ttl √
215 isomorphic files out of 225
This is beautiful @carueda, I'll go in and make the necessary changes and push an update.
Current issues reduced to 3
lmcgibbn@MT-207576 ~/Downloads/sweet-tools/sc(master) $ ./check_isomorphic.sc ../../sweet/src/ ../../sweet_orig/src/
...
- phenHydro.ttl NOT ISOMORPHIC
- realmSoil.ttl NOT ISOMORPHIC
- reprSpaceDirection.ttl NOT ISOMORPHIC
...
222 isomorphic files out of 225
I tried to manually check phenHydro.ttl and realmSoil.ttl with no luck. It turns out that reprSpaceDirection.ttl has had some unused prefixes removed which seems to have screwed with things as well. Not long to go now.
@lewismc Looks like those are false negatives .. EDIT: scratch that! I had a typo in my script!
I just included a "diff" report based on the n-triples version of the model for each failed isomorphic check.
$ ./check_isomorphic.sc ../../sweet/src ../../sweet/src_branch | grep 'NOT ISO'
- phenHydro.ttl NOT ISOMORPHIC, see phenHydro.ttl.diff
- realmSoil.ttl NOT ISOMORPHIC, see realmSoil.ttl.diff
- reprSpaceDirection.ttl NOT ISOMORPHIC, see reprSpaceDirection.ttl.diff
Each .diff will show the triples that are not common between the compared models.
Example, phenHydro.ttl.diff:
2 triples in 1st model but not in the 2nd:
<http://sweetontology.net/realmHydro/Aquifer> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://sweetontology.net/realmHydro/UndergroundWater> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
2 triples in 2nd model but not in the 1st:
<http://sweetontology.net/realm/Aquifer> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://sweetontology.net/realm/UndergroundWater> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
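A report of this shape can be approximated by treating each N-Triples file as a set of lines and taking the set difference in both directions. The sketch below is not the check_isomorphic.sc implementation (which relies on Jena); the function names are hypothetical. It optionally skips triples that involve blank nodes, since their labels are local to each file and would otherwise show up as spurious differences. This is weaker than a real isomorphism check and is only meant to surface candidate triples to inspect.

```python
import re

# A blank node in the object position, e.g. `<s> <p> _:b0 .`
_BNODE_OBJECT = re.compile(r"\s_:\S+\s*\.$")


def triple_lines(ntriples_text, skip_blank_nodes=True):
    """Collect the triples of an N-Triples document as a set of lines."""
    triples = set()
    for raw in ntriples_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # ignore empty lines and comments
        has_bnode = line.startswith("_:") or bool(_BNODE_OBJECT.search(line))
        if skip_blank_nodes and has_bnode:
            continue
        triples.add(line)
    return triples


def diff_triples(first, second, skip_blank_nodes=True):
    """Return (only_in_first, only_in_second) as sorted lists of lines."""
    a = triple_lines(first, skip_blank_nodes)
    b = triple_lines(second, skip_blank_nodes)
    return sorted(a - b), sorted(b - a)


RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
OWL_CLASS = "<http://www.w3.org/2002/07/owl#Class>"

# Two small inputs built from the Aquifer triples in the report above.
# The blank node lines are made up to show why they are skipped.
master = (
    f"<http://sweetontology.net/realmHydro/Aquifer> {RDF_TYPE} {OWL_CLASS} .\n"
    f"_:b0 {RDF_TYPE} {OWL_CLASS} .\n"
)
branch = (
    f"<http://sweetontology.net/realm/Aquifer> {RDF_TYPE} {OWL_CLASS} .\n"
    f"_:genid7 {RDF_TYPE} {OWL_CLASS} .\n"
)

only_master, only_branch = diff_triples(master, branch)
print(f"{len(only_master)} triples in 1st model but not in the 2nd:")
print("\n".join(only_master))
print(f"{len(only_branch)} triples in 2nd model but not in the 1st:")
print("\n".join(only_branch))
```

With blank nodes skipped, what remains here is the realmHydro versus realm namespace difference, which points at a prefix that was renamed to the wrong namespace.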
Re realmSoil.ttl: it seems that the diff is because of the blank nodes (which is not a surprise).
However, for triples not involving blank nodes I can only see this diff:
<http://sweetontology.net/realmSoil/SoilOrder> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://sweetontology.net/propCategorical/Classification> .
<http://sweetontology.net/realmSoil/SoilOrder> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://sweetontology.net/realmSoil/Classification> .
Re phenHydro.ttl: (again, ignoring blank nodes)
<http://sweetontology.net/realmHydro/Aquifer> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://sweetontology.net/realm/Aquifer> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
and
<http://sweetontology.net/realmHydro/UndergroundWater> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
<http://sweetontology.net/realm/UndergroundWater> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/2002/07/owl#Class> .
So, reprSpaceDirection.ttl is already fixed and pushed to this PR.
EDIT: I mean, in terms of both versions being isomorphic (more concretely, identical in terms of the n-triples representation -- I haven't looked at the prefixes themselves.)
Hi @carueda please pull the most recent commit locally, then re-run ./check_isomorphic.sc and post your diff result for realmSoil.ttl please. Thanks
<http://sweetontology.net/realmSoil/SoilOrder> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://sweetontology.net/propCategorical/Classification> .
<http://sweetontology.net/realmSoil/SoilOrder> <http://www.w3.org/2000/01/rdf-schema#subClassOf> <http://sweetontology.net/realmSoil/Classification> .
OK I fixed the final issue. It looks like it was down to another grep replacement gone wrong... but that was to be expected.
I'm +1 on merging.
Does anyone have further comments on this PR? We've not had any further peer review in around a week. I would like to get working on #169 and this is basically blocking that now.
Thanks everyone for the review and contributions. This was pretty major!
First batch of updates to address #163
This only updates **state*** and sweetAll