Depart-de-Sentier / brightcon-2024-material

Talks and presentation materials from the Brightcon 2024 conference and hackathon

Semantic unit conversion app using Sentier.dev platform #1

Open cmutel opened 2 months ago

cmutel commented 2 months ago

Overview

We have a database with units, code for a specific type of unit conversion, and the need for a general API for converting units in the future.

User stories

Let's build a webapp that can do the following:

And finally, tie all this together so you can start typing a unit, pick the right one from a dropdown, and then get tables of conversion factors for each system we include.

Unit systems

The unit systems we already have:

Unit systems I would like to have:

Tasks

Stretch goals:

Skosmos search

This is possibly hard. We have a search index via Skosmos (which should also have an API), but it only searches on prefLabel (see the search results for btu versus british), and maybe on altLabel. We currently use notation ("Notations are symbols which are not normally recognizable as words or sequences of words in any natural language and are thus usable independently of natural-language contexts"), but we could change these to altLabel, or add altLabel in addition to notation. These are strings, even if they have custom data types, so they should be fine as instances of RDF plain literals.
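
To make the prefLabel limitation concrete, here is a minimal in-memory sketch in Python. It is not how Skosmos works internally. The `search` function and its `fields` parameter are hypothetical, and the record is a shortened kilogram entry used only for illustration. It shows that a query for a symbol only matches once notation (or altLabel) is included in the searched fields.

```python
def search(units, query, fields=("prefLabel",)):
    """Case-insensitive substring search over the chosen attributes."""
    needle = query.casefold()
    hits = []
    for iri, unit in units.items():
        for field in fields:
            values = unit.get(field, [])
            if isinstance(values, str):  # tolerate a single string value
                values = [values]
            if any(needle in value.casefold() for value in values):
                hits.append(iri)
                break  # one match per unit is enough
    return hits


KILOGRAM = "https://vocab.sentier.dev/qudt/unit/KiloGM"
units = {KILOGRAM: {"prefLabel": ["kilogram"], "notation": ["kg", "KGM"]}}

label_only = search(units, "kgm")
with_notation = search(units, "kgm", fields=("prefLabel", "altLabel", "notation"))
print(label_only, with_notation)
```

Searching on labels alone finds nothing for "kgm", while adding notation returns the kilogram IRI.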

cmutel commented 2 months ago

The preliminary plan is to develop a new UI and API using React and FastAPI, and to have our own search index using something like Elasticsearch. We chose not to build on Skosmos because we can move more quickly by building a more targeted user experience with specific and complicated SPARQL queries, and because we want people to think about building apps on top of our data products (this is a good example).

We have three API endpoints in mind:

cmutel commented 2 months ago

Hackathon team:

cmutel commented 2 months ago

Quick update from my side: We have an initial unit endpoint available (PR), and this pulls all data for all units of the same quantity kind as the input unit.

The output is a JSON map with unit IRIs as keys and lists of (attribute, value) pairs as values. This needs to be a list because the same attribute can be present more than once. Here is an example:

{
    "https://vocab.sentier.dev/qudt/unit/M-SEC": [
        [
            "type",
            "Concept"
        ],
        [
            "prefLabel",
            "Metre second"
        ],
        [
            "prefLabel",
            "Meter second"
        ],
        [
            "notation",
            "ms"
        ],
        [
            "notation",
            "m.s"
        ],
        [
            "inScheme",
            "https://vocab.sentier.dev/qudt/"
        ],
        [
            "broader",
            "https://vocab.sentier.dev/qudt/quantity-kind/LengthTime"
        ],
        [
            "narrower",
            "https://vocab.sentier.dev/qudt/unit/M-YR"
        ],
        [
            "definition",
            "Meter over one second"
        ],
        [
            "broaderTransitive",
            "https://vocab.sentier.dev/qudt/quantity-kind/LengthTime"
        ],
        [
            "narrowerTransitive",
            "https://vocab.sentier.dev/qudt/unit/M-YR"
        ],
        [
            "hasDimensionVector",
            "http://qudt.org/vocab/dimensionvector/A0E0L1I0M0H0T1D0"
        ],
        [
            "applicableSystem",
            "http://qudt.org/vocab/sou/SI"
        ],
        [
            "applicableSystem",
            "http://qudt.org/vocab/sou/CGS"
        ],
        [
            "conversionMultiplier",
            "1.0"
        ],
        [
            "conversionMultiplierSN",
            "1.0e0"
        ],
        [
            "hasQuantityKind",
            "https://vocab.sentier.dev/qudt/quantity-kind/LengthTime"
        ]
    ]
}
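
A client will probably want to look attributes up by name rather than scan the pair list. The following Python sketch groups the pairs into a mapping from attribute to list of values, so repeated attributes such as prefLabel and notation are preserved. The helper name `group_attributes` is an assumption, and the sample is a shortened copy of the M-SEC record above.

```python
from collections import defaultdict


def group_attributes(pairs):
    """Collect repeated attributes into lists, keeping first-seen order."""
    grouped = defaultdict(list)
    for attribute, value in pairs:
        grouped[attribute].append(value)
    return dict(grouped)


# Shortened copy of the M-SEC record from the example response.
sample = {
    "https://vocab.sentier.dev/qudt/unit/M-SEC": [
        ["type", "Concept"],
        ["prefLabel", "Metre second"],
        ["prefLabel", "Meter second"],
        ["notation", "ms"],
        ["notation", "m.s"],
        ["conversionMultiplier", "1.0"],
    ]
}

units = {iri: group_attributes(pairs) for iri, pairs in sample.items()}
print(units["https://vocab.sentier.dev/qudt/unit/M-SEC"]["prefLabel"])
# prints ['Metre second', 'Meter second']
```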

We could imagine having two tables, one with real units (anything not IMPERIAL, PLANCK, USCS), and the other one with the weird stuff.
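
One possible way to make that split, sketched in Python. It assumes the system can be read from the last path segment of each applicableSystem IRI, and that a unit goes into the second table only when every system it lists is one of the three excluded ones. Both rules are assumptions, and the two example.org records are made up purely to exercise the split.

```python
EXCLUDED_SYSTEMS = {"IMPERIAL", "PLANCK", "USCS"}


def system_names(unit):
    """Short names (last IRI segment) of a unit's applicableSystem values."""
    systems = unit.get("applicableSystem", [])
    if isinstance(systems, str):
        systems = [systems]
    return {iri.rstrip("/").rsplit("/", 1)[-1] for iri in systems}


def split_tables(units):
    """Return (main, other); 'other' holds units listed only in excluded systems."""
    main, other = {}, {}
    for iri, unit in units.items():
        names = system_names(unit)
        if names and names <= EXCLUDED_SYSTEMS:
            other[iri] = unit
        else:
            main[iri] = unit
    return main, other


units = {
    # Systems copied from the M-SEC example in this thread.
    "https://vocab.sentier.dev/qudt/unit/M-SEC": {
        "applicableSystem": [
            "http://qudt.org/vocab/sou/SI",
            "http://qudt.org/vocab/sou/CGS",
        ]
    },
    # Hypothetical records, not taken from the vocabulary.
    "https://example.org/unit/imperial-only": {
        "applicableSystem": [
            "https://example.org/sou/IMPERIAL",
            "https://example.org/sou/USCS",
        ]
    },
    "https://example.org/unit/no-system": {},
}

main, other = split_tables(units)
print(sorted(other))
```

Units without any applicableSystem stay in the main table here. Whether that is the right default is an open question.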

We can place restrictions on the languages of the string literals returned, see the API docs. I think we need to do this as we can have more than one prefLabel for each concept, and then the UI doesn't know which one to display. In the above case, one has the language string en_GB and the other en_US (this was me being a bit pedantic 😛, but also trying to improve the search data).
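
If language filtering were instead done on the client, the UI would need a preference order. The Python sketch below assumes labels arrive as (text, language) pairs, which the current response does not provide. The `pick_label` helper is hypothetical, and which spelling carries which tag is inferred from the spelling itself.

```python
def pick_label(labels, preferred=("en_GB", "en_US")):
    """Return the first label whose language tag matches the preference order."""
    by_language = {language: text for text, language in labels}
    for language in preferred:
        if language in by_language:
            return by_language[language]
    # Fall back to the first label so the UI always has something to show.
    return labels[0][0] if labels else None


# Assumed tagging: British spelling as en_GB, American spelling as en_US.
labels = [("Metre second", "en_GB"), ("Meter second", "en_US")]

british = pick_label(labels)
american = pick_label(labels, preferred=("en_US",))
print(british, "|", american)
```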

My initial idea was that we would display different tables with the conversion factors, with a separate table for each alternative system, such as SimaPro, ecoinvent, etc. I now think that that is a bad idea. We want to support interoperability but also encourage harmonisation to a common standard. So instead I think we should only have a single table, like:

| Label | Synonyms | IRI (click to copy) | SimaPro | Ecoinvent | LCA Commons |
| --- | --- | --- | --- | --- | --- |
| Kilogram | kg, KGM | https://vocab.sentier.dev/qudt/unit/KiloGM | kg | kg | kg |

This has implications for the database. I came to this conclusion because I was starting with ecoinvent data, and didn't want to create a separate system for them the way we did for SimaPro.

Please provide feedback so I am not yelling into the void 📢

janfeitkenhauer commented 2 months ago

Good thing you added the JSON response. I will work with that until the server is up and running and the endpoint can be accessed from the frontend.

Also, I like the idea of displaying only one table. It is clean and easy for the user to grasp, with no restrictions on which system they are using. If we find that information is missing, we can easily adapt. We should definitely add language restrictions!

janfeitkenhauer commented 2 months ago

So, on my way home I thought about the data structure of the JSON response. The first positions should contain units of the metric system, i.e. the International System of Units (SI), always starting with the reference unit. (Kudos to those who are not using the metric system for the 7 dimensions included. You exceed my level of skill and are therefore very able to look further down for your unit. 🤓)

| Symbol | Name | Quantity |
| --- | --- | --- |
| s | second | time |
| m | metre | length |
| kg | kilogram | mass |
| A | ampere | electric current |
| K | kelvin | thermodynamic temperature |
| mol | mole | amount of substance |
| cd | candela | luminous intensity |

All units around the base units (those with prefixes such as mega, kilo, milli, etc.) should be displayed below the reference unit, unordered. Below them come all remaining units, also unordered. There are additional units (e.g. for velocity) which should follow the same principle. Please ask for clarification if necessary.
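
The proposed ordering can be sketched as a three-level sort key in Python. This rests on two assumptions that nobody in the thread has confirmed: SI membership is detected from an applicableSystem IRI ending in `/sou/SI`, and the reference unit is the SI unit whose conversionMultiplier equals 1. Each record is assumed to be a mapping from attribute to value, and the example.org record is hypothetical.

```python
from decimal import Decimal


def display_rank(unit):
    """0 = SI reference unit, 1 = other SI units, 2 = everything else."""
    systems = unit.get("applicableSystem", [])
    if isinstance(systems, str):
        systems = [systems]
    is_si = any(iri.endswith("/sou/SI") for iri in systems)
    is_reference = Decimal(unit.get("conversionMultiplier", "0")) == 1
    if is_si and is_reference:
        return 0
    return 1 if is_si else 2


def order_for_display(units):
    """Stable sort: units within a group keep their incoming order."""
    return sorted(units.items(), key=lambda item: display_rank(item[1]))


units = {
    "https://vocab.sentier.dev/qudt/unit/CARAT": {
        "applicableSystem": "http://qudt.org/vocab/sou/CGS",
        "conversionMultiplier": "0.0002",
    },
    # Hypothetical prefixed SI unit, made up for illustration.
    "https://example.org/unit/prefixed-si": {
        "applicableSystem": ["http://qudt.org/vocab/sou/SI"],
        "conversionMultiplier": "0.001",
    },
    "https://vocab.sentier.dev/qudt/unit/KiloGM": {
        "applicableSystem": [
            "http://qudt.org/vocab/sou/SI",
            "http://qudt.org/vocab/sou/CGS",
        ],
        "conversionMultiplier": "1.0",
    },
}

ordered_iris = [iri for iri, _ in order_for_display(units)]
print(ordered_iris)
```

Because Python's sort is stable, the "unordered" groups simply keep whatever order the API returned.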

To answer the question of how many unit pages we need, we agreed on a separate endpoint that provides an array of all base units to be considered, or something similar. With that response, the frontend should be able to render the unit pages dynamically.

That's it from me for today. In the upcoming days I will refine the frontend and also commit the code on GitHub. For the initial commit I'd like some support as to where to put the client data, as you guys have already made the commits for the backend.

Cheers!

janfeitkenhauer commented 1 month ago

Updated JSON response.


{
    "https://vocab.sentier.dev/qudt/unit/AMU": {
        "type": "Concept",
        "prefLabel": "Atomic mass unit",
        "notation": [
            "amu",
            "u",
            "D43"
        ],
        "inScheme": "https://vocab.sentier.dev/qudt/",
        "broader": "https://vocab.sentier.dev/qudt/unit/KiloGM",
        "definition": "The $\\textit{Unified Atomic Mass Unit}$ (symbol: $\\mu$) or $\\textit{dalton}$ (symbol: Da) is a unit that is used for indicating mass on an atomic or molecular scale. It is defined as one twelfth of the rest mass of an unbound atom of carbon-12 in its nuclear and electronic ground state, and has a value of $1.660538782(83) \\times 10^{-27} kg$.  One $Da$ is approximately equal to the mass of one proton or one neutron. The CIPM have categorised it as a $\\textit{\"non-SI unit whose values in SI units must be obtained experimentally\"}$.",
        "broaderTransitive": [
            "https://vocab.sentier.dev/qudt/unit/KiloGM",
            "https://vocab.sentier.dev/qudt/quantity-kind/Mass"
        ],
        "exactMatch": "http://qudt.org/vocab/unit/AMU",
        "hasDimensionVector": "http://qudt.org/vocab/dimensionvector/A0E0L0I0M1H0T0D0",
        "informativeReference": "http://en.wikipedia.org/wiki/Atomic_mass_unit",
        "conversionMultiplier": "0.00000000000000000000000000166053878283",
        "conversionMultiplierSN": "1.660539E-27",
        "hasQuantityKind": "https://vocab.sentier.dev/qudt/quantity-kind/Mass"
    },
    "https://vocab.sentier.dev/qudt/unit/KiloGM": {
        "type": "Concept",
        "prefLabel": [
            "كيلوغرام",
            "килограм",
            "kilogram",
            "Kilogramm",
            "χιλιόγραμμο",
            "kilogram",
            "kilogramo",
            "کیلوگرم",
            "kilogramme",
            "קילוגרם",
            "किलोग्राम",
            "kilogramm*",
            "chilogrammo",
            "キログラム",
            "chiliogramma",
            "kilogram",
            "kilogram",
            "quilograma",
            "kilogram",
            "килограмм",
            "kilogram",
            "kilogram",
            "公斤"
        ],
        "notation": [
            "0112/2///62720#UAA594",
            "0112/2///62720#UAD720",
            "kg",
            "KGM"
        ],
        "inScheme": "https://vocab.sentier.dev/qudt/",
        "broader": "https://vocab.sentier.dev/qudt/quantity-kind/Mass",
        "related": "http://dbpedia.org/resource/Kilogram",
        "narrower": [
            "https://vocab.sentier.dev/qudt/unit/AMU",
            "https://vocab.sentier.dev/qudt/unit/CARAT",
            "https://vocab.sentier.dev/qudt/unit/CWT_LONG",
            "https://vocab.sentier.dev/qudt/unit/CWT_SHORT",
            "https://vocab.sentier.dev/qudt/unit/CentiGM",
            "https://vocab.sentier.dev/qudt/unit/DRAM_UK",
            "https://vocab.sentier.dev/qudt/unit/DRAM_US",
            "https://vocab.sentier.dev/qudt/unit/DWT",
            "https://vocab.sentier.dev/qudt/unit/DecaGM",
            "https://vocab.sentier.dev/qudt/unit/DeciGM",
            "https://vocab.sentier.dev/qudt/unit/DeciTONNE",
            "https://vocab.sentier.dev/qudt/unit/DeciTON_Metric",
            "https://vocab.sentier.dev/qudt/unit/EarthMass",
            "https://vocab.sentier.dev/qudt/unit/FemtoGM",
            "https://vocab.sentier.dev/qudt/unit/GM",
            "https://vocab.sentier.dev/qudt/unit/GRAIN",
            "https://vocab.sentier.dev/qudt/unit/HectoGM",
            "https://vocab.sentier.dev/qudt/unit/Hundredweight_UK",
            "https://vocab.sentier.dev/qudt/unit/Hundredweight_US",
            "https://vocab.sentier.dev/qudt/unit/KiloTONNE",
            "https://vocab.sentier.dev/qudt/unit/KiloTON_Metric",
            "https://vocab.sentier.dev/qudt/unit/LB",
            "https://vocab.sentier.dev/qudt/unit/LB_M",
            "https://vocab.sentier.dev/qudt/unit/LB_T",
            "https://vocab.sentier.dev/qudt/unit/LunarMass",
            "https://vocab.sentier.dev/qudt/unit/MOMME_Pearl",
            "https://vocab.sentier.dev/qudt/unit/MOMME_Textile",
            "https://vocab.sentier.dev/qudt/unit/MegaGM",
            "https://vocab.sentier.dev/qudt/unit/MegaTON",
            "https://vocab.sentier.dev/qudt/unit/MegaTONNE",
            "https://vocab.sentier.dev/qudt/unit/MicroGM",
            "https://vocab.sentier.dev/qudt/unit/MilliGM",
            "https://vocab.sentier.dev/qudt/unit/NanoGM",
            "https://vocab.sentier.dev/qudt/unit/OZ",
            "https://vocab.sentier.dev/qudt/unit/OZ_M",
            "https://vocab.sentier.dev/qudt/unit/OZ_TROY",
            "https://vocab.sentier.dev/qudt/unit/PFUND",
            "https://vocab.sentier.dev/qudt/unit/Pennyweight",
            "https://vocab.sentier.dev/qudt/unit/PicoGM",
            "https://vocab.sentier.dev/qudt/unit/PlanckMass",
            "https://vocab.sentier.dev/qudt/unit/Quarter_UK",
            "https://vocab.sentier.dev/qudt/unit/SLUG",
            "https://vocab.sentier.dev/qudt/unit/SolarMass",
            "https://vocab.sentier.dev/qudt/unit/Stone_UK",
            "https://vocab.sentier.dev/qudt/unit/TON",
            "https://vocab.sentier.dev/qudt/unit/TONNE",
            "https://vocab.sentier.dev/qudt/unit/TON_Assay",
            "https://vocab.sentier.dev/qudt/unit/TON_LONG",
            "https://vocab.sentier.dev/qudt/unit/TON_Metric",
            "https://vocab.sentier.dev/qudt/unit/TON_SHORT",
            "https://vocab.sentier.dev/qudt/unit/TON_UK",
            "https://vocab.sentier.dev/qudt/unit/TON_US",
            "https://vocab.sentier.dev/qudt/unit/U"
        ],
        "definition": "The kilogram or kilogramme (SI symbol: kg), also known as the kilo, is the base unit of mass in the International System of Units and is defined as being equal to the mass of the International Prototype Kilogram (IPK), which is almost exactly equal to the mass of one liter of water. The avoirdupois (or international) pound, used in both the Imperial system and U.S. customary units, is defined as exactly 0.45359237 kg, making one kilogram approximately equal to 2.2046 avoirdupois pounds.",
        "note": "The kilogram or kilogramme (SI symbol: kg), also known as the kilo, is the base unit of mass in the International System of Units and is defined as being equal to the mass of the International Prototype Kilogram (IPK), which is almost exactly equal to the mass of one liter of water. The avoirdupois (or international) pound, used in both the Imperial system and U.S. customary units, is defined as exactly 0.45359237 kg, making one kilogram approximately equal to 2.2046 avoirdupois pounds.",
        "broaderTransitive": [
            "https://vocab.sentier.dev/qudt/quantity-kind/Mass"
        ],
        "narrowerTransitive": [
            "https://vocab.sentier.dev/qudt/unit/AMU",
            "https://vocab.sentier.dev/qudt/unit/CARAT",
            "https://vocab.sentier.dev/qudt/unit/CWT_LONG",
            "https://vocab.sentier.dev/qudt/unit/CWT_SHORT",
            "https://vocab.sentier.dev/qudt/unit/CentiGM",
            "https://vocab.sentier.dev/qudt/unit/DRAM_UK",
            "https://vocab.sentier.dev/qudt/unit/DRAM_US",
            "https://vocab.sentier.dev/qudt/unit/DWT",
            "https://vocab.sentier.dev/qudt/unit/DecaGM",
            "https://vocab.sentier.dev/qudt/unit/DeciGM",
            "https://vocab.sentier.dev/qudt/unit/DeciTONNE",
            "https://vocab.sentier.dev/qudt/unit/DeciTON_Metric",
            "https://vocab.sentier.dev/qudt/unit/EarthMass",
            "https://vocab.sentier.dev/qudt/unit/FemtoGM",
            "https://vocab.sentier.dev/qudt/unit/GM",
            "https://vocab.sentier.dev/qudt/unit/GRAIN",
            "https://vocab.sentier.dev/qudt/unit/HectoGM",
            "https://vocab.sentier.dev/qudt/unit/Hundredweight_UK",
            "https://vocab.sentier.dev/qudt/unit/Hundredweight_US",
            "https://vocab.sentier.dev/qudt/unit/KiloTONNE",
            "https://vocab.sentier.dev/qudt/unit/KiloTON_Metric",
            "https://vocab.sentier.dev/qudt/unit/LB",
            "https://vocab.sentier.dev/qudt/unit/LB_M",
            "https://vocab.sentier.dev/qudt/unit/LB_T",
            "https://vocab.sentier.dev/qudt/unit/LunarMass",
            "https://vocab.sentier.dev/qudt/unit/MOMME_Pearl",
            "https://vocab.sentier.dev/qudt/unit/MOMME_Textile",
            "https://vocab.sentier.dev/qudt/unit/MegaGM",
            "https://vocab.sentier.dev/qudt/unit/MegaTON",
            "https://vocab.sentier.dev/qudt/unit/MegaTONNE",
            "https://vocab.sentier.dev/qudt/unit/MicroGM",
            "https://vocab.sentier.dev/qudt/unit/MilliGM",
            "https://vocab.sentier.dev/qudt/unit/NanoGM",
            "https://vocab.sentier.dev/qudt/unit/OZ",
            "https://vocab.sentier.dev/qudt/unit/OZ_M",
            "https://vocab.sentier.dev/qudt/unit/OZ_TROY",
            "https://vocab.sentier.dev/qudt/unit/PFUND",
            "https://vocab.sentier.dev/qudt/unit/Pennyweight",
            "https://vocab.sentier.dev/qudt/unit/PicoGM",
            "https://vocab.sentier.dev/qudt/unit/PlanckMass",
            "https://vocab.sentier.dev/qudt/unit/Quarter_UK",
            "https://vocab.sentier.dev/qudt/unit/SLUG",
            "https://vocab.sentier.dev/qudt/unit/SolarMass",
            "https://vocab.sentier.dev/qudt/unit/Stone_UK",
            "https://vocab.sentier.dev/qudt/unit/TON",
            "https://vocab.sentier.dev/qudt/unit/TONNE",
            "https://vocab.sentier.dev/qudt/unit/TON_Assay",
            "https://vocab.sentier.dev/qudt/unit/TON_LONG",
            "https://vocab.sentier.dev/qudt/unit/TON_Metric",
            "https://vocab.sentier.dev/qudt/unit/TON_SHORT",
            "https://vocab.sentier.dev/qudt/unit/TON_UK",
            "https://vocab.sentier.dev/qudt/unit/TON_US",
            "https://vocab.sentier.dev/qudt/unit/U"
        ],
        "exactMatch": [
            "http://qudt.org/vocab/unit/KiloGM",
            "https://si-digital-framework.org/SI/units/kilogram",
            "https://vocab.sentier.dev/simapro/unit/kg",
            "https://glossary.ecoinvent.org/ids/487df68b-4994-4027-8fdc-a4dc298257b7"
        ],
        "hasDimensionVector": "http://qudt.org/vocab/dimensionvector/A0E0L0I0M1H0T0D0",
        "informativeReference": "http://en.wikipedia.org/wiki/Kilogram?oldid=493633626",
        "applicableSystem": [
            "http://qudt.org/vocab/sou/SI",
            "http://qudt.org/vocab/sou/CGS",
            "http://qudt.org/vocab/sou/CGS-EMU",
            "http://qudt.org/vocab/sou/CGS-GAUSS"
        ],
        "conversionMultiplier": "1.0",
        "conversionMultiplierSN": "1.0e0",
        "hasQuantityKind": "https://vocab.sentier.dev/qudt/quantity-kind/Mass"
    },
    "https://vocab.sentier.dev/qudt/unit/CARAT": {
        "type": "Concept",
        "prefLabel": "Carat",
        "notation": [
            "0112/2///62720#UAB166",
            "ct",
            "[car_m]",
            "CTM"
        ],
        "inScheme": "https://vocab.sentier.dev/qudt/",
        "broader": "https://vocab.sentier.dev/qudt/unit/KiloGM",
        "related": "http://dbpedia.org/resource/Carat",
        "definition": "The carat is a unit of mass equal to 200 mg and is used for measuring gemstones and pearls. The current definition, sometimes known as the metric carat, was adopted in 1907 at the Fourth General Conference on Weights and Measures, and soon afterward in many countries around the world. The carat is divisible into one hundred points of two milligrams each. Other subdivisions, and slightly different mass values, have been used in the past in different locations. In terms of diamonds, a paragon is a flawless stone of at least 100 carats (20 g). The ANSI X.12 EDI standard abbreviation for the carat is $CD$.",
        "broaderTransitive": [
            "https://vocab.sentier.dev/qudt/unit/KiloGM",
            "https://vocab.sentier.dev/qudt/quantity-kind/Mass"
        ],
        "exactMatch": "http://qudt.org/vocab/unit/CARAT",
        "hasDimensionVector": "http://qudt.org/vocab/dimensionvector/A0E0L0I0M1H0T0D0",
        "informativeReference": "http://en.wikipedia.org/wiki/Carat?oldid=477129057",
        "applicableSystem": "http://qudt.org/vocab/sou/CGS",
        "conversionMultiplier": "0.0002",
        "conversionMultiplierSN": "2.0E-4",
        "hasQuantityKind": "https://vocab.sentier.dev/qudt/quantity-kind/Mass"
    }
}
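
In this response shape a value can be a single string or a list, and each conversionMultiplier is a string. Below is a minimal Python sketch of how a client might normalise values and derive a conversion factor between two units of the same quantity kind. It assumes conversionMultiplier expresses the unit in terms of the reference unit, so the factor is the ratio of the two multipliers, and it ignores units that also need an offset, such as temperature scales. The helper names are hypothetical and the records are shortened copies of the ones above.

```python
from decimal import Decimal


def as_list(value):
    """Normalise a value that may be missing, a single string, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def conversion_factor(units, source_iri, target_iri):
    """Factor f such that (amount in source unit) * f is the amount in target unit."""
    source, target = units[source_iri], units[target_iri]
    if source["hasQuantityKind"] != target["hasQuantityKind"]:
        raise ValueError("units measure different quantity kinds")
    # Decimal parses the strings without binary floating-point rounding.
    return Decimal(source["conversionMultiplier"]) / Decimal(target["conversionMultiplier"])


MASS = "https://vocab.sentier.dev/qudt/quantity-kind/Mass"
KILOGRAM = "https://vocab.sentier.dev/qudt/unit/KiloGM"
CARAT = "https://vocab.sentier.dev/qudt/unit/CARAT"

units = {
    KILOGRAM: {
        "prefLabel": ["kilogram", "Kilogramm"],
        "conversionMultiplier": "1.0",
        "hasQuantityKind": MASS,
    },
    CARAT: {
        "prefLabel": "Carat",
        "conversionMultiplier": "0.0002",
        "hasQuantityKind": MASS,
    },
}

carat_to_kg = conversion_factor(units, CARAT, KILOGRAM)  # numerically 0.0002
kg_to_carat = conversion_factor(units, KILOGRAM, CARAT)  # numerically 5000
carat_labels = as_list(units[CARAT]["prefLabel"])
```

Keeping the multipliers as Decimal rather than float seems worthwhile, given how many digits values such as the AMU multiplier carry.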

jsvgoncalves commented 1 month ago

@janfeitkenhauer https://api.units.sentier.dev/v0_1/docs