SoftwareUnderstanding / software_types

Schema.org profile for software types
Apache License 2.0

How do you use this extension in html? #21

Open woutware opened 1 year ago

woutware commented 1 year ago

I'm new to schema.org, and I understand the basics and how to use the standard schema.org types. But how do you use this extension in html in a web site?

For a schema.org type like SoftwareApplication, you would refer to it as https://schema.org/SoftwareApplication:

<div itemscope itemtype="https://schema.org/SoftwareApplication">

So is it correct if I refer to software_type's SoftwareLibrary like this: https://w3id.org/software-types/SoftwareLibrary?

<div itemscope itemtype="https://w3id.org/software-types/SoftwareLibrary">

Googling around I also see https://w3id.org/software-types#SoftwareLibrary being used, so using the # character instead of the / character.

I've googled for a few hours for a tutorial on schema extensions, but I get lost in the w3c forest of murky documentation.

dgarijo commented 1 year ago

Hello, Thanks for your question.

The right link is the second one, with the hash character: https://w3id.org/software-types#SoftwareLibrary

You can see this in the .jsonld file in the root of the repository, or in the Turtle file. Maybe we should clarify this further in the README.

As for the intended usage, I believe you can use it in a div describing a software library. I normally use them to describe the whole page, but that is up to you.

woutware commented 1 year ago

Omg, you're here!!! Man, I'm dying here trying to understand how the schema.org extension mechanism works.

So suppose I want to do HTML + microdata (yes, I know I should probably use a format that isn't outdated, but I'm trying to gain understanding, so bear with me!), would this be the way to do it?

<div itemscope itemtype="https://w3id.org/software-types#SoftwareLibrary">
    <div itemprop="name">Name of my software library</div>
    <div itemprop="executableName">mysoftwarelibrary.dll</div>
</div>

So the # comes from this line in software-types.jsonld?

"stype": "https://w3id.org/software-types#"

And my final question: I imagine that all kinds of tools are able to verify that my microdata/JSON-LD/RDFa is correct. At least for schema.org I can kind of understand how this works, because you can navigate to schema.org/SoftwareApplication, for example, and check. But how does this work for this extension? I can't navigate to https://w3id.org/software-types#SoftwareLibrary and just see the definition of this type.

Edit: maybe I should do RDFa, as it's more modern?

<div vocab="https://w3id.org/software-types" typeof="SoftwareLibrary">
    <div property="name">Name of my software library</div>
    <div property="executableName">mysoftwarelibrary.dll</div>
</div>

By the way, when I google on these matters, I find nobody else doing this. I can't be the only one in the world trying to describe his software library in html? This all seems so obscure.

dgarijo commented 1 year ago

Hello, to answer your first question, I am not super familiar with microdata, but from my RDFa experience, your example looks good to me. What I usually do is insert the JSON-LD directly into the HTML. Below is an example for a website describing an ontology:

<script type="application/ld+json">
{
  "@context": "https://schema.org/",
  "@type": "TechArticle",
  "url": "https://w3id.org/ecfo#",
  "image": "http://vowl.visualdataweb.org/webvowl/#iri=https://w3id.org/ecfo#",
  "name": "The Emission Conversion Factor Ontology",
  "headline": "With the Net Zero agenda gaining significant traction across the world, organisations are often required to report carbon emissions associated with their operation. However, calculating emissions is not a trivial task and reported scores can differ depending on the choices made by those performing the calculations or the software used to assist with this task. The aim of this ontology is to formalise emission conversion factors, in order to make the process of emissions calculations more transparent.",
  "datePublished": "May 8th, 2023",
  "version": "1.0.0",
  "license": "https://creativecommons.org/licenses/by/4.0/"
}
</script>

If you replace the context with the stypes namespace, you can add all your software descriptions there.

You can validate your JSON-LD using https://json-ld.org/playground/
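As a minimal sketch of that approach, the snippet below builds a software description as a dictionary and wraps it in a JSON-LD script element. The helper name `jsonld_script_tag` is hypothetical, the two-entry context is the one that appears elsewhere in this thread, and the property values are placeholders taken from the question above.

```python
import json


def jsonld_script_tag(description: dict) -> str:
    """Serialize a JSON-LD description into an HTML script element."""
    # Escape "</" so that a string value cannot close the script element early.
    payload = json.dumps(description, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Placeholder description; the context list is an assumption based on this thread.
library = {
    "@context": ["http://schema.org", "https://w3id.org/software-types"],
    "@type": "SoftwareLibrary",
    "name": "Name of my software library",
    "executableName": "mysoftwarelibrary.dll",
}

print(jsonld_script_tag(library))
```

The printed element can be pasted into the head or body of a page; the JSON inside it is what the playground would validate.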

The # comes from the line you have identified, correct.
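To make the role of that prefix line concrete, here is a small sketch of how a compact name such as `stype:SoftwareLibrary` turns into the full IRI. The `expand` function is a hypothetical helper that mimics plain prefix substitution; it is not the full JSON-LD expansion algorithm.

```python
# Prefix mapping copied from the "stype" line quoted in this thread.
CONTEXT = {"stype": "https://w3id.org/software-types#"}


def expand(compact_iri: str, context: dict) -> str:
    """Expand 'prefix:Term' by replacing the prefix with its namespace IRI."""
    prefix, sep, term = compact_iri.partition(":")
    if sep and prefix in context:
        return context[prefix] + term
    # Unknown prefix (for example "https") or no prefix: leave unchanged.
    return compact_iri


print(expand("stype:SoftwareLibrary", CONTEXT))
# https://w3id.org/software-types#SoftwareLibrary
```

Because the namespace ends in `#`, the term name is simply appended after the hash, which is why the hash form is the identifier of the type.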

To answer your final question, we were a bit lazy with the documentation (terms don't have their own page), but resolving https://w3id.org/software-types#SoftwareLibrary takes you to the README with the definition of SoftwareLibrary, does it not? Is there something you find confusing there? We can create an anchor for the term, but the definition is there.

The idea for stypes is that at some point we can merge it into Codemeta.

woutware commented 1 year ago

Hi again,

Thank you for helping me out, really appreciate it!

Ok, I will make a real-world example in JSON-LD to clarify what I am trying to do. Here is how I would like to define my software library (loosely following the example on https://softwareunderstanding.github.io/software_types/release/1.0.0/, but without the SoftwareSourceCode type, as I don't think it applies):

{"@context": [
    "http://schema.org",
    "https://w3id.org/software-types"
  ],
  "@type": "SoftwareLibrary",
  "name": "CadLib",
  "description": "CadLib is a 100% .NET library to Read, write, view DWG/DXF files in C# VB .NET and convert to XAML, Pdf, PS, jpg, png.",
  "downloadUrl": "https://www.woutware.com/download/CadLib4.0TrialSetup.exe",
  "executableName": "CadLib4.0TrialSetup.exe"
}

When I paste this into https://json-ld.org/playground/, the tool does not recognize the https://w3id.org/software-types URL, although it does recognize the schema.org URL. This is the error that is displayed:

jsonld.InvalidUrl: Dereferencing a URL did not result in a valid JSON-LD object. Possible causes are an inaccessible URL perhaps due to a same-origin policy (ensure the server uses CORS if you are using client-side JavaScript), too many redirects, a non-JSON response, or more than one HTTP Link Header was provided for a remote context.

This is what my previous questions were hinting at: how does the tooling recognize/validate that the JSON-LD/RDFa is correct or incorrect? For the tooling to work, it has to get access to https://github.com/SoftwareUnderstanding/software_types/blob/main/software-types.jsonld, but how does the tool know it's there from the URL https://w3id.org/software-types? The https://json-ld.org/playground/ tool does not seem to pick up the types defined by https://w3id.org/software-types. I do get the JSON-LD definitions if I enter the following command:

curl -sH "accept:application/ld+json" -L https://w3id.org/software-types

So it is available, but the https://json-ld.org/playground/ tool does not know that it can access it like that?

EDIT: I hacked the JSON-LD to pass the validation check on https://json-ld.org/playground/ by manually changing https://w3id.org/software-types into https://softwareunderstanding.github.io/software_types/release/1.0.0/software-types.jsonld, as it seems to need to be able to download the jsonld file:

{"@context": [
    "http://schema.org",
    "https://softwareunderstanding.github.io/software_types/release/1.0.0/software-types.jsonld"
  ],
  "@type": "SoftwareLibrary",
  "description": "CadLib is a 100% .NET library to Read, write, view DWG/DXF files in C# VB .NET and convert to XAML, Pdf, PS, jpg, png.",
  "name": "",
  "downloadUrl": "https://www.woutware.com/download/CadLib4.0TrialSetup.exe",
  "executableName": "CadLib4.0TrialSetup.exe"
}

How does it know how to get the .jsonld file from http://schema.org, though? It seems there is something going on behind the scenes.

dgarijo commented 1 year ago

@woutware, the application does it through content negotiation. I changed the content negotiation rule so the JSON-LD serialization is served to browsers too. Originally it was set up to work with a curl command like the one you used, but a request from a browser returned HTML. In any case, now it works. I pasted:

{"@context": [
    "http://schema.org",
    "https://w3id.org/software-types"
  ],
  "@type": "SoftwareLibrary",
  "name": "CadLib",
  "description": "CadLib is a 100% .NET library to Read, write, view DWG/DXF files in C# VB .NET and convert to XAML, Pdf, PS, jpg, png.",
  "downloadUrl": "https://www.woutware.com/download/CadLib4.0TrialSetup.exe",
  "executableName": "CadLib4.0TrialSetup.exe"
}

And I get the desired descriptions. Now you can play around with it.
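Content negotiation here means that the client states which serialization it wants in the `Accept` header, and the server picks a response accordingly. A minimal sketch of the request side, equivalent in spirit to the curl command earlier in the thread, is shown below. The helper name `jsonld_request` is hypothetical, and what the server returns for other `Accept` values is not assumed here.

```python
import urllib.request


def jsonld_request(url: str) -> urllib.request.Request:
    """Build a GET request that asks the server for JSON-LD."""
    return urllib.request.Request(url, headers={"Accept": "application/ld+json"})


req = jsonld_request("https://w3id.org/software-types")
print(req.get_method(), req.full_url, req.get_header("Accept"))

# Sending the request needs network access; urlopen follows redirects,
# much like the -L flag of curl:
# with urllib.request.urlopen(req) as resp:
#     print(resp.headers.get("Content-Type"))
```

The network call is left commented out so the sketch runs offline; uncommenting it should show which content type the server chose.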

woutware commented 1 year ago

@dgarijo Nice, I'm slowly getting an understanding of how this all works, thank you very much for fixing the problem.

And if you want to do the same thing in RDFa, suppose I have this snippet:

<div vocab="https://w3id.org/software-types/" typeof="SoftwareLibrary">
    <!-- this property comes from schema.org. -->
    <meta  property="name" content="CadLib for .NET Framework 4.x" />

    <!-- this property comes from https://w3id.org/software-types. -->
    <meta  property="executableName" content="CadLib4.0TrialSetup.exe" />
</div>

and I run it through https://validator.schema.org/, it looks like it interprets all properties as part of https://w3id.org/software-types, and for the name property this is not correct, because it comes from schema.org. So is the validator not smart enough to get the type definitions from https://w3id.org/software-types and figure out the name property comes from the parent class? Or do you have to explicitly specify in RDFa which properties come from which type?

EDIT 1: I discovered that you can specify multiple types in typeof, see here: https://www.w3.org/community/schemabibex/wiki/Using_Multiple_Types. The only trouble is that in this case the types are in 2 vocabularies. So I tried the following:

<div vocab="http://schema.org/" typeof="SoftwareApplication https://w3id.org/software-types#SoftwareLibrary">
    <meta  property="name" content="CadLib for .NET Framework 4.x" />
    <meta  property="https://w3id.org/software-types/executableName" content="CadLib4.0TrialSetup.exe" />
</div>

And this validates OK in https://validator.schema.org. However, the validator does not actually retrieve the type definitions, because I can just replace https://w3id.org/software-types by a fake url, and the validator will also say it's OK:

<div vocab="http://schema.org/" typeof="SoftwareApplication https://fakeurl.org#SoftwareLibrary">
    <meta  property="name" content="CadLib for .NET Framework 4.x" />
    <meta  property="https://fakeurl.org/executableName" content="CadLib4.0TrialSetup.exe" />
</div>

So I can basically write anything. This makes me wonder: do any tools actually pick up these RDFa definitions, like the Google search bot?
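The behaviour seen in these experiments is consistent with a simple rule: a bare term is appended to the current `vocab`, while a full IRI is used as written, and neither step requires fetching the vocabulary. The sketch below is a deliberate simplification under that assumption; `resolve` is a hypothetical helper and it ignores prefixes and CURIEs, which real RDFa processors also handle.

```python
def resolve(term: str, vocab: str) -> str:
    """Simplified RDFa rule: full IRIs pass through, bare terms get vocab prepended."""
    if "://" in term:
        return term
    return vocab + term


# With the software-types vocab, every bare term lands in that namespace,
# including "name", even though "name" is a schema.org property.
print(resolve("name", "https://w3id.org/software-types/"))

# With the schema.org vocab, bare terms are schema.org terms and
# extension terms have to be written as full IRIs.
print(resolve("name", "http://schema.org/"))
print(resolve("https://w3id.org/software-types#SoftwareLibrary", "http://schema.org/"))
```

Under this rule nothing is looked up, so a made-up namespace resolves just as cleanly as a real one, and the separator (`#` or `/`) is whatever the author typed.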

dgarijo commented 1 year ago

Hello, the Google Search bot will likely pick only Schema.org terms, because that's what they are interested in (hence the importance of merging this proposal in codemeta and later in Schema.org). For now you can use it to expose the information for your own purposes, which is what we do in our implementations.

Apache Any23 (Anything To Triples) is a tool that can be used to extract RDFa and microdata annotations from pages.
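To give a rough idea of what such an extractor looks for, the sketch below collects `typeof` values and `property`/`content` pairs from an RDFa snippet using only the Python standard library. It is an illustration of the idea, not Any23's API and not a complete RDFa parser; the class name `RdfaSketch` is hypothetical and the markup is the multi-type example from earlier in the thread.

```python
from html.parser import HTMLParser


class RdfaSketch(HTMLParser):
    """Collect typeof values and (property, content) pairs from start tags."""

    def __init__(self):
        super().__init__()
        self.types = []
        self.properties = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # typeof may list several space-separated types.
        self.types.extend((attributes.get("typeof") or "").split())
        if "property" in attributes:
            self.properties.append((attributes["property"], attributes.get("content")))


snippet = """
<div vocab="http://schema.org/" typeof="SoftwareApplication https://w3id.org/software-types#SoftwareLibrary">
    <meta property="name" content="CadLib for .NET Framework 4.x" />
    <meta property="https://w3id.org/software-types/executableName" content="CadLib4.0TrialSetup.exe" />
</div>
"""

parser = RdfaSketch()
parser.feed(snippet)
print(parser.types)
print(parser.properties)
```

A real extractor would additionally resolve each term against `vocab` and emit triples; this sketch stops at listing the raw annotations.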

woutware commented 1 year ago

Hi @dgarijo,

Thank you for elaborating, that's what I figured. The Any23 tool looks super useful, thank you for this recommendation.

I will use your schema to encourage use of it, looks like a great initiative.

I also have a few ideas for improvement that you can consider. Another very popular software type is "plugins", it could potentially be quite powerful to define plugin X for target application Y. That way you could easily query the world for plugins that are compatible with application Y.

And another idea is to maybe have some sort of "softwareplatform" property and "softwareplatformversion". Typical example of this is the java platform, the .NET platform, or javascript. You could also name it "ecosystem". Many software engineers live in 1 main ecosystem (like my personal ecosystem is .NET), so this would also be a very powerful way to cut an information slice out of the information cube.

Again thank you for all your help!

dgarijo commented 1 year ago

I like plugin, we can include it in the types @proycon

Platform is a little more generic. Isn't that the programming language, like .NET or JavaScript? That is already captured by schema.org and may not be needed.

Thanks for your input!

woutware commented 1 year ago

It's similar to programming language, but .NET is not a language. For example, C# is a language that compiles to .NET intermediate language, and VB.NET is another language that compiles to .NET. So a .NET programmer can use components from other vendors that also target .NET; it's a sort of walled garden.

There are also things like WebAssembly, which is also a type of intermediate language, and many languages will likely be able to compile to it.

The common factor is that these have an execution engine that interprets the intermediate language (Java also has its own execution engine.). It's like x86 machine code, but on a higher layer above the hardware machine code, so it runs on multiple machine code architectures (x86, x64, ARM, etc).

So I guess the distinction would be the type of binary code:

So a programmer is typically working in one of these columns, and can only use software that was built for the same cpu/engine. And as such it's handy to be able to search for all available software that is compatible with .NET or java for example.

Hope that makes sense!

proycon commented 1 year ago

On Fri Aug 25, 2023 at 2:57 PM CEST, woutware wrote:

And another idea is to maybe have some sort of "softwareplatform" property and "softwareplatformversion". Typical example of this is the java platform, the .NET platform, or javascript. You could also name it "ecosystem". Many software engineers live in 1 main ecosystem (like my personal ecosystem is .NET), so this would also be a very powerful way to cut an information slice out of the information cube.

This already exists in schema.org (distinct from programmingLanguage): https://schema.org/runtimePlatform
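A small sketch of how the two properties could sit side by side, and how a consumer might slice descriptions by platform, follows. The records and the `on_platform` helper are hypothetical and made up for illustration; attaching `runtimePlatform` to `SoftwareSourceCode` is an assumption here, and whether it is appropriate on other types is not settled in this thread.

```python
# Hypothetical records; names and values are invented for illustration only.
records = [
    {"@type": "SoftwareSourceCode", "name": "LibraryA",
     "programmingLanguage": "C#", "runtimePlatform": ".NET"},
    {"@type": "SoftwareSourceCode", "name": "LibraryB",
     "programmingLanguage": "VB.NET", "runtimePlatform": ".NET"},
    {"@type": "SoftwareSourceCode", "name": "LibraryC",
     "programmingLanguage": "Kotlin", "runtimePlatform": "JVM"},
]


def on_platform(items, platform):
    """Return the names of items whose runtimePlatform matches exactly."""
    return [item["name"] for item in items if item.get("runtimePlatform") == platform]


print(on_platform(records, ".NET"))  # ['LibraryA', 'LibraryB']
```

The two .NET entries differ in programming language but share a platform, which is the distinction discussed above.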