Open trugwaldsaenger opened 6 years ago
At yesterday's team meeting we decided to think about how we would actually include this data in our data model, and then ask repository providers to submit their statistics according to the structure we come up with.
Taking and updating the information from the etherpad on useful information to collect:
There are different options for implementing this in our data structure.
1.) Add a statistics object where we put all the count information.
In this comment, I am focussing on 2.) although 1.) might be the better way to go.
For the number of resources provided by a service, we could just add a property resourceCount or similar to a Service.
For counting things we already have in our data (i.e. topics and licenses), one option is to add a level of indirection to about, license statements where we add the count numbers. E.g. for licensing, we could adjust the current licensing information as follows, but we will have to use our own licensing property:
{
  "licensing": [
    {
      "count": "23",
      "license": {
        "image": "https://upload.wikimedia.org/wikipedia/commons/thumb/b/b0/Copyright.svg/197px-Copyright.svg.png",
        "@type": "Concept",
        "name": [
          {
            "@value": "Copyright",
            "@language": "en"
          },
          {
            "@value": "Copyright",
            "@language": "de"
          }
        ],
        "@id": "https://oerworldmap.org/assets/json/licenses.json#copyright"
      }
    },
    {
      "count": "50",
      "license": {
        "image": "http://mirrors.creativecommons.org/presskit/buttons/88x31/png/by-sa.png",
        "@type": "Concept",
        "name": [
          {
            "@value": "Creative Commons Attribution Share-Alike",
            "@language": "en"
          },
          {
            "@value": "Creative Commons Namensnennung, Weitergabe unter gleichen Bedingungen",
            "@language": "de"
          }
        ],
        "@id": "https://oerworldmap.org/assets/json/licenses.json#cc-by-sa"
      }
    }
  ]
}
For subjects/about the solution would be similar to the above.
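To get a feel for how the licensing structure proposed above could be consumed, here is a minimal Python sketch. The helper name `license_counts` is made up for illustration, and the example data is a trimmed copy of the proposal that keeps only the `@id` of each license. It assumes the counts stay strings, as in the proposal, and converts them to integers.

```python
import json

# Trimmed copy of the proposed structure; only "@id" is kept for each license.
service = json.loads("""
{
  "licensing": [
    {"count": "23",
     "license": {"@id": "https://oerworldmap.org/assets/json/licenses.json#copyright"}},
    {"count": "50",
     "license": {"@id": "https://oerworldmap.org/assets/json/licenses.json#cc-by-sa"}}
  ]
}
""")


def license_counts(service):
    """Map each license @id to its count (hypothetical helper).

    The proposal stores counts as strings, so they are converted to int here.
    """
    return {
        entry["license"]["@id"]: int(entry["count"])
        for entry in service.get("licensing", [])
    }


counts = license_counts(service)
total = sum(counts.values())
for license_id, count in counts.items():
    # Print each license with its absolute count and its share of the total.
    print(f"{license_id}: {count} ({count / total:.0%})")
```

Storing the counts as JSON numbers instead of strings would make the `int()` conversion unnecessary.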
As we don't have the other information yet (formats, # of users, modifications), we are totally free on how to proceed.
In this comment, I am focussing on 2.) although 1.) might be the better way to go.
@acka47: How do we find out which way is the better one?
For counting things we already have in our data (i.e. topics and licenses) one option is to add a level of indirection to about, license statements where we add the count numbers.
@acka47 I was just wondering if the approach in https://github.com/hbz/oerworldmap/issues/941#issuecomment-267125691 could also be applied here:
{
  "license": [
    {
      "@type": "Role",
      "roleName": "License usage",
      "count": "23",
      "license": {
        "@id": "https://oerworldmap.org/assets/json/licenses.json#copyright"
      }
    },
    {
      "@type": "Role",
      "roleName": "License usage",
      "count": "50",
      "license": {
        "@id": "https://oerworldmap.org/assets/json/licenses.json#cc-by-sa"
      }
    }
  ]
}
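The two shapes discussed so far carry the same information, so one can be derived mechanically from the other. As a rough sketch (the function name `to_role_entries` is hypothetical, and `count` on a Role is our own extension, not something taken from schema.org), the conversion from the earlier `licensing` entries could look like this:

```python
def to_role_entries(licensing):
    """Convert {"count", "license"} entries into Role-style entries.

    Only the license "@id" is carried over; "count" is a custom property
    added for illustration.
    """
    return [
        {
            "@type": "Role",
            "roleName": "License usage",
            "count": entry["count"],
            "license": {"@id": entry["license"]["@id"]},
        }
        for entry in licensing
    ]


licensing = [
    {"count": "23",
     "license": {"@id": "https://oerworldmap.org/assets/json/licenses.json#copyright"}},
    {"count": "50",
     "license": {"@id": "https://oerworldmap.org/assets/json/licenses.json#cc-by-sa"}},
]

record = {"license": to_role_entries(licensing)}
```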
@literarymachine I would really like it if schema.org supported something like this, but 1.) count is not in schema.org and 2.) Role is not used like this AFAIK. Although its general description covers it ("Represents additional information about a relationship or property."), the examples are all cases where there is a relation between at least one agent (person or organization) and another entity...
We might use http://schema.org/AggregateOffer with offerCount here, though...
There are different options for implementing this in our data structure.
Add a statistics object where we put all the count information.
If we do this, we would have to keep in mind that e.g. licence information now comes from two places: (imported) statistics and manual entries. I think there should only be one source of truth. So if we add the statistics object, I think we should get rid of the license and about fields. Which sort of makes it the same problem: how to model relation counts.
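To illustrate the "one source of truth" idea: if the counts lived only in a statistics object, the plain list of licenses could be derived from it instead of being stored separately. The shape of `statistics` below (a `licenses` mapping from license `@id` to count) is purely an assumption for this sketch, as is the helper name.

```python
def licenses_from_statistics(statistics):
    """Derive license references from a hypothetical statistics object.

    Only licenses that occur at least once are listed.
    """
    return [
        {"@id": license_id}
        for license_id, count in statistics.get("licenses", {}).items()
        if int(count) > 0
    ]


# Assumed shape; nothing like this exists in the data model yet.
statistics = {
    "licenses": {
        "https://oerworldmap.org/assets/json/licenses.json#copyright": 23,
        "https://oerworldmap.org/assets/json/licenses.json#cc-by-sa": 50,
        "https://oerworldmap.org/assets/json/licenses.json#cc-by": 0,
    }
}

derived = licenses_from_statistics(statistics)
```

Here `derived` contains the two licenses with a non-zero count, so no manually maintained license field is needed alongside the statistics.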
Related to this issue, I made a proposal for publishing repository information, see the email (German) at https://lists.dnb.de/pipermail/dini-ag-kim-oer/2018-August/000067.html.
Initiated by @philboeselager's participation in the last jointly/edusharing workshop, we discussed the integration of statistics of individual repositories into the world map.
As far as I understood there are two quite different approaches to do this:
1) Display statistical data without importing it into the search index
As soon as data about the size of the repository (How many documents are included?), about the licenses used (How many documents are CC BY licensed? How many use other licenses?) as well as the subjects (How many resources are available in the field of medicine?) is provided by the API of a repository, it should be quite easy to display it in the form of diagrams within a service profile. Anne Zobel provided a screenshot which shows what this could look like:
I think that integrating data like this provides a certain value:
I like the way it is implemented in the screenshot (showing only one statistic, which can be switched), and I think it should fit well into the right column of the new full-page profile layout.
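As a rough sketch of the "display solution": once a repository delivers its counts, turning them into chart data is mostly a matter of computing shares. The payload shape, the labels and the function name `chart_slices` below are assumptions for illustration; a real repository API may look quite different.

```python
def chart_slices(counts):
    """Turn {label: count} into chart slices, largest first, with shares in percent."""
    total = sum(counts.values())
    if total == 0:
        return []
    return [
        {"label": label, "count": count, "share": round(100 * count / total, 1)}
        for label, count in sorted(
            counts.items(), key=lambda item: item[1], reverse=True
        )
    ]


# Hypothetical payload with one entry per switchable statistic.
repository_stats = {
    "license": {"CC BY": 120, "CC BY-SA": 50, "Copyright": 30},
    "subject": {"Medicine": 80, "Mathematics": 120},
}

# Only one statistic is shown at a time; switching means picking another key.
selected = "license"
slices = chart_slices(repository_stats[selected])
```

Nothing is written to the search index here; the data is only transformed for display.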
2) Import statistical data into OER World Map index
The second solution would be more complex. The idea here would be to import the data and include it in our data model. This should allow us to do things like the following:
We might even be able to aggregate this data in the future and show things like "number of OERs in Germany", "increase of OER production in India last year" etc. For sure this is still a long way...
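A sketch of what such an aggregation could look like once the numbers are imported. It assumes each service carries a `resourceCount` (as proposed earlier in this issue) and a `countryCode`; the latter field name and the example services are made up for illustration.

```python
from collections import defaultdict


def resources_per_country(services):
    """Sum resourceCount per country, skipping services without a count or country."""
    totals = defaultdict(int)
    for service in services:
        count = service.get("resourceCount")
        country = service.get("countryCode")
        if count is None or country is None:
            continue
        # Accept both numbers and numeric strings.
        totals[country] += int(count)
    return dict(totals)


# Invented example data.
services = [
    {"name": "Repository A", "countryCode": "DE", "resourceCount": 1200},
    {"name": "Repository B", "countryCode": "DE", "resourceCount": "300"},
    {"name": "Repository C", "countryCode": "IN", "resourceCount": 4500},
    {"name": "Repository D", "countryCode": "IN"},  # no statistics submitted
]

totals = resources_per_country(services)
```

Note that a plain sum would count a resource twice if it is listed in more than one repository, so the result is an upper bound rather than an exact number.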
If we had a standard for this data, the future could ideally look like this: a new repository connects with the world map via an "automatic handshake" and then imports all relevant data automatically to the map.
The "import solution" for sure is more difficult to implement. Challenges are:
So how to go on?
If it is true that solution 1 ("display solution") is easy to implement, I think we should include it as an example for edusharing and maybe some other repositories. Additionally, we should analyse solution 2 ("import solution") in more depth, so that we get a better view of the challenges.
@acka47 @literarymachine : What is your opinion?