schemaorg / suggestions-questions-brainstorming

Suggestions, questions, and brainstorming

size characteristics of a Dataset #160

Open VladimirAlexiev opened 6 years ago

VladimirAlexiev commented 6 years ago

schemaorg/schemaorg#1083 and schemaorg/schemaorg#1471 have http://pending.schema.org/variableMeasured, which describes which measurements/observations are included in a dataset.

(Shameless cc to all people in those discussions: @danbri @darobin @natashafn @akuckartz @joshsh @Aaranged @ypriverol @agbeltran @dr-shorthair @ldodds @rob-metalinkage @KerryLea; and @RichardWallis)

A question about variableMeasured: since a dataset would have many observations, can you confirm that the PropertyValue it points to wouldn't have a value of its own, and could only have minValue and maxValue describing the range of the included observations?
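To make the question concrete, here is a minimal sketch of the markup being asked about, assuming the range-only reading is the intended one. The helper name `dataset_with_range` and all sample values are hypothetical; `variableMeasured`, `PropertyValue`, `minValue`, `maxValue`, and `unitText` are existing schema.org terms.

```python
import json


def dataset_with_range(name, variable, min_value, max_value, unit_text=None):
    """Build JSON-LD for a Dataset whose variableMeasured carries only a range.

    The PropertyValue deliberately has no "value" key: it describes the span
    of the observations in the dataset, not any single observation.
    """
    measured = {
        "@type": "PropertyValue",
        "name": variable,
        "minValue": min_value,
        "maxValue": max_value,
    }
    if unit_text is not None:
        measured["unitText"] = unit_text
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "variableMeasured": measured,
    }


# Illustrative values only; they do not come from a real dataset.
doc = dataset_with_range(
    "Example temperature readings", "air temperature", -12.5, 38.0, "degree Celsius"
)
print(json.dumps(doc, indent=2))
```

Whether omitting `value` is the recommended pattern is exactly the open question here; the sketch only shows what that pattern would look like.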

Now for the main question: what properties do we have to describe the number of observations or other size characteristics of a dataset itself? Here are some cases:

I realize schema:Dataset is about any dataset, not just RDF. But void:entities pertains to any dataset, and the ability to describe in a structured way the characteristics of the things inside a dataset (VOID's partition subsets) is very powerful.
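As a rough illustration of the VOID pattern referred to above, the following sketch builds a JSON-LD description with an overall entity count and one partition per class. The function name `void_description`, the example IRI, and the counts are made up for illustration; `void:Dataset`, `void:entities`, `void:triples`, `void:classPartition`, and `void:class` are terms from the VOID vocabulary.

```python
import json

VOID_NS = "http://rdfs.org/ns/void#"


def void_description(dataset_iri, entities, triples, class_counts):
    """Describe dataset size with VOID: totals plus per-class partitions."""
    return {
        "@context": {"void": VOID_NS},
        "@id": dataset_iri,
        "@type": "void:Dataset",
        "void:entities": entities,
        "void:triples": triples,
        # One partition per class, each with its own entity count.
        "void:classPartition": [
            {"void:class": {"@id": cls}, "void:entities": count}
            for cls, count in sorted(class_counts.items())
        ],
    }


# Hypothetical numbers, not taken from any real dataset.
desc = void_description(
    "http://example.org/dataset",
    entities=1500,
    triples=20000,
    class_counts={
        "http://schema.org/Person": 1000,
        "http://schema.org/Organization": 500,
    },
)
print(json.dumps(desc, indent=2))
```

The partition list is the part that has no obvious counterpart on schema:Dataset today.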

How could we include that in Schema?

VladimirAlexiev commented 6 years ago

StatDCAT-AP has some props for describing what is inside. I don't know much about it, but found them in a mapping to Schema proposed by the EC:

stat:attribute, dimension, numberOfDataSeries, unitOfMeasurement

chrisgorgo commented 5 years ago

I'm also interested in expressing the "number of observations" or "sample size" in schema.org. Users looking for data are interested in this piece of metadata.

What counts as a sample or an observation differs from one dataset to another, so I would suggest disentangling the two: allow the number of observations and the definition of an observation to be specified separately.

RichardWallis commented 4 years ago

See issue #7 for the context of the move from the main Schema.org issue tracker to this repository.