Closed xrotwang closed 6 months ago
One of the questions to answer here is whether the genericity of the proposal here over e.g. a more targeted "ColexificationTable" is useful. Basically: Is it more likely people end up shoehorning other networks into a ColexificationTable or more likely that colexifications in a ParameterNetworkTable are not recognizable enough?
There are other projects on co-expression, and they would probably be better placed into a parameternetworktable. So if propagated well enough, this should work perfectly with the proposed level of abstraction.
We might call it "ParameterGraph", though?
Not sure. Graph seems to be used more for the "pure" data structure while network denotes the "object to be analysed"? If so, I'd prefer network because CLDF is all about "connecting data with analysis".
Great idea. Lots of data could be handled this way.
I think generic is better than specific, as the specifics can be dealt with at the application level.
A little bit worried about saving in a particular file format, but we do this for trees anyway (nexus/newick). However, could we just store this type of data as an extended ValueTable, e.g.:
node1,node2,attribute,value
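To make the suggestion above concrete, here is a minimal Python sketch of reading such a long-format edge list. Only the four column names come from the header line; the sample rows (concept labels, attribute names, values) and the helper name `read_edges` are made up for illustration and are not taken from any dataset or from CLDF itself.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows in the node1,node2,attribute,value layout.
SAMPLE = """\
node1,node2,attribute,value
HAND,ARM,weight,12
HAND,ARM,directed,false
TREE,WOOD,weight,30
"""


def read_edges(text):
    """Collect the attributes of each (node1, node2) pair from a long-format table."""
    edges = defaultdict(dict)
    for row in csv.DictReader(io.StringIO(text)):
        edges[(row["node1"], row["node2"])][row["attribute"]] = row["value"]
    return dict(edges)


edges = read_edges(SAMPLE)
```

One consequence of this layout is that every edge attribute becomes its own row, so values arrive as untyped strings and an application would have to convert them itself.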
In fact, that is the idea, @SimonGreenhill, if you compare what I did for CLICS4: https://github.com/clics/clics4, just imagine that we make a proper parametergraph out of the current structuredataset. GML is just an addon, no requirement.
See related discussion here: https://github.com/concepticon/concepticon-data/issues/1338
In terms of naming, I'd stick with ParameterNetwork, because Graph might sound too exclusive, given that we also accept directed graphs, mixed graphs, etc.
In the Concepticon use case, we'd also have multiple networks in the ParameterNetwork component: Different ones for each Concepticon concept relation, some derived from networks in concept lists. These would probably be disambiguated by a contributionReference.
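A rough sketch of how several networks could share one edge table and be told apart by a contribution reference might look like this in Python. All column names (`source`, `target`, `directed`, `contribution`), the relation labels, and the helper name `networks_by_contribution` are assumptions for illustration only, not a settled CLDF specification.

```python
from collections import defaultdict

# Hypothetical edge rows; column names and IDs are illustrative assumptions.
ROWS = [
    {"source": "HAND", "target": "ARM", "directed": False, "contribution": "colexification"},
    {"source": "FINGER", "target": "HAND", "directed": True, "contribution": "partof"},
    {"source": "TREE", "target": "WOOD", "directed": False, "contribution": "colexification"},
]


def networks_by_contribution(rows):
    """Split one edge table into adjacency maps, one per contribution."""
    networks = defaultdict(lambda: defaultdict(set))
    for row in rows:
        graph = networks[row["contribution"]]
        graph[row["source"]].add(row["target"])
        if not row["directed"]:
            # Undirected edges are stored in both directions.
            graph[row["target"]].add(row["source"])
    return {name: dict(graph) for name, graph in networks.items()}


networks = networks_by_contribution(ROWS)
```

Keeping the directed flag per edge rather than per network means a single contribution can hold a mixed graph.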
Yes, that seems important.
While languages are often arranged in trees (a datatype with Newick as quasi-default format), parameters (in particular the concepts of Wordlists) are often arranged in (semantic) networks for further analysis (e.g. in CLICS). For StructureDatasets, such a network or graph could be useful to describe dependencies (or hierarchies) among features.
Thus, CLDF could add a ParameterNetwork component, basically a table listing edges with
- two parameter reference columns source and target, specifying the nodes
- directed
[...] Media_Type of text/xml, and the namespace declaration in the XML, whereas e.g. DOT files do not have a clear cut discovery mechanism (see https://forum.graphviz.org/t/mime-type-for-downloading-dot-files/877))
Examples: