Open Mec-iS opened 2 years ago
Wonderful! This is super helpful. The nearest neighbor parts would have some immediate use cases.
BTW, there's already the SubgraphMatrix class in subg.py, which handles the transform/inverse_transform from an RDF graph to:
We probably want some methods that return numpy.array; I will reuse what is already there for sure.
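To make the transform/inverse_transform idea concrete, here is a minimal sketch of a label-to-index mapping. `NodeIndex` and the sample triples are hypothetical names made up for illustration; this is not the actual SubgraphMatrix API from subg.py.

```python
from typing import Hashable, Iterable


class NodeIndex:
    """Hypothetical helper: maps node labels to integer ids and back."""

    def __init__(self, labels: Iterable[Hashable]) -> None:
        # dict.fromkeys drops duplicates while keeping first-seen order
        self._labels = list(dict.fromkeys(labels))
        self._ids = {label: i for i, label in enumerate(self._labels)}

    def transform(self, label: Hashable) -> int:
        """Node label -> integer id (usable as a matrix row/column)."""
        return self._ids[label]

    def inverse_transform(self, idx: int) -> Hashable:
        """Integer id -> original node label."""
        return self._labels[idx]

    def __len__(self) -> int:
        return len(self._labels)


# Toy (subject, predicate, object) triples standing in for an RDF graph
triples = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:knows", "ex:carol"),
]

index = NodeIndex(node for s, _, o in triples for node in (s, o))

# Edges expressed as integer pairs, ready to fill a numpy array
edges = [(index.transform(s), index.transform(o)) for s, _, o in triples]
```

Keeping the mapping in one object means any matrix result can be translated back to the original node labels with `inverse_transform`.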
@SultanOrazbayev mentioned the importance of having a descriptive summary of general metrics about a graph, something like pandas' DataFrame.describe(). These are the metrics that could be useful in a hypothetical SubgraphMatrix.describe():
Agreed, sometimes it's hard to actually understand what kind of graph you're using.
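As a rough illustration of what a describe()-style summary could return, here is a sketch over plain (subject, predicate, object) tuples. The function name, the choice of metrics, and the sample data are all assumptions for discussion, not the list proposed in this thread and not an existing kglab method.

```python
from collections import Counter


def describe(triples):
    """Return a dict of simple summary metrics for (s, p, o) triples."""
    nodes = {s for s, _, _ in triples} | {o for _, _, o in triples}
    predicates = Counter(p for _, p, _ in triples)

    # Degree counts both incoming and outgoing edges
    degree = Counter()
    for s, _, o in triples:
        degree[s] += 1
        degree[o] += 1

    n = len(nodes)
    # Distinct directed subject -> object pairs, ignoring self-loops
    pairs = {(s, o) for s, _, o in triples if s != o}

    return {
        "triples": len(triples),
        "nodes": n,
        "predicates": len(predicates),
        "most_common_predicate": (
            predicates.most_common(1)[0][0] if predicates else None
        ),
        "mean_degree": sum(degree.values()) / n if n else 0.0,
        "max_degree": max(degree.values(), default=0),
        # Share of possible directed node pairs that are connected
        "density": len(pairs) / (n * (n - 1)) if n > 1 else 0.0,
    }


sample = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "foaf:knows", "ex:carol"),
    ("ex:alice", "ex:worksAt", "ex:acme"),
]
summary = describe(sample)
```

Returning a plain dict keeps the sketch dependency-free; a real implementation might return a pandas Series instead, to mirror DataFrame.describe().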
One of the integrations we are going to work on is the one with scikit-learn.
This conversation is to collect requirements and features to implement for calling scikit-learn through the kglab abstraction layer.
My point of view, after taking a look at the APIs provided by popular data science libraries, is that these are the interesting scikit-learn and scipy functionalities that we could start with:
Conversion from KnowledgeGraph data structures to an observations matrix (to be defined), an adjacency matrix, and a condensed distance matrix as defined by scipy. This will allow building up further flows (or "pipelines", chains of function calls) that users can assemble to go from a KnowledgeGraph representation to a graph algebra representation. This is critical, as we need either to pick first principles or to provide different alternatives according to the type of graph and the tasks users may want to accomplish.
Other possible examples:
These are in no particular order for now; it will take some time to figure out which principles to import from scikit-learn and scipy in order to build proper user flows between knowledge graphs as represented in RDF/kglab and graph algebra representations.
Please provide feedback and suggestions. I will create a GitHub project around this effort.
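The conversion step described above (knowledge graph to adjacency matrix to condensed distance matrix) could look roughly like the sketch below. It assumes toy triples in place of a real KnowledgeGraph, ignores predicates and edge direction, and picks Jaccard distance between adjacency rows as one possible choice of metric. None of this is kglab's API; only `numpy`, `scipy.spatial.distance.pdist` and `squareform` are real library calls.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Toy (subject, predicate, object) triples standing in for a KnowledgeGraph
triples = [
    ("alice", "knows", "bob"),
    ("alice", "knows", "carol"),
    ("bob", "knows", "carol"),
    ("dave", "knows", "carol"),
]

# Stable integer id per node (sorted so the ordering is reproducible)
nodes = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
index = {label: i for i, label in enumerate(nodes)}

# Symmetric boolean adjacency matrix: direction and predicate are dropped
adjacency = np.zeros((len(nodes), len(nodes)), dtype=bool)
for s, _, o in triples:
    adjacency[index[s], index[o]] = True
    adjacency[index[o], index[s]] = True

# Treat each row (a node's neighbourhood) as one observation and compute
# pairwise Jaccard distances; pdist returns the condensed form, i.e. the
# upper triangle flattened into a vector of length n * (n - 1) / 2
condensed = pdist(adjacency, metric="jaccard")

# Expand to a square n x n matrix when that is easier to inspect
square = squareform(condensed)
```

The condensed vector is the input format expected by, for example, `scipy.cluster.hierarchy.linkage`, which is one way such a matrix could feed into later pipeline steps.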
cc: @tomaarsen @SultanOrazbayev