Open ttpro1995 opened 3 years ago
Hey @ttpro1995 - there's a short and a long answer to your question.
Short Answer: Type relations are not defined on the type's inheritance hierarchy (all types inherit from VisionsBaseType); rather, they can be accessed from the .relations property. You'll notice I'm using the term relations rather than children, which leads to...
Long Answer: Only nodes in a typeset have actual children. The relations attribute on a type will return a list of potential parents of that type. Encoding parent relations on types rather than child relations allows us to compose types together to form typesets. The easiest way to see this: the root of a typeset graph is Generic; if Generic tracked its children, then creating a new type like PositiveInteger would counterintuitively require source code changes to Generic, which would effectively produce strong coupling between types.
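To make the coupling argument concrete, here is a toy sketch in plain Python. It is not how visions is implemented: the ToyType class, its parents field, and build_children are hypothetical names invented for illustration. Each type only names its potential parents, and the children mapping is derived when a collection of types is assembled.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyType:
    """Hypothetical stand-in for a type; it only knows its potential parents."""
    name: str
    parents: tuple = ()


def build_children(types):
    """Invert the parent relations of the given types into a children mapping."""
    members = set(types)
    children = defaultdict(set)
    for t in types:
        for parent in t.parents:
            # Only keep relations whose parent is part of this collection.
            if parent in members:
                children[parent].add(t)
    return dict(children)


generic = ToyType("Generic")
integer = ToyType("Integer", parents=(generic,))
# Adding a new type touches neither Generic nor Integer.
positive_integer = ToyType("PositiveInteger", parents=(integer,))

small = build_children([generic, integer])
large = build_children([generic, integer, positive_integer])
```

In this sketch the same Integer object gains a child in large without being modified, because children are a property of the assembled collection rather than of the type itself.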
So, children only really exist on a typeset, but it's pretty easy to get these as well. I'm going to use the StandardTypeset as an example, but the same will work for any typeset you create / use.
Under the hood visions uses networkx to build typeset graphs. Each typeset has two graph attributes:
- base_graph, which includes non-inferential relations (i.e. excludes Int -> Float because that would require a coercion to the test sequence).
- relation_graph, which includes all possible types and relations.
So in order to get all possible children of a node in a typeset we just have to use the networkx API and the typeset's relation_graph.
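from visions.types import Categorical
from visions.typesets import StandardTypeset

typeset = StandardTypeset()
test_type = Categorical
child_types = typeset.relation_graph[test_type]  # direct children of test_type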
Technically child_types is going to be a networkx AtlasView object, but it supports the in operation, so it will work just fine for your purposes. So your is_child function would look something like
def is_child(typeset, A, B):
    """Determines if B is a child of A for a given typeset"""
    return B in typeset.relation_graph[A]
Technically this will only check a single level deep in the tree (i.e. the children). Judging from your example, you're actually interested in evaluating all possible descendants of a node, which can be similarly achieved by
import networkx as nx

def is_descendant(typeset, A, B):
    """Determines if B is a descendant of A for a given typeset"""
    return B in nx.descendants(typeset.relation_graph, A)
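The difference between the two helpers is easy to see on a small graph. The sketch below uses a plain dict of parent -> children edges instead of a real typeset, and the edges are made up for illustration rather than taken from any visions typeset. The is_descendant here is a hand-rolled breadth-first walk that plays the role nx.descendants plays above.

```python
from collections import deque

# Hypothetical parent -> children edges, standing in for typeset.relation_graph.
toy_graph = {
    "Generic": {"Numeric", "Categorical"},
    "Numeric": {"Integer", "Float"},
    "Categorical": set(),
    "Integer": set(),
    "Float": set(),
}


def is_child(graph, a, b):
    """True if b is a direct child of a."""
    return b in graph.get(a, set())


def is_descendant(graph, a, b):
    """True if b is reachable from a by following one or more edges."""
    seen = set()
    queue = deque(graph.get(a, set()))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        # Keep walking down from every node we reach.
        queue.extend(graph.get(node, set()))
    return b in seen
```

With these toy edges, Integer is a descendant of Generic but not one of its children, which is the case the single-level check would miss.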
EDIT:
It occurred to me you may simply be interested in determining whether your data is Numeric or Categorical. There's an even easier way to do this than checking the parent relations, which is just to create a new typeset, i.e.
new_typeset = Generic + Numeric + Categorical
new_typeset.infer_type(df)
If you're interested in making a PR to include some of this functionality by default we would be more than happy to help you get those through! In the meantime, I've marked this as an enhancement request.
I am following the "Problem type inference" example.
From one dataframe, I have already made a list of the types of each column. Here is the type_list:
type(type_list[0]) gives visions.types.type.VisionsBaseTypeMeta
Now, I want to check whether each type has a parent type of either Categorical or Numeric.
How should I implement is_type_parent_of_categorical?
My workaround seems to work because of string comparison: