HK3-Lab-Team / pytrousse

PyTrousse collects into one toolbox a set of data wrangling procedures tailored for composing reproducible analytics pipelines.
Apache License 2.0

Dataset property: columns types #45

Closed alessiamarcolini closed 4 years ago

alessiamarcolini commented 4 years ago
lorenz-gorini commented 4 years ago

pd.api.types.infer_dtype()

We could use this pandas function to find the column types. It recognizes the type in several layers:

  1. The dtype of the Series, but only when it is one of the pandas-specific types (see https://pandas.pydata.org/pandas-docs/stable/user_guide/basics.html#dtypes).
  2. The dtype of the underlying numpy array.
  3. If step 2 returned an `object` np.dtype, further checks are performed by looking at the type of each sample. Possible types are: DateTime, TimeDelta, Integer,...
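
To make the layers above concrete, here is a minimal sketch of calling `pd.api.types.infer_dtype()` on each column of a small DataFrame. The DataFrame, its column names, and the `inferred` dict are made up for illustration and are not part of pytrousse; the `skipna=True` argument is just one possible choice.

```python
import pandas as pd

# Hypothetical example data, one column per case we want to recognize.
df = pd.DataFrame(
    {
        "ints": [1, 2, 3],
        "floats": [1.0, 2.5, 3.0],
        "strings": ["a", "b", "c"],
        "dates": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
        "categories": pd.Categorical(["x", "y", "x"]),
        # Integers stored with object dtype: only the per-sample check
        # (layer 3) can tell that these are integers.
        "object_ints": pd.Series([1, 2, 3], dtype=object),
    }
)

# Map each column name to the string label returned by infer_dtype.
inferred = {
    name: pd.api.types.infer_dtype(column, skipna=True)
    for name, column in df.items()
}

for name, label in inferred.items():
    print(f"{name}: {label}")
```

Note that `infer_dtype` returns a plain string label (for example `"integer"` or `"categorical"`) rather than a dtype object, so a Dataset property built on it would probably need its own mapping from these labels to the column groups we care about.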

What we need to implement

Categorical Data