cosimameyer / overviewpy

💡 Easily Extracting Information About Your Data in Python
https://cosimameyer.github.io/overviewpy
BSD 3-Clause "New" or "Revised" License

overviewpy


overviewpy aims to make it easy to get an overview of a data set by displaying relevant sample information.

Installation

$ pip install overviewpy

Usage

Implemented Functions

The following functions are currently implemented:

overview_tab

overview_tab generates a general overview of the data set based on its time and scope conditions. The resulting data frame collapses the time condition for each id, taking potential gaps in the time frame into account.

from overviewpy.overviewpy import overview_tab
import pandas as pd

# Example data: one row per id-year observation
data = {
    'id': ['RWA', 'RWA', 'RWA', 'GAB', 'GAB', 'FRA',
           'FRA', 'BEL', 'BEL', 'ARG'],
    'year': [2022, 2023, 2021, 2023, 2020, 2019, 2015,
             2014, 2013, 2002],
}

df = pd.DataFrame(data)

# Collapse the time condition ('year') for each scope unit ('id')
df_overview = overview_tab(df=df, id='id', time='year')
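
To illustrate what "collapsing the time condition while accounting for gaps" means, here is a minimal, dependency-free sketch of the idea. It is not overviewpy's implementation: the helper names collapse_years and overview_sketch are hypothetical, and the output format (ranges such as "2021-2023", with gaps separated by commas) is an assumption made for illustration.

```python
from collections import defaultdict


def collapse_years(years):
    """Collapse years into consecutive ranges, e.g. [2021, 2022, 2023] -> '2021-2023'.

    Hypothetical helper; the string format is an assumption, not overviewpy's output.
    """
    ordered = sorted(set(years))
    if not ordered:
        return ""
    ranges = []
    start = prev = ordered[0]
    for year in ordered[1:]:
        if year != prev + 1:
            # A gap was found: close the current range and start a new one
            ranges.append((start, prev))
            start = year
        prev = year
    ranges.append((start, prev))
    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def overview_sketch(ids, years):
    """Group years by id and collapse each group (hypothetical helper)."""
    grouped = defaultdict(list)
    for unit, year in zip(ids, years):
        grouped[unit].append(year)
    return {unit: collapse_years(vals) for unit, vals in sorted(grouped.items())}


ids = ['RWA', 'RWA', 'RWA', 'GAB', 'GAB', 'FRA', 'FRA', 'BEL', 'BEL', 'ARG']
years = [2022, 2023, 2021, 2023, 2020, 2019, 2015, 2014, 2013, 2002]

result = overview_sketch(ids, years)
for unit, span in result.items():
    print(unit, span)
```

In this sketch, consecutive years for one id are merged into a single range, while non-consecutive years stay separate, so gaps in coverage remain visible.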

overview_na

overview_na provides information about the content of all variables in your data, not only the time and scope conditions. It returns a horizontal bar plot that shows the amount of missing data (NAs) for each variable, with the variables listed on the y-axis. You can display either the share of NAs for each variable as a percentage (the default) or the total number of NAs.

from overviewpy.overviewpy import overview_na
import pandas as pd
import numpy as np

# Example data with missing values (np.nan) in both columns
data_na = {
    'id': ['RWA', 'RWA', 'RWA', np.nan, 'GAB', 'GAB',
           'FRA', 'FRA', 'BEL', 'BEL', 'ARG', np.nan, np.nan],
    'year': [2022, 2001, 2000, 2023, 2021, 2023, 2020,
             2019, np.nan, 2015, 2014, 2013, 2002],
}

df_na = pd.DataFrame(data_na)

# Plot the amount of missing data per variable
overview_na(df_na)
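
The numbers behind such a plot can be sketched without any dependencies. The example below is an illustration under stated assumptions, not overviewpy's code: the helper names is_missing and na_summary and the perc parameter are hypothetical. It counts missing values per variable and reports them either as a percentage or as a total.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing (hypothetical helper)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def na_summary(columns, perc=True):
    """Return the share (in percent) or total count of missing values per column.

    `perc` is a hypothetical parameter name chosen for this sketch.
    """
    summary = {}
    for name, values in columns.items():
        missing = sum(is_missing(v) for v in values)
        if perc:
            # Guard against empty columns to avoid division by zero
            summary[name] = 100 * missing / len(values) if values else 0.0
        else:
            summary[name] = missing
    return summary


nan = float("nan")
data_na = {
    'id': ['RWA', 'RWA', 'RWA', nan, 'GAB', 'GAB',
           'FRA', 'FRA', 'BEL', 'BEL', 'ARG', nan, nan],
    'year': [2022, 2001, 2000, 2023, 2021, 2023, 2020,
             2019, nan, 2015, 2014, 2013, 2002],
}

totals = na_summary(data_na, perc=False)  # total number of NAs per variable
shares = na_summary(data_na)              # percentage of NAs per variable
print(totals)
print({name: round(share, 1) for name, share in shares.items()})
```

With a pandas data frame, the same quantities are available as df_na.isna().sum() for totals and df_na.isna().mean() * 100 for percentages.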

Roadmap

overviewpy seeks to mirror the functionality of overviewR and will extend its features with the following functionality in the future:

Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

License

overviewpy is licensed under the terms of the BSD 3-Clause license.

Credits

overviewpy was created with cookiecutter and the py-pkgs-cookiecutter template.