uci-ml-repo / ucimlrepo

Python package for dataset imports from UCI ML Repository
MIT License
216 stars 90 forks source link

ucimlrepo package

Package to easily import datasets from the UC Irvine Machine Learning Repository into scripts and notebooks.
Current Version: 0.0.7

Installation

In a Jupyter notebook, install with the command

!pip3 install -U ucimlrepo 

Restart the kernel and import the module ucimlrepo.

Example Usage

from ucimlrepo import fetch_ucirepo, list_available_datasets

# check which datasets can be imported
list_available_datasets()

# import dataset
heart_disease = fetch_ucirepo(id=45)
# alternatively: fetch_ucirepo(name='Heart Disease')

# access data
X = heart_disease.data.features
y = heart_disease.data.targets
# train model e.g. sklearn.linear_model.LinearRegression().fit(X, y)

# access metadata
print(heart_disease.metadata.uci_id)
print(heart_disease.metadata.num_instances)
print(heart_disease.metadata.additional_info.summary)

# access variable info in tabular format
print(heart_disease.variables)

fetch_ucirepo

Loads a dataset from the UCI ML Repository, including the dataframes and metadata information.

Parameters

Provide either a dataset ID or name as keyword (named) arguments. Cannot accept both.

Returns

list_available_datasets

Prints a list of datasets that can be imported via fetch_ucirepo

Parameters

Metadata

Links