
Python module for analysis of behavioral data
https://digitraceslab.github.io/niimpy/
MIT License

Niimpy


What

Niimpy is a Python package for analyzing and quantifying behavioral data. It uses pandas to read data from disk and perform basic manipulations, provides exploratory data analysis functions, offers many high-level preprocessing functions for various types of data, and has functions for behavioral data analysis.

For Whom

Niimpy is intended for researchers and data scientists analyzing digital behavioral data. Its purpose is to facilitate data analysis by providing a standardized, replicable workflow.

Why

Digital behavioral studies using personal digital devices typically produce rich, multi-sensor, longitudinal datasets of mixed data types. Analyzing such data requires domain knowledge in multiple fields, programming expertise, and software designed for the purpose. Currently, no standardized workflow or tools exist to analyze such datasets. The Niimpy package is specifically designed to analyze longitudinal, multimodal behavioral data. It is a user-friendly, open-source package that can be easily expanded and adapted to specific research requirements. The toolbox facilitates the analysis phase by providing tools for data management, preprocessing, feature extraction, and visualization. More advanced analysis methods will be incorporated into the toolbox in the future.

How

The toolbox is divided into four layers by functionality: 1) reading, 2) preprocessing, 3) exploration, and 4) analysis. For more information about the layers, refer to the toolbox architecture chapter (:doc:`architecture`). The quickstart guide (:doc:`quick-start`) is a good place to start. More detailed demo Jupyter notebooks are provided in the user guide chapter (:doc:`demo_notebooks/Exploration`). Instructions for individual functions can be found in the API chapter (:doc:`api/niimpy`).

Installation

Getting started with location data

All of the functions for reading, preprocessing, and feature extraction for location data are in location.py. Currently implemented features are:

Usage:

import pandas as pd
import niimpy
import niimpy.location as nilo

CONTROL_PATH = "PATH/TO/CONTROL/DATA"
PATIENT_PATH = "PATH/TO/PATIENT/DATA"

# Read data of control and patients from database
location_control = niimpy.read_sqlite(CONTROL_PATH, table='AwareLocation', add_group='control', tz='Europe/Helsinki')
location_patient = niimpy.read_sqlite(PATIENT_PATH, table='AwareLocation', add_group='patient', tz='Europe/Helsinki')

# Concatenate the two dataframes to have one dataframe
location = pd.concat([location_control, location_patient])

# Remove low-quality and outlier locations
location = nilo.filter_location(location)

# Downsample locations (median filter). Bin size is 10 minutes.
location = niimpy.util.aggregate(location, freq='10min', method_numerical='median')
location = location.reset_index(0).dropna()

# Feature extraction
features = nilo.extract_features(
  lats=location['double_latitude'],
  lons=location['double_longitude'],
  users=location['user'],
  groups=location['group'],
  times=location.index,
  speeds=location['double_speed']
)
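
To make the downsampling step above more concrete, here is a minimal pandas-only sketch of a 10-minute median filter on synthetic data. It is an illustration of the idea, not the implementation of niimpy.util.aggregate, whose exact behavior may differ. The user id "u1" and the coordinate values are made up for the example; only the column names are taken from the usage snippet.

```python
import pandas as pd

# Synthetic location samples for one hypothetical user, one sample every 4 minutes.
index = pd.date_range(
    "2024-01-01 12:00", periods=6, freq="4min", tz="Europe/Helsinki"
)
location = pd.DataFrame(
    {
        "user": ["u1"] * 6,
        "double_latitude": [60.0, 60.2, 60.4, 61.0, 61.2, 61.4],
        "double_longitude": [24.0, 24.2, 24.4, 25.0, 25.2, 25.4],
    },
    index=index,
)

# Group by user and by 10-minute bins of the timestamp index,
# then take the median of each coordinate column within every bin.
binned = location.groupby(["user", pd.Grouper(freq="10min")])[
    ["double_latitude", "double_longitude"]
].median()

print(binned)
```

The six samples fall into three 10-minute bins (starting at 12:00, 12:10, and 12:20), so the result has three rows per user, each holding the median latitude and longitude of the samples in that bin. Taking the median rather than the mean makes each bin less sensitive to a single outlying GPS fix.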

Documentation

Niimpy documentation is hosted at https://digitraceslab.github.io/niimpy/.

Development

This is a pretty typical Python project with code and documentation as you might expect.

requirements-dev.txt contains some basic dev requirements, including an editable dev install of niimpy itself (pip install -e).

Run tests with:

pytest .

Documentation is built with Sphinx:

cd docs
make html
# output in _build/html/

Enable nbdime Jupyter notebook diff and merge via git with:

nbdime config-git --enable

See also