utk-se / CodeAnalytics-analyzer

Single-shot code analysis for a code repo

Aiden/rewrite #40

Closed argvrutter closed 4 years ago

argvrutter commented 4 years ago

CodeAnalytics

Extends the functionality of pandas for analyzing code repositories.

Features

Installation

pip install caanalyzer

Example Usage

See test.ipynb under examples for example usage.

# Import the CodeRepo class as well as the provided tokenizers and metrics
from caanalyzer import CodeRepo
from caanalyzer.tokens import MethodTokenizer, FileTokenizer, LineTokenizer, Tokenizer
from caanalyzer.metrics import width, height, num_tokens

repo = CodeRepo('path')

repo.index([FileTokenizer, LineTokenizer, MethodTokenizer], {'size' : len, 'width': width, 'height' : height, 'num_tokens': num_tokens})

# get statistics about each scope
repo.df.groupby('token_type').mean()

# get only info about python files
repo.df.xs('.py', level='lang')

# save
repo.df.to_hdf('output/clang_index.h5', key='df', mode='w')
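The queries above rely on repo.df being a pandas DataFrame with a MultiIndex that includes 'lang' and 'token_type' levels. The following is a minimal, self-contained sketch of that pattern using plain pandas only. The toy tokens, the index layout, and the width and height implementations are assumptions made for illustration; they are not taken from caanalyzer and the real library may behave differently.

```python
import pandas as pd


# Hypothetical metric functions; caanalyzer.metrics may define these differently.
def width(text: str) -> int:
    """Length of the longest line in the token's text."""
    return max((len(line) for line in text.splitlines()), default=0)


def height(text: str) -> int:
    """Number of lines spanned by the token's text."""
    return len(text.splitlines())


# Metric name -> callable, mirroring the dict passed to repo.index above.
metrics = {"size": len, "width": width, "height": height}

# Toy tokens as (lang, token_type, text); invented data, not real output.
tokens = [
    (".py", "file", "def f():\n    return 1\n"),
    (".py", "line", "def f():"),
    (".py", "line", "    return 1"),
    (".c", "file", "int main() {\n  return 0;\n}\n"),
    (".c", "line", "int main() {"),
]

# Apply every metric to every token and index the result by (lang, token_type).
rows = [
    {"lang": lang, "token_type": kind,
     **{name: fn(text) for name, fn in metrics.items()}}
    for lang, kind, text in tokens
]
df = pd.DataFrame(rows).set_index(["lang", "token_type"])

# Average of each metric per token type (groupby accepts an index level name).
per_type = df.groupby("token_type").mean()

# Cross-section: only rows whose 'lang' level is '.py'; that level is dropped.
py_only = df.xs(".py", level="lang")

print(per_type)
print(py_only)
```

Because every metric is just a callable keyed by name, adding a new column in this sketch only requires adding one more entry to the metrics dict.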