align is a Python library for extracting quantitative, reproducible
metrics of multi-level alignment between two speakers in naturalistic
language corpora. The method was introduced in "ALIGN: Analyzing
Linguistic Interactions with Generalizable techNiques" (Duran, Paxton, &
Fusaroli, 2019; Psychological Methods).
Examples of papers relying on the ALIGN library:
align may be downloaded directly using pip.
To download the stable version released on PyPI:
pip install align
Or to update:
pip install align --upgrade
It is always good practice to install a package like align, which has several dependencies (see requirements.txt), in a virtual environment.
Anaconda users: The above should work in the vast majority of cases. However, if you prefer an easy way to install align within a virtual environment in one go, or you are experiencing problems updating align, a YAML file has been provided to streamline things. Just follow these simple steps:
- Download the environment.yml file and navigate to the folder where it has been downloaded
- Run the following command in Terminal:
  conda env create -f environment.yml
- Be sure to activate the new environment (i.e., conda activate align0.1.1) before running any align analyses (such as the tutorials; see below)
If you experience any problems, please report them in the "Issues" section of this repository.
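After installing, it can help to confirm that align is visible to the Python interpreter you intend to use and that the interpreter is running inside an isolated environment. The following is a minimal sketch using only the standard library; the helper names installed_version and in_isolated_environment are illustrative and are not part of align. The conda check relies on the CONDA_DEFAULT_ENV variable, which is also set when the base environment is active, so treat the result as a hint rather than a guarantee.

```python
import os
import sys
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def in_isolated_environment():
    """Best-effort check for a venv/virtualenv or an activated conda environment."""
    in_venv = sys.prefix != getattr(sys, "base_prefix", sys.prefix)
    # CONDA_DEFAULT_ENV is set by "conda activate" (including for "base").
    in_conda = bool(os.environ.get("CONDA_DEFAULT_ENV"))
    return in_venv or in_conda


version = installed_version("align")
print("align version:", version if version else "not installed")
print("isolated environment:", in_isolated_environment())
```

Run this with the same interpreter (for example, after conda activate) that will later run the tutorials.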
ALIGN consists of two primary modules for conducting analyses, prepare_transcripts and calculate_alignment. For a quick overview of the functions contained within each module, see the following:
prepare_transcripts: https://nickduran.github.io/align-linguistic-alignment/prepare_transcripts.html
calculate_alignment: https://nickduran.github.io/align-linguistic-alignment/calculate_alignment.html
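The same overview can be approximated locally by introspecting the installed modules. The sketch below uses the standard importlib and inspect modules; list_public_functions is a hypothetical helper, and the import paths align.prepare_transcripts and align.calculate_alignment are assumptions based on the module names above, so adjust them if the installed package is laid out differently.

```python
import importlib
import inspect


def list_public_functions(module_name):
    """Return sorted names of public functions defined in the named module.

    Returns None when the module cannot be imported.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return sorted(
        name
        for name, obj in inspect.getmembers(module, inspect.isfunction)
        # Skip private helpers and functions re-exported from other modules.
        if not name.startswith("_") and obj.__module__ == module.__name__
    )


# Assumed import paths; not confirmed by the documentation above.
for module_name in ("align.prepare_transcripts", "align.calculate_alignment"):
    functions = list_public_functions(module_name)
    if functions is None:
        print(module_name, "could not be imported")
    else:
        print(module_name, "->", ", ".join(functions))
```

The linked documentation pages remain the authoritative description of each function and its parameters.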
align options
The Google News pre-trained word2vec vectors (GoogleNews-vectors-negative300.bin) and the Stanford part-of-speech tagger (stanford-postagger-full-2020-11-17) are required for some optional align parameters but must be downloaded separately. Please see the tutorials for more information.
Google News: https://code.google.com/archive/p/word2vec/ (page) or https://drive.google.com/file/d/0B7XkCwpI5KDYNlNUTTlSS21pQmM/edit?usp=sharing (direct download)
Stanford POS tagger: https://nlp.stanford.edu/software/tagger.shtml#Download (page) or https://nlp.stanford.edu/software/stanford-tagger-4.2.0.zip (direct download)
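Because these resources are downloaded separately, a quick check that they are where you expect can save a failed run later. This is a minimal sketch, assuming both downloads were unpacked into a single folder of your choosing; the folder ~/align_resources and the helper missing_resources are hypothetical, and only the file and folder names come from the text above.

```python
from pathlib import Path

# Names as given above; the directory that holds them is the reader's choice.
WORD2VEC_FILE = "GoogleNews-vectors-negative300.bin"
STANFORD_TAGGER_DIR = "stanford-postagger-full-2020-11-17"


def missing_resources(resource_dir):
    """Return the names of optional resources not found under resource_dir."""
    base = Path(resource_dir).expanduser()
    missing = []
    if not (base / WORD2VEC_FILE).is_file():
        missing.append(WORD2VEC_FILE)
    if not (base / STANFORD_TAGGER_DIR).is_dir():
        missing.append(STANFORD_TAGGER_DIR)
    return missing


# Hypothetical location; replace with wherever the downloads were unpacked.
print(missing_resources("~/align_resources"))
```

An empty list means both resources were found; how their paths are then passed to align is covered in the tutorials.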
We created Jupyter Notebook tutorials to provide an easily accessible
step-by-step walkthrough on how to use align. Below are descriptions of the
current tutorials that can be found in the examples directory within this
repository. If you are unfamiliar with Jupyter Notebooks, instructions for installing
and running them can be found here: http://jupyter.org/install. We recommend installing
Jupyter using Anaconda, a widely used Python data science platform
that helps streamline workflows.
Jupyter Notebook 1: CHILDES
Jupyter Notebook 2: Devil's Advocate
We are in the process of adding more tutorials and would welcome additional tutorials by interested contributors.
If you find the package useful, please cite our manuscript:
Duran, N., Paxton, A., & Fusaroli, R. (2019). ALIGN: Analyzing Linguistic Interactions with Generalizable techNiques. Psychological Methods. http://dynamicog.org/papers/
CHILDES
Kuczaj, S. (1977). The acquisition of regular and irregular past tense forms. Journal of Verbal Learning and Verbal Behavior, 16, 589–600.
Devil's Advocate
Duran, Nicholas, Alexandra Paxton, and Riccardo Fusaroli. Conversational Transcripts of Truthful and Deceptive Speech Involving Controversial Topics, Central California, 2012. ICPSR37124-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2018-08-29.