ContentMine / pyCProject

Provides basic functions to read a ContentMine CProject and CTrees into Python data structures.
MIT License

pyCProject

The main use is to read in all results.xml files created by ami and to relate them to papers/metadata.
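To make the idea concrete, here is a minimal standard-library sketch of what "reading all results.xml and relating them to papers" can look like. It is not pyCProject's implementation. The function name `collect_results` is hypothetical, and the directory layout (CPROJECT/CTREE/results/PLUGIN/TYPE/results.xml, with one `<result>` element per extracted fact) is an assumption that this README does not confirm.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def collect_results(cproject_dir):
    """Collect every <result> element found below a CProject directory.

    Assumed layout (not confirmed by this README):
    CPROJECT/CTREE/results/PLUGIN/TYPE/results.xml
    """
    root = Path(cproject_dir)
    rows = []
    for xml_path in sorted(root.glob("*/results/**/results.xml")):
        relative = xml_path.relative_to(root)
        # The first directory below the project root is treated as the CTree (paper) id.
        ctree_id = relative.parts[0]
        for result in ET.parse(xml_path).getroot().iter("result"):
            row = dict(result.attrib)          # whatever attributes the file carries
            row["ctree"] = ctree_id            # link the fact back to its paper
            row["source"] = relative.as_posix()
            rows.append(row)
    return rows
```

Each returned dictionary carries the attributes of one result plus the CTree it came from, which is the kind of flat record that can later be joined with per-paper metadata.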

DEPRECATION WARNING

Visualization / network analysis in factnet.py is no longer supported or maintained in this package, as it is not part of the core functionality, and will be removed.

Installation

Install from PyPI, optionally inside an activated virtual environment:

source activate YOURVIRTUALENV
pip install pycproject

Or build and install from a source checkout:

python setup.py build
python setup.py install

Converting to JSON dumps

If your CProject is in PATH/TO/CPROJECT/CPROJECTNAME, call the script from the repository root with

python3 pycproject/convert2elasticdump.py --raw PATH/TO/CPROJECT --name CPROJECTNAME --output PATH/TO/OUTPUTFOLDER
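The command above splits the project location into its parent folder (`--raw`) and its folder name (`--name`). A small sketch of deriving both from one full path is shown here. The helper `build_convert_command` is hypothetical and not part of pyCProject; only the script path and the three flags are taken from the command above.

```python
from pathlib import Path


def build_convert_command(cproject_path, output_dir,
                          script="pycproject/convert2elasticdump.py"):
    """Build the argument list for the conversion script from one full project path."""
    project = Path(cproject_path)
    return [
        "python3",
        script,
        "--raw", str(project.parent),   # folder that contains the CProject
        "--name", project.name,         # name of the CProject folder itself
        "--output", str(output_dir),
    ]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root, which avoids shell quoting problems when paths contain spaces.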

Usage

You can then read in a generated ContentMine project with

from pycproject.readctree import CProject
MYPROJECT = CProject("path_to_cproject", "cproject_name")

You can work with a pandas DataFrame after creating it with

df = MYPROJECT.get_dataframe()