diging / tethne

Python module for bibliographic network analysis.
http://diging.github.io/tethne/
GNU General Public License v3.0

Added StreamingCorpus, support in readers.wos #142

Closed by erickpeirson 8 years ago

erickpeirson commented 8 years ago

This is the first pass at a "streaming" Corpus: records are cached on disk (serialized with e.g. cPickle) and loaded only as needed. Records are indexed as they are added, rather than all at once at the very end.
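To illustrate the idea, here is a minimal sketch of a disk-backed corpus that indexes records as they are added. It is not the StreamingCorpus code from this PR: the class name `DiskBackedCorpus`, its methods, and the dict-shaped records are all hypothetical, and it uses Python 3's `pickle` in place of cPickle.

```python
import os
import pickle
import tempfile
from collections import defaultdict


class DiskBackedCorpus:
    """Hypothetical illustration only; not tethne's StreamingCorpus."""

    def __init__(self, index_fields=("authors",), cache_dir=None):
        # Records live in this directory, one pickle file per record.
        self.cache_dir = cache_dir or tempfile.mkdtemp(prefix="corpus_")
        self.index_fields = tuple(index_fields)
        # field name -> field value -> list of record ids
        self.indices = {f: defaultdict(list) for f in self.index_fields}
        # record id -> path of the cached record on disk
        self._files = {}

    def add(self, record_id, record):
        """Write the record to disk and update the indices right away."""
        path = os.path.join(self.cache_dir, "%d.pickle" % len(self._files))
        with open(path, "wb") as f:
            pickle.dump(record, f, protocol=pickle.HIGHEST_PROTOCOL)
        self._files[record_id] = path

        for field in self.index_fields:
            values = record.get(field, [])
            # Treat scalar field values as a single-item list.
            if not isinstance(values, (list, tuple, set)):
                values = [values]
            for value in values:
                self.indices[field][value].append(record_id)

    def __getitem__(self, record_id):
        """Load a single record from disk on demand."""
        with open(self._files[record_id], "rb") as f:
            return pickle.load(f)

    def __len__(self):
        return len(self._files)


corpus = DiskBackedCorpus(index_fields=("authors", "date"))
corpus.add("WOS:0001", {"title": "A", "authors": ["DOE J", "ROE R"], "date": 2012})
corpus.add("WOS:0002", {"title": "B", "authors": ["DOE J"], "date": 2013})

# Only the ids and indices are held in memory; the records stay on disk.
ids_for_doe = corpus.indices["authors"]["DOE J"]
first_title = corpus[ids_for_doe[0]]["title"]
```

In this sketch only the indices and the id-to-file mapping stay in memory, so memory use grows with the number of index entries rather than with the full size of the records.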

Also includes some attendant tweaks to other classes to improve memory performance.

Support for the StreamingCorpus is now available in tethne.readers.wos.read -- this should be extended to the other readers (and tested) before v0.8.