This is a first pass at a "streaming" Corpus: records are cached on disk (e.g. serialized with cPickle) and loaded only as needed. Records are indexed as they are added, rather than all at once at the very end.
Also includes some attendant tweaks to other classes to improve memory performance, etc.
Support for the StreamingCorpus is now available in tethne.readers.wos.read. This should be extended to the other readers (and tested) before v0.8.
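
For reviewers unfamiliar with the pattern, here is a minimal sketch of the general idea (cache each record on disk, update the indices at insertion time, load records lazily). It is not the StreamingCorpus implementation: the class name `DiskBackedRecords`, its methods, and the record fields are all hypothetical, and it uses Python 3's `pickle` where the change itself mentions cPickle.

```python
import os
import pickle
import tempfile
from collections import defaultdict


class DiskBackedRecords:
    """Hypothetical illustration only; not Tethne's StreamingCorpus."""

    def __init__(self, cache_dir, index_fields=("authors",)):
        self.cache_dir = cache_dir
        self.index_fields = index_fields
        # field name -> field value -> list of record ids
        self.indices = {f: defaultdict(list) for f in index_fields}

    def _path(self, record_id):
        # Assumes record ids are safe to use as file names.
        return os.path.join(self.cache_dir, "%s.pickle" % record_id)

    def add(self, record_id, record):
        # Serialize the record to disk so it need not stay in memory.
        with open(self._path(record_id), "wb") as f:
            pickle.dump(record, f, protocol=pickle.HIGHEST_PROTOCOL)
        # Index immediately, rather than in a final pass over all records.
        for field in self.index_fields:
            values = record.get(field)
            if values is None:
                continue
            if not isinstance(values, (list, tuple)):
                values = [values]
            for value in values:
                self.indices[field][value].append(record_id)

    def get(self, record_id):
        # Load a single record from disk only when it is requested.
        with open(self._path(record_id), "rb") as f:
            return pickle.load(f)


# Usage with made-up records.
store = DiskBackedRecords(tempfile.mkdtemp(), index_fields=("authors", "date"))
store.add("rec1", {"title": "A", "authors": ["SMITH J", "DOE A"], "date": 2001})
store.add("rec2", {"title": "B", "authors": ["SMITH J"], "date": 2003})
smith_ids = store.indices["authors"]["SMITH J"]
second_title = store.get("rec2")["title"]
```

Only the indices (values and record ids) stay in memory here; full records live on disk until `get` is called, which is the trade-off the streaming approach is after.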