Added two new classes, classes.features.StructuredFeature and classes.features.StructuredFeatureSet.
A StructuredFeature represents articulated tokenized content associated with a single document. This might be the full-text content of a paper, where each word is a token. The StructuredFeature supports the definition of contexts, which divide the tokens up into chunks. For example, a document might be divided into pages, paragraphs, and/or sentences.
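To make the idea of contexts concrete, here is a minimal sketch of one way such a structure could work. The class `ToyStructuredFeature` and its methods `add_context` and `chunks` are hypothetical names invented for illustration; they are not the actual StructuredFeature API, and storing each context as a list of chunk start offsets is an assumption.

```python
class ToyStructuredFeature:
    """Hypothetical stand-in for illustration; not the library's class."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        # Assumption: each context is stored as a sorted list of the
        # token offsets at which its chunks begin.
        self.contexts = {}

    def add_context(self, name, starts):
        starts = sorted(set(starts))
        if any(s < 0 or s >= len(self.tokens) for s in starts):
            raise ValueError("chunk start outside the token sequence")
        self.contexts[name] = starts

    def chunks(self, name):
        """Return the tokens divided according to the named context."""
        starts = self.contexts[name]
        ends = starts[1:] + [len(self.tokens)]
        return [self.tokens[i:j] for i, j in zip(starts, ends)]


feature = ToyStructuredFeature("the cat sat . the dog ran .".split())
feature.add_context("sentence", [0, 4])
sentences = feature.chunks("sentence")
# sentences == [['the', 'cat', 'sat', '.'], ['the', 'dog', 'ran', '.']]
```

Keeping a single flat token list and describing each context only by offsets means several contexts (pages, paragraphs, sentences) can coexist over the same tokens without copying them.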
A StructuredFeatureSet is similar to the existing FeatureSet class, except that it is designed specifically to support StructuredFeature instances. Note especially the context_chunks method, which generates a sparse representation of the entire featureset divided according to the selected context.
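As a rough illustration of what a sparse, per-chunk representation might look like, the sketch below counts tokens within each chunk of each document. The function `toy_context_chunks`, its input layout, and its return format are all assumptions made for this example; the actual signature and output of context_chunks are not described here and may differ.

```python
from collections import Counter


def toy_context_chunks(documents):
    """Hypothetical sketch, not the library's context_chunks.

    documents maps a document id to (tokens, chunk_starts), where
    chunk_starts lists the token offsets at which chunks begin.
    Returns (keys, sparse): keys[i] is (doc_id, chunk_index) and
    sparse[i] is a sorted list of (token, count) pairs for that chunk.
    """
    keys, sparse = [], []
    for doc_id, (tokens, starts) in documents.items():
        ends = list(starts[1:]) + [len(tokens)]
        for n, (i, j) in enumerate(zip(starts, ends)):
            keys.append((doc_id, n))
            # Only tokens that occur in the chunk are stored (sparse).
            sparse.append(sorted(Counter(tokens[i:j]).items()))
    return keys, sparse


docs = {
    "paper1": ("a b a c".split(), [0, 2]),
    "paper2": ("c c".split(), [0]),
}
keys, sparse = toy_context_chunks(docs)
# keys   == [('paper1', 0), ('paper1', 1), ('paper2', 0)]
# sparse == [[('a', 1), ('b', 1)], [('a', 1), ('c', 1)], [('c', 2)]]
```

Each chunk becomes its own row, so downstream models can treat pages, paragraphs, or sentences as the unit of analysis instead of whole documents.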