Summary
Allow registration of a custom sub-partitioner that extracts images from a DOCX paragraph.
Additional Context
A custom image sub-partitioner must implement the PicturePartitionerT interface defined in this PR. In practice, that means providing an .iter_elements() classmethod that takes the paragraph and generates zero or more Image elements from it.
The custom image sub-partitioner must be registered by passing the class to register_picture_partitioner().
The default image sub-partitioner is _NullPicturePartitioner, which does nothing.
The registered picture partitioner is called once for each paragraph.