Summary
Remove unstructured.partition.html.convert_and_partition_html(). Move responsibility for converting a file to HTML into each brokering partitioner that uses that strategy, and have each one call partition_html() directly with the result.
Additional Context
Rationale:
partition_html() does not want or need to know which partitioners might broker partitioning to it.
Each brokering partitioner has its own way of converting its format to HTML, along with format-specific quirks. Keeping them decoupled lets them evolve independently.
The core of the conversion work is already encapsulated in unstructured.partition.common.convert_file_to_html_text_using_pandoc().
convert_and_partition_html() is an additional brokering layer. It adds one more site where default parameter values can be (mis-)applied or dropped, and one more place where every new parameter has to be added.
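As a rough illustration of the proposed shape, here is a minimal, self-contained sketch. Everything in it is a stand-in: convert_to_html_text() is a hypothetical placeholder for the pandoc-based helper (its real signature is not assumed here), partition_html() is a toy version that returns plain dicts rather than real elements, and include_metadata is used only as an example of a parameter that must be forwarded. The point is only that the brokering partitioner converts its own format and then calls partition_html() directly, with no intermediate layer.

```python
from html import escape
from html.parser import HTMLParser
from typing import List


def convert_to_html_text(source_text: str) -> str:
    """Hypothetical stand-in for the pandoc-based conversion helper.

    Treats blank-line-separated blocks as paragraphs and wraps each in <p>.
    """
    paragraphs = [p.strip() for p in source_text.split("\n\n") if p.strip()]
    return "".join(f"<p>{escape(p)}</p>" for p in paragraphs)


class _ParagraphCollector(HTMLParser):
    """Collects the text content of each <p> element."""

    def __init__(self) -> None:
        super().__init__()
        self.paragraphs: List[str] = []
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self.paragraphs[-1] += data


def partition_html(text: str, include_metadata: bool = True) -> List[dict]:
    """Toy stand-in for partition_html(); it knows nothing about its callers."""
    collector = _ParagraphCollector()
    collector.feed(text)
    collector.close()
    elements = []
    for paragraph in collector.paragraphs:
        element = {"text": paragraph}
        if include_metadata:
            element["metadata"] = {"filetype": "text/html"}
        elements.append(element)
    return elements


def partition_rst(text: str, include_metadata: bool = True) -> List[dict]:
    """Brokering partitioner: converts its own format, then delegates directly."""
    html_text = convert_to_html_text(text)
    # Parameters are forwarded exactly once, with no intermediate layer
    # that could drop them or apply a different default.
    return partition_html(text=html_text, include_metadata=include_metadata)
```

In this arrangement a new partition_html() parameter only has to be threaded through the brokering partitioners that actually need it, and any format-specific conversion quirk stays inside the partitioner that owns that format.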