Closed (rsanaie closed this 2 months ago)
Sure, this makes sense to me. At the moment these methods would be annoyingly complex to re-implement in user code, but they don't really provide any configurability for filtering content. We also still have some TODOs to update the old getLinesByLayoutArea, getFooterLines and getHeaderLines heuristic methods to play nicely with Layout where available.
I'm thinking of something similar to the (recently introduced) IBlockTypeFilterOpts, like below:
page.html({
  skipBlockTypes: [ApiBlockType.LayoutHeader, ApiBlockType.LayoutFooter],
});
Would block type be sufficient for your use case? Or do you think you'd need to be able to pull out individual instances with e.g. skipBlockIds as well?
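To make the idea concrete, here is a minimal, self-contained sketch of how a type-based skip filter could behave when rendering. Everything in it is a hypothetical stand-in for illustration only: the BlockType constants, the Block and FilterOpts shapes, and renderHtml are not the library's real types or implementation (the real option type discussed here is IBlockTypeFilterOpts, whose exact shape is not shown in this thread).

```typescript
// Hypothetical stand-ins for illustration; NOT the library's actual API.
const BlockType = {
  LayoutHeader: "LAYOUT_HEADER",
  LayoutFooter: "LAYOUT_FOOTER",
  LayoutPageNumber: "LAYOUT_PAGE_NUMBER",
  LayoutText: "LAYOUT_TEXT",
} as const;
type BlockType = (typeof BlockType)[keyof typeof BlockType];

// Assumed minimal block shape: an ID, a type, and its text content.
interface Block {
  id: string;
  blockType: BlockType;
  text: string;
}

// Assumed option shape, loosely modelled on the skipBlockTypes idea above.
interface FilterOpts {
  skipBlockTypes?: BlockType[];
}

// Render each block as a paragraph, dropping any block whose type is skipped.
// (No HTML escaping here; this is only a sketch of the filtering step.)
function renderHtml(blocks: Block[], opts: FilterOpts = {}): string {
  const skip = new Set<BlockType>(opts.skipBlockTypes ?? []);
  return blocks
    .filter((block) => !skip.has(block.blockType))
    .map((block) => `<p>${block.text}</p>`)
    .join("\n");
}

// Made-up sample page content.
const sampleBlocks: Block[] = [
  { id: "b1", blockType: BlockType.LayoutHeader, text: "ACME Quarterly Report" },
  { id: "b2", blockType: BlockType.LayoutText, text: "Body paragraph" },
  { id: "b3", blockType: BlockType.LayoutPageNumber, text: "3" },
  { id: "b4", blockType: BlockType.LayoutFooter, text: "Confidential" },
];

// Skip headers and footers only; the page number block is still rendered.
const filtered = renderHtml(sampleBlocks, {
  skipBlockTypes: [BlockType.LayoutHeader, BlockType.LayoutFooter],
});
```

In this sketch, adding the page number type to skipBlockTypes would drop that block too, which is why filtering by type alone may be enough when no individual block instances need to be singled out.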
IBlockTypeFilterOpts works for me; I don't need to pick out specific IDs.
OK so the good news is I've been able to get a scrappy v0.4.2-alpha.1 pre-release out already where the above should work...
...But the bad news is there's probably a fair bit more to figure out & harden before it could go to mainline release. Today the filter options on html() only work properly with Layout* block types, and only the Layout* items (plus Page and TextractDocument) support passing the options in. I'd like to make a more general extension to enable full IBlockTypeFilterOpts across all IRenderables, but that'll probably take a while to work through.
If you manage to try out the alpha and have any feedback though, it'd be great to hear! Maybe it can enable your use case in the short term at least.
Hi @rsanaie - I just pushed v0.4.2-alpha.3, which I think should work functionally pretty much the same as the last one but with less ugliness under the hood.
Any chance you'd have some time to try it out and double-check it doesn't break anything before we go ahead and push to a mainline release?
When using the .html() function, certain blocks aren't necessary and should be skipped, such as page numbers, headers, and footers. Is there a way we can specify skipping these blocks?
Thanks