allenai / mmda

multimodal document analysis
Apache License 2.0

Kylel/2022 09/span group utils #135

Closed kyleclo closed 2 years ago

kyleclo commented 2 years ago

Minor PR: added tests for a fairly important piece of functionality that was already in use: combining Spans that are next to each other into a single big Span. The key undocumented detail is that the Boxes for the underlying Spans disappear after this merge, which the tests now capture.

I don't think this is behavior we want to support in the future, but for now, this is how the utility is being used.
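
For readers unfamiliar with the behavior, here is a minimal sketch of what "merging adjacent Spans drops their Boxes" could look like. The `Box` and `Span` classes, the `merge_adjacent_spans` function, and the `max_gap` parameter are hypothetical stand-ins written for this illustration; they are not mmda's actual types or utility, and whether unmerged Spans keep their Box is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Box:
    # Hypothetical stand-in: left, top, width, height on a given page.
    l: float
    t: float
    w: float
    h: float
    page: int


@dataclass
class Span:
    # Hypothetical stand-in: character offsets plus an optional Box.
    start: int
    end: int
    box: Optional[Box] = None


def merge_adjacent_spans(spans: List[Span], max_gap: int = 0) -> List[Span]:
    """Merge spans whose character ranges touch or lie within max_gap characters.

    In this sketch a merged span carries box=None, i.e. the Boxes of the
    underlying spans are not preserved on the merged Span.
    """
    merged: List[Span] = []
    for span in sorted(spans, key=lambda s: s.start):
        if merged and span.start - merged[-1].end <= max_gap:
            prev = merged[-1]
            # Extend the previous span and drop any Box it had.
            merged[-1] = Span(start=prev.start, end=max(prev.end, span.end), box=None)
        else:
            # Assumption: a span that is not merged keeps its own Box.
            merged.append(Span(span.start, span.end, span.box))
    return merged
```

With three input Spans where the first two touch and the third is far away, the sketch returns two Spans: a merged one with no Box, and the untouched third with its Box intact.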

geli-gel commented 2 years ago

@cmwilhelm It looks like we don't lose original bbox information because it's still there in the SpanGroup's BoxGroup

kyleclo commented 2 years ago

@geli-gel is right. The only place this is lost is in the Span's Box, not in the SpanGroup's BoxGroup.

I don't think this is a bug; it's just poor/opaque design. I've created an issue to revisit this later: https://github.com/allenai/mmda/issues/137
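
To make the distinction in this thread concrete, here is a rough sketch of how the box information can be gone from the merged Span yet still available on the group. All classes and the `group_with_merged_span` helper are hypothetical stand-ins named after the terms used in this discussion; they are not mmda's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Box:
    # Hypothetical stand-in: left, top, width, height on a given page.
    l: float
    t: float
    w: float
    h: float
    page: int


@dataclass
class Span:
    start: int
    end: int
    box: Optional[Box] = None


@dataclass
class BoxGroup:
    boxes: List[Box] = field(default_factory=list)


@dataclass
class SpanGroup:
    spans: List[Span]
    box_group: Optional[BoxGroup] = None


def group_with_merged_span(spans: List[Span]) -> SpanGroup:
    """Collapse spans into one box-less Span, keeping the original boxes in a BoxGroup."""
    if not spans:
        raise ValueError("need at least one span")
    # The original boxes are collected on the group...
    boxes = [s.box for s in spans if s.box is not None]
    # ...while the single merged Span itself has no Box.
    merged = Span(
        start=min(s.start for s in spans),
        end=max(s.end for s in spans),
        box=None,
    )
    return SpanGroup(spans=[merged], box_group=BoxGroup(boxes=boxes))
```

In this sketch, code that reads `span.box` on the merged Span gets `None`, while code that reads `box_group.boxes` on the group still sees every original Box, which is the asymmetry the follow-up issue is about.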