Open KaboChow opened 3 months ago
What is the expected behaviour here? If the shape was painted with the even-odd rule in the PDF, then evenodd will be set on all of its subpaths. This seems reasonable, no? If you want to know what regions are filled then you have to apply the rule.
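To make "apply the rule" concrete, here is a minimal sketch of testing whether a point is painted, given all subpaths of one path. It is not pdfminer.six code: the function names are made up for illustration, and it assumes each subpath has already been flattened to a closed polygon (no Bézier segments).

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]
Subpath = Sequence[Point]  # closed polygon; the last point joins back to the first


def winding_and_crossings(point: Point, subpath: Subpath) -> Tuple[int, int]:
    """Cast a ray from `point` towards +x and return (winding, crossings)."""
    x, y = point
    winding = 0
    crossings = 0
    n = len(subpath)
    for i in range(n):
        (x0, y0), (x1, y1) = subpath[i], subpath[(i + 1) % n]
        # Half-open test: the edge straddles the horizontal line through the point.
        if (y0 <= y) != (y1 <= y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x_cross > x:
                crossings += 1
                winding += 1 if y1 > y0 else -1  # upward edge +1, downward edge -1
    return winding, crossings


def is_filled(point: Point, subpaths: Sequence[Subpath], evenodd: bool) -> bool:
    """Apply the even-odd or non-zero winding rule across ALL subpaths."""
    winding = crossings = 0
    for subpath in subpaths:
        w, c = winding_and_crossings(point, subpath)
        winding += w
        crossings += c
    return crossings % 2 == 1 if evenodd else winding != 0


# A square with a smaller square inside, both drawn counter-clockwise.
outer = [(0, 0), (10, 0), (10, 10), (0, 10)]
inner_same = [(4, 4), (6, 4), (6, 6), (4, 6)]
inner_reversed = list(reversed(inner_same))  # same square, drawn clockwise

print(is_filled((5, 5), [outer, inner_same], evenodd=True))       # even-odd: hole
print(is_filled((5, 5), [outer, inner_same], evenodd=False))      # non-zero: filled
print(is_filled((5, 5), [outer, inner_reversed], evenodd=False))  # non-zero: hole
```

Note that the answer depends on every subpath at once, which is why the rule cannot be applied to one subpath in isolation.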
(It looks like you are actually using pdfplumber, not pdfminer.six directly, but the evenodd attribute is coming directly from pdfminer.six.)
On further investigation it appears that this is related to https://github.com/jsvine/pdfplumber/issues/1057, which is related to #861 and #963. I'm still not quite sure what the expected behaviour should be, though.
I think the issue is that you have one path (the porous shape above) with a lot of subpaths, which has been drawn with the f*, b* or B* operator, and that pdfminer.six has split this path into a bunch of separate LTCurve shapes, which makes it impossible for you to know which ones are filled and which ones are not?
The problem here wouldn't be evenodd, as that attribute only refers to whether the even-odd rule is applied to fill the shape. I think you want to know which of the LTCurve shapes are filled and which ones aren't? In this case the expected behaviour would be for pdfminer.six to set the fill attribute on those shapes.
Is this correct?
Hello @dhdaines, your point is correct. I had misunderstood this before: the "evenodd" property only distinguishes between the even-odd and non-zero winding rules, and in fact cannot tell whether an LTCurve shape is a hole or not. The porous shape in the example is actually a full path, but it's split into multiple LTCurve shapes for rectangular detection, which I guess is what caused the problem. As a workaround, I removed the rule that splits LTCurve shapes. It doesn't seem like a good idea, but the lack of rectangular detection doesn't affect me much, and the problem that bothered me is solved.
> The porous shape in the example is actually a full path, but it's split into multiple LTCurve shapes for rectangular detection, which I guess is what caused the problem.
Thanks! That's kind of what I thought. Your misunderstanding of evenodd is perfectly understandable. In fact, the attribute isn't useful at all when the shapes are split, since there's no way to apply the fill rule. So this should still be considered a bug. My thinking on this would be either:
1. [...] evenodd attribute, letting the user apply the rule (non-zero winding or even-odd) to determine the filled areas.
2. [...] pdfminer.six, setting the fill attribute on the filled subpaths. Possibly remove the evenodd attribute since it is meaningless without knowing all the subpaths.
3. [...] pdfminer.layout API to include the concept of complex paths, or somehow expose the fact that an LTCurve is part of a larger path. Again, the user will then have to apply the fill rule.

I think @jsvine might need to weigh in on this since I think he contributed the code in question?
Probably the simplest to implement would be (1) or (3).
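For reference, a rough sketch of what "setting the fill attribute on the filled subpaths" could involve for the even-odd case. This is not how pdfminer.six works today. `classify_subpaths` is a hypothetical helper, each subpath is a plain list of points (similar in spirit to the point list an LTCurve carries), and it assumes the subpaths are simple polygons that neither intersect nor touch.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Subpath = Sequence[Point]  # closed polygon; the last point joins back to the first


def point_in_polygon(point: Point, polygon: Subpath) -> bool:
    """Even-odd crossing test with a ray cast towards +x."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        (x0, y0), (x1, y1) = polygon[i], polygon[(i + 1) % n]
        if (y0 <= y) != (y1 <= y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x_cross > x:
                inside = not inside
    return inside


def classify_subpaths(subpaths: Sequence[Subpath]) -> List[bool]:
    """Return, per subpath, whether the region just inside it is painted
    under the even-odd rule (True) or is a hole (False)."""
    result = []
    for i, subpath in enumerate(subpaths):
        probe = subpath[0]  # any vertex works because subpaths never touch
        # Nesting depth: how many OTHER subpaths enclose this one.
        depth = sum(
            1
            for j, other in enumerate(subpaths)
            if j != i and point_in_polygon(probe, other)
        )
        # Just inside this subpath the crossing count is depth + 1,
        # so the region is painted when depth is even.
        result.append(depth % 2 == 0)
    return result


# Hypothetical "porous" shape: an outline, two holes, and an island in one hole.
outline = [(0, 0), (10, 0), (10, 10), (0, 10)]
hole_a = [(2, 2), (4, 2), (4, 4), (2, 4)]
hole_b = [(6, 6), (8, 6), (8, 8), (6, 8)]
island = [(2.5, 2.5), (3.5, 2.5), (3.5, 3.5), (2.5, 3.5)]

flags = classify_subpaths([outline, hole_a, hole_b, island])
print(flags)
```

The non-zero winding case would additionally need each subpath's drawing direction, and self-intersecting subpaths need more care than this sketch gives them.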
Hello everyone! I found a problem regarding the 'evenodd' value of the object. When I try to get the data of this porous shape, the 'evenodd' values obtained are all true. This is the PDF I used for testing: Spin-City-Letters-6fae9bb1b9a6b3dd0f5811b066e9ed8e (1).pdf. When I use letters or numbers to convert shapes, the data recognized is correct, and the value of 'evenodd' is false. But when using a custom shape, the recognized values of 'evenodd' are all true. Can anyone solve this problem? Thanks!