diegomura / react-pdf

📄 Create PDF files using React
https://react-pdf.org
MIT License
14.77k stars 1.17k forks source link

Text layout can be wider than node's width #2182

Open chrissantamaria opened 1 year ago

chrissantamaria commented 1 year ago

Describe the bug In certain contexts, a text node's layout is calculated using a larger width than the node itself, resulting in text overflowing outside its intended container.

To Reproduce react-pdf REPL

const MyDocument = () => (
  <Document>
    <Page style={{ padding: 72 }} debug>
      <View style={{ flexDirection: 'row' }} debug>
        <View style={{ width: 36, height: 36 }} debug />
        <Text debug>
          Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
        </Text>
      </View>
    </Page>
  </Document>
);

ReactPDF.render(<MyDocument />);

This REPL shows a flex row containing a fixed width item (in this case a View) followed by a Text element containing long wrapped text. Though debug shows that the node widths are correctly sized (flex row width of 451, fixed size child width of 33 and text child width of 418), the text lines are broken such that the content visually overflows outside the parent container.

Expected behavior The text content should be no larger than 418 units wide and be visually constricted to its container (that is, not overflowing into the green padding)

Screenshots Screenshot 2023-02-06 at 7 14 55 PM

Investigation (take everything here with a grain of salt - first time digging into this repo or anything Yoga-related)

Upon initially debugging, I found that the text node's lines are being generated in a way that causes this overflow - for example, the first line ending in the word "elit" rather than ending on something more appropriate like "adipiscing". This eventually led me to layoutText: https://github.com/diegomura/react-pdf/blob/089e2d4f4bb7f918c71895960d2639887c1ae6d9/packages/layout/src/text/layoutText.js#L64-L80

Interestingly, this is is being called with a width of 451.280029296875 rather than the expected size of 418 (matching what debug reports as the node's size). This likely explains the lines behavior - textkit is splitting the text assuming is has more width than is actually should.

Looking upstream, layoutText is being called by measureText which is also invoked with the larger-than-expected width. measureText seems to be called by Yoga directly via setMeasureFunc: https://github.com/diegomura/react-pdf/blob/089e2d4f4bb7f918c71895960d2639887c1ae6d9/packages/layout/src/steps/resolveDimensions.js#L140-L145

At this point, I'm a bit lost as to why Yoga would invoke this with the larger width value. Perhaps somewhere in Yoga node creation there needs to be a different width value set?

I also discovered that inspecting the text's yogaNode.getComputedWidth() at a later stage (for example, after calculateLayout has been called for the root page) returns the expected value of ~418 - not sure why this value is different than the one provided by Yoga in the measure stage.

Desktop (please complete the following information):

jeetiss commented 1 year ago

Thanks for digging into this.

I have a suggestion on how to work around this bug 👉🏻 repl

This bug was introduced in v2.2.0 and i tried to figure out why this happened but unsuccessfully

https://github.com/diegomura/react-pdf/issues/1877

chrissantamaria commented 1 year ago

That fix worked, thanks! Wonder why those flex attributes would impact the text node's sizing at that stage - feels like a Yoga issue, but I'd be surprised if a sizing bug like this has persisted on their end for so long. Might try to dig into some of the yogaNode creation logic if I have some time.

Thanks again!

jeetiss commented 1 year ago

I implemented flex-basis auto support and looks like this issue is on fire now. (reproduced in every text component)

chrissantamaria commented 1 year ago

Interestingly, this seems to be an issue in satori as well - opened vercel/satori#393

jeetiss commented 1 year ago

A managed to reproduce this error with yoga nodes only, so looks like this is miss use of the measure function

https://codesandbox.io/s/yoga-setmeasurefunc-misuse-v168ep

muzea commented 1 year ago

repl

const MyDocument = () => (
  <Document>
    <Page style={{ padding: 72 }} debug>
      <View style={{ flexDirection: "row", backgroundColor: "#ffa39e" }}>
        <View style={{ width: 36, height: 36 }} debug />
        <View style={{ width: "100%" }}>
          <Text debug>
            Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
            eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim
            ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
            aliquip ex ea commodo consequat. Duis aute irure dolor in
            reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
            pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
            culpa qui officia deserunt mollit anim id est laborum.
          </Text>
        </View>
      </View>
    </Page>
  </Document>
);

ReactPDF.render(<MyDocument />);

Wrapping a text component with a view and setting a width (even though the width is 100%) to the view worked as I expected.

nicoburns commented 1 year ago

For some reason, yoga calls measureText with full-page width firstly and after the real width is here but lines are already calculated and saved to the node. Any ideas why yoga adds a measureText call?

https://github.com/diegomura/react-pdf/blob/master/packages/layout/src/text/measureText.js#L22-L33

@jeetiss Replying here because the other issue is closed. It's a necessary part of the flexbox algorithm to measure nodes more than once with different size constraints each time. So you can't just layout the text once and then cache the value. You have to actually do the layout each time.

(although I would expect the last time it is called to have the correct final width - that being wrong may well be a yoga bug)

I would suggest:

  1. Computing the width of each word (or unbreakable section if you are line-breaking within words) once and then caching the results.
  2. Have an optimised version of the line-breaking algorithm that computes the overall width and height of the text (without allocating data structures to store which text is on which line) which is run each time measure is called.
  3. Do the final text layout based on Yoga's computed size for the node after Yoga has finished running.

Btw, if you're looking for a Flexbox layout library that avoids Yoga bugs with things like caching the flex-basis and min/max sizes then consider Taffy (disclaimer: I contribute to Taffy), but be aware that Taffy will call your measure function more often than Yoga, not less! There are also a few things we don't support yet like the overflow and direction styles.