Text is converted to shapes

ma-ku commented 2 months ago

Hi,

this is a general question regarding the handling of text elements; I have implemented SVG rendering using your library and I am very pleased with the results, especially since you are abstracting the output device which is exactly what I need in my project. However, I realized that you are converting all text elements into shapes which is nice for rendering but in my case, I need the text to be rendered as such. Is there a way to intercept the rendering process to accomplish this?

Apart from that, this is a great little library to handle SVGs. Thank you very much.

weisJ commented 2 months ago

Could you please clarify what you mean by

[...] but in my case, I need the text to be rendered as such

Rendering text inherently requires it to be converted into shapes.

ma-ku commented 2 months ago

Sure. I need to generate a custom output format that does not have the typical Graphics2D support and is quite limited. Therefore I was happy to see that your Output abstraction is fairly similar to my generic output interface, so the mapping was extremely simple. However, I need to output text as text elements so that they can be treated as such in the receiving application (e.g. for searching).

Rendering text inherently requires it to be converted into shapes.

I do not agree with that. It's true if you end up rendering into a bitmap, but definitely not if you want to output a PDF or a similar format.

Actually, I already spent some time going through the code and I do understand why you prefer the shape-driven approach but, as explained, it does not work for me. I was considering a secondary interface only for text support that could be treated as a marker interface and could be used instead of the shape output if implemented. That would be something I would try to handle in a fork if I decide to go down that road.

BTW: I am located in Frankfurt.

weisJ commented 2 months ago

Sure. I need to generate a custom output format that does not have the typical Graphics2D support and is quite limited. Therefore I was happy to see that your Output abstraction is fairly similar to my generic output interface, so the mapping was extremely simple. However, I need to output text as text elements so that they can be treated as such in the receiving application (e.g. for searching).

I see, thank you for the clarification. My statement about rendering had bitmap formats in mind and of course does not hold for non-pixel-based formats.

Actually, I already spent some time going through the code and I do understand why you prefer the shape-driven approach but, as explained, it does not work for me. I was considering a secondary interface only for text support that could be treated as a marker interface and could be used instead of the shape output if implemented. That would be something I would try to handle in a fork if I decide to go down that road.

I have some ideas how to handle this. Will do some experiments and report back :)

weisJ commented 2 months ago

The newest snapshot adds a TextRenderer textRenderer() method to Output. Since text rendering will foreseeably need some reworking in the future, TextRenderer does not expose much. However, I have also added TextExtractingTextRenderer, an abstract implementation of TextRenderer, which can be used to consume all text content inside a <text> element. Let me know if this suffices in your scenario.

ma-ku commented 2 months ago

I have pulled the code and right now I am trying to wrap my head around the rendering process. A few questions:

How do I get the actual font settings for the rendered text? So far this seems all to be located in the TextContainer, right?
What is the purpose of the contextFontSize() function?
The types TextContainer, StringTextSegment, and TextSegment are package visible so I cannot use them to build out the rendering logic.

Thank you for your work on this. I guess with the information above, I can implement a rendering for text.

weisJ commented 2 months ago

How do I get the actual font settings for the rendered text? So far this seems all to be located in the TextContainer, right?

Font information is available through the RenderContext. I can make that available in the TextExtractingTextRenderer implementation. Do you need access to any other attributes e.g. x, y positions etc.?

What is the purpose of the contextFontSize() function?

The default font size for SVGs is user-agent dependent and usually inherits the font size of "surrounding elements" (e.g. if it is embedded in html content). In this case this is used to pass the font size of the current Graphics2D context being rendered to (if the Output is graphics based). If you return an empty optional, then the default fallback size of 10 will be used. Note that it will only be called once at the start of rendering.
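For readers following along, the fallback described above can be sketched roughly as follows. FontSizeSource and resolveFontSize are hypothetical stand-ins and not jsvg's actual API; only the idea that an empty optional leads to the default size of 10 is taken from the explanation.

```java
import java.util.Optional;

// Hypothetical stand-in for an output that may know an ambient font size.
interface FontSizeSource {
    Optional<Float> contextFontSize();
}

class FontSizeDemo {
    // Default size mentioned in the explanation above.
    static final float FALLBACK_FONT_SIZE = 10f;

    // In this sketch the source is queried once, before rendering starts.
    static float resolveFontSize(FontSizeSource source) {
        return source.contextFontSize().orElse(FALLBACK_FONT_SIZE);
    }

    public static void main(String[] args) {
        FontSizeSource custom = () -> Optional.empty();   // no surrounding context
        FontSizeSource embedded = () -> Optional.of(16f); // e.g. inherited from a host UI
        System.out.println(resolveFontSize(custom));
        System.out.println(resolveFontSize(embedded));
    }
}
```

A non-graphics Output like the one discussed here would presumably take the first route and return an empty optional.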

The types TextContainer, StringTextSegment, and TextSegment are package visible so I cannot use them to build out the rendering logic.

This is on purpose as I expect text rendering might need some refactoring in the future to allow for more flexibility. I can't promise any API stability of these types hence they aren't exposed. I am aware that this greatly limits implementors of TextRenderer. However I am sure we can work out a suitable interface that works for you.

ma-ku commented 2 months ago

How do I get the actual font settings for the rendered text? So far this seems all to be located in the TextContainer, right?

Font information is available through the RenderContext. I can make that available in the TextExtractingTextRenderer implementation. Do you need access to any other attributes e.g. x, y positions etc.?

OK, if I receive all relevant information for text rendering in public abstract void processText(@NotNull String text), then I can take over on my side. I have seen that the RenderContext basically provides most of the information needed to set up the rendering target and adjust the text rendering (baseline, anchor, and such things). If the text is not supposed to be rendered at (0|0), I would of course need the coordinates. I have seen that the rotation is already set up in the transform, so that seems to be handled elsewhere?

What is the purpose of the contextFontSize() function?

The default font size for SVGs is user-agent dependent and usually inherits the font size of "surrounding elements" (e.g. if it is embedded in html content). In this case this is used to pass the font size of the current Graphics2D context being rendered to (if the Output is graphics based). If you return an empty optional, then the default fallback size of 10 will be used. Note that it will only be called once at the start of rendering.

Got it. Thank you.

The types TextContainer, StringTextSegment, and TextSegment are package visible so I cannot use them to build out the rendering logic.

This is on purpose as I expect text rendering might need some refactoring in the future to allow for more flexibility. I can't promise any API stability of these types hence they aren't exposed. I am aware that this greatly limits implementors of TextRenderer. However I am sure we can work out a suitable interface that works for you.

I think if I can handle the rendering as discussed above, I am fine. Well, the utter complexity of the typographic model of SVG might not be covered by that approach (yet) but it makes implementation easier for now.

weisJ commented 2 months ago

OK, if I receive all relevant information for text rendering in public abstract void processText(@NotNull String text), then I can take over on my side. I have seen that the RenderContext basically provides most of the information needed to set up the rendering target and adjust the text rendering (baseline, anchor, and such things). If the text is not supposed to be rendered at (0|0), I would of course need the coordinates. I have seen that the rotation is already set up in the transform, so that seems to be handled elsewhere?

Is your goal to faithfully replicate the visuals of the resulting SVG while additionally knowing where each letter is positioned? If so, would it be workable for you to use the shape-converted rendering for the visuals while additionally receiving the letter positions? This way you wouldn't have to worry about replicating the position computation for each letter.

The latest snapshot passes the RenderContext as described above. However, if the answer to the question above is yes, then I think there is a better approach that could be taken.

ma-ku commented 2 months ago

Of course, the rendition should be 'as close as possible' to what you are rendering as shapes. On the other hand, I would like to render as many glyphs as possible in one drawing, as this would push rendering optimizations (like kerning and such things) to the underlying graphics engine. So if it is a , I would like to get the position, styling, and the text itself.

weisJ commented 1 month ago

Well, currently all text is rendered using "one call" as the shapes get merged during text layout. The real problem here is that SVG allows for custom x and y coordinates per letter in <text> <tspan>. Also, textPath exists. So ultimately the best I can offer you are single characters together with their transform and position. I don't quite see how you would do any kerning this way. Maybe you could elaborate a bit on the limitations of your Output implementation, so I can understand a bit better what problems you are facing.

I have made some changes which may suit your use case better: TextRenderer is being replaced by TextOutput. The render pipeline signals to the TextOutput when text rendering starts and ends. It also passes every codepoint together with its current transform and RenderContext.

ma-ku commented 1 month ago

Well, the limitation of my 'Output Device' is that the resolution is extremely limited (it's a very special custom graphics format) and thus any vector graphics is limited to discrete 16-bit integer coordinates. While this is sufficient for the general graphics elements, it becomes a problem with glyphs, as these are very narrow and the Bézier control points typically deteriorate the depiction when snapped. On the other hand, the format supports text rendering through the underlying display engine, so I would prefer rendering "Hello World" as one phrase, possibly bold, italic, 12pt at position 123|456, rather than having each character individually, for the above reasons. While having the character itself is already an improvement over the shape approach, having the whole string would be best.

And maybe for your understanding; I am not looking for a pixel-perfect rendition. I can live with these restrictions as long as I can render the majority of the symbols that I receive as SVGs.

I hope this gives you some insight into the limitations that I am facing. I had created my own SVG rendering engine, but it was extremely limited and could not render all SVG files. In particular, it had issues with different namespaces and styles, so I prefer using your engine as it has much better support for these features.

I have just pulled the latest version and will see if I can accomplish what I need with the latest changes. Thank you for your support so far.

ma-ku commented 1 month ago

One quick observation: I have pulled the latest code and suddenly no characters are rendered anymore. I have found out that in GlyphRenderer.java, line 78, the shape of the glyph contains no coords. Digging into the issue a bit further, I found out that none of the TextSpan nodes contained the actual text that was in the SVG file. I believe it is related to an issue with the new CodepointsCharacterIterator, which fails to return the characters. I changed the method first() to reset the field index to zero and then the characters get rendered. I am sure that this is not the proper fix, but it should point you in the right direction.

weisJ commented 1 month ago

Thank you for the additional information. Maybe I can find a nice way to make consecutive characters (i.e. those without any explicit placement using x, y or rotate) available to the TextOutput. For everything else, the only sane way is to provide single characters. In this case you would have to decide yourself, based on the transform, whether you want to join them together. Note that space characters are also passed to the TextOutput, so you could always record the first transform, join all the characters and render the resulting string at that first transform in TextOutput#endText.
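A minimal sketch of the "record the first transform, join everything, emit in endText" idea might look like the following. SketchTextOutput is a hypothetical callback shape written for this example; the actual TextOutput signatures in jsvg may differ (the thread mentions a RenderContext being passed as well, which is omitted here).

```java
import java.awt.geom.AffineTransform;
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback shape; not the real jsvg interface.
interface SketchTextOutput {
    void beginText();
    void codepoint(String codepoint, AffineTransform transform);
    void endText();
}

// Joins all code points of one text element and remembers where the first one was placed.
class JoiningTextOutput implements SketchTextOutput {
    final List<String> emitted = new ArrayList<>();
    private final StringBuilder buffer = new StringBuilder();
    private AffineTransform firstTransform;

    @Override
    public void beginText() {
        buffer.setLength(0);
        firstTransform = null;
    }

    @Override
    public void codepoint(String codepoint, AffineTransform transform) {
        if (firstTransform == null) {
            // Copy defensively in case the caller reuses the instance.
            firstTransform = new AffineTransform(transform);
        }
        buffer.append(codepoint); // spaces arrive like any other code point
    }

    @Override
    public void endText() {
        if (firstTransform == null) return; // no text was received
        // A real output would issue one native "draw string" command here.
        emitted.add(buffer + " @ " + firstTransform.getTranslateX()
                + "," + firstTransform.getTranslateY());
    }
}
```

This deliberately ignores per-glyph x, y or rotate placement, so it only approximates the layout when a text element uses them.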

I believe it is related to an issue with the new CodepointsCharacterIterator, which fails to return the characters. I changed the method first() to reset the field index to zero and then the characters get rendered. I am sure that this is not the proper fix, but it should point you in the right direction.

Thank you for digging into this. Seems like the first() and last() methods of CharacterIterator are meant to mutate the iterator, which I missed. Somehow this didn't create any issues with the JDK I am using.
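The mutating behaviour can be seen with the JDK's own StringCharacterIterator, used here purely as an illustration (the CodepointsCharacterIterator from the thread is not reproduced). Per the java.text.CharacterIterator contract, first() and last() move the position and return the character found there.

```java
import java.text.CharacterIterator;
import java.text.StringCharacterIterator;

class CharacterIteratorDemo {
    // Standard forward traversal: first() rewinds, next() advances until DONE.
    static String readAll(CharacterIterator it) {
        StringBuilder sb = new StringBuilder();
        for (char c = it.first(); c != CharacterIterator.DONE; c = it.next()) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        CharacterIterator it = new StringCharacterIterator("svg");
        it.last();                         // position moves to the last character
        System.out.println(it.getIndex()); // index of 'g'
        System.out.println(readAll(it));   // works because first() resets the position
    }
}
```

An implementation whose first() only returns the first character without resetting its index would make a loop like readAll start from wherever the iterator happened to be, which is consistent with the empty glyph shapes reported above.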

weisJ commented 1 month ago

I have added TextOutput#glyphRunBreak(), which signals that the end of consecutively laid out code points has been reached (in the sense that none of the glyphs specify a custom x, y or rotation). I have also added a GlyphRunTextOutput implementation, which handles collecting the code points into strings of glyph runs.
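To illustrate the idea of glyph runs, here is a rough sketch of a collector that starts a new string whenever a run break is signalled. GlyphRunCollector is a hypothetical class for this example; it borrows the method names from the thread but its signatures are assumptions and it is not the GlyphRunTextOutput shipped with jsvg.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical collector; signatures are assumed for illustration.
class GlyphRunCollector {
    private final List<String> runs = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();

    void codepoint(String codepoint) {
        current.append(codepoint);
    }

    // Called when the following glyph carries explicit x, y or rotate placement.
    void glyphRunBreak() {
        flush();
    }

    // The end of the text element also closes the run in progress.
    void endText() {
        flush();
    }

    List<String> runs() {
        return runs;
    }

    private void flush() {
        if (current.length() > 0) { // skip empty runs, e.g. two breaks in a row
            runs.add(current.toString());
            current.setLength(0);
        }
    }
}
```

Each collected run could then be drawn with a single text command at the position of its first glyph.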

ma-ku commented 1 month ago

I have figured out how to use that. I believe I can implement the majority of the renditions using my awkward output logic. The glyphRunBreak is very helpful; however, you might consider making beginText() and endText() final, with delegate methods, so that the housekeeping is not accidentally overridden. I use these to block any shape rendering that might come from jsvg, as I am taking over.

weisJ commented 1 month ago

Ah good catch. Yes they should be final indeed.
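The suggestion amounts to the template method pattern. A minimal sketch, assuming a hypothetical base class (only the names beginText and endText come from the thread; onTextBegin, onTextEnd and isInsideText are made up for illustration):

```java
// Hypothetical base class; not the actual jsvg implementation.
abstract class GuardedTextOutput {
    private boolean insideText;

    // Final, so subclasses cannot skip the housekeeping.
    public final void beginText() {
        insideText = true;
        onTextBegin();
    }

    public final void endText() {
        try {
            onTextEnd();
        } finally {
            insideText = false; // reset even if the hook throws
        }
    }

    // True while a text element is being processed, e.g. to suppress shape output.
    public final boolean isInsideText() {
        return insideText;
    }

    // Delegate hooks that subclasses implement instead of overriding the methods above.
    protected abstract void onTextBegin();

    protected abstract void onTextEnd();
}
```

An implementor would put its own logic into the hooks, while the flag that blocks shape rendering stays under the base class's control.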

weisJ / jsvg

Text is converted to shapes #89