Open Tolker-KU opened 1 year ago
Hi @Tolker-KU!
Thank you for your nice words 😊
I think this was implemented by @gmischler in https://github.com/PyFPDF/fpdf2/pull/520: https://pyfpdf.github.io/fpdf2/TextStyling.html#subscript-superscript-and-fractional-numbers
I think it should work for multi_cell(), but we currently only have unit tests for .write(), so extra unit tests covering multi_cell() would be welcome!
I think this was implemented by @gmischler in #520:
<sub> and <sup> tags for write_html(). However, the feature is not currently supported by our version of markdown.
The reason for the latter was that I couldn't find a standard on which characters to use as markup.
The most popular markdown variant, CommonMark, doesn't support them either, for reasons that aren't entirely clear.
But then, since our own markdown variant is rather weird anyway (fundamentally incompatible with any others), we could theoretically choose whatever we want... I've seen ^x^ and ~x~ suggested most often; in our case it would probably make sense to double them, like ^^x^^ and ~~x~~, to match the style of the existing tags.
I'm not very comfortable with borrowing tags from HTML. Why not just use HTML in the first place then?
GitHub accepting <sub> and <sup> HTML tags has little to do with markdown. It simply passes those through to the browser unchanged, just as it does with <b>, <i>, etc.
And while we're on the topic: Adding a conforming commonmark implementation (possibly in parallel) should probably be the long term goal.
Thanks for getting back this quickly.
I'm looking for a feature to render subscripts and superscripts within cells. As far as I can figure out, this is not quite achievable with .write_html. Or am I wrong here?
What do you think about adding the ^^ and ~~ tags to the markdown syntax, so one can do .cell(txt="H~~2~~O") -> H₂O or .cell(txt="E=MC^^2^^") -> E = MC²?
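A minimal sketch of how the proposed markers could be split into fragments before rendering. The helper name `split_vpos_markup` is hypothetical and not part of fpdf2; the `^^` / `~~` markers are only the proposal from this thread, and the "LINE" / "SUP" / "SUB" labels are assumed to mirror the values that fpdf2's char_vpos setting accepts.

```python
import re

# Hypothetical markers from this thread: ^^x^^ for superscript, ~~x~~ for subscript.
_VPOS_MARKUP = re.compile(r"\^\^(.+?)\^\^|~~(.+?)~~")


def split_vpos_markup(text):
    """Split text into (fragment, vpos) pairs, with vpos in {"LINE", "SUP", "SUB"}."""
    fragments = []
    pos = 0
    for match in _VPOS_MARKUP.finditer(text):
        # Plain text between the previous marker and this one.
        if match.start() > pos:
            fragments.append((text[pos:match.start()], "LINE"))
        if match.group(1) is not None:
            fragments.append((match.group(1), "SUP"))
        else:
            fragments.append((match.group(2), "SUB"))
        pos = match.end()
    # Trailing plain text after the last marker.
    if pos < len(text):
        fragments.append((text[pos:], "LINE"))
    return fragments
```

Each pair could then be handed to whatever rendering routine supports per-fragment vertical positioning; unmatched markers are simply left in the text as-is.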
I'm looking for a feature to render subscripts and superscript within cells. As far as I can figure out this is not quite achievable with .write_html. Or am I wrong here?
No, you are right.
fpdf2 currently does not support <sub> & <sup> tags inside <table>:
from fpdf import FPDF
pdf = FPDF()
pdf.set_font("Helvetica")
pdf.add_page()
pdf.write_html(
"""<table border="1"><thead><tr>
<th width="33%">Name</th>
<th width="66%">Formula</th>
</tr></thead><tbody><tr>
<td>Lucas-C</td><td>E = MC<sup>2</sup></td>
</tr></tbody></table>""")
pdf.output("issue_860.pdf")
I agree that it would be nice if fpdf2 supported this usage! 😊
I would welcome a PR that implements this in HTML2FPDF: https://github.com/PyFPDF/fpdf2/blob/master/fpdf/html.py#L195
I also fully agree with you @gmischler on this:
And while we're on the topic: Adding a conforming commonmark implementation (possibly in parallel) should probably be the long term goal.
Ideally, we could support combining fpdf2 with https://github.com/executablebooks/markdown-it-py
But then, would the translation chain be Markdown -> HTML, and then use FPDF.write_html()?
This is not ideal, as our HTML2PDF converter is very limited: https://pyfpdf.github.io/fpdf2/HTML.html
So I'm not really sure of the path forward regarding Markdown support...
I think markdown-it-py parses markup to tokens before rendering to HTML. Maybe fpdf2 can render the tokens directly to PDF instead of using HTML as an intermediate step.
https://markdown-it-py.readthedocs.io/en/latest/using.html#the-token-stream
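A rough sketch of what walking that token stream could look like, without going through HTML. It assumes the token attributes (`type`, `children`, `content`) and the inline token type names (`strong_open`, `em_open`, `text`, ...) described in the markdown-it-py documentation. `inline_runs` and `demo` are hypothetical names, and the "B" / "I" style letters are just a convenient match for fpdf2 font styles.

```python
def inline_runs(tokens):
    """Yield (text, style) for every text run found in 'inline' tokens.

    style is "", "B", "I" or "BI", depending on the enclosing
    strong/emphasis tokens.
    """
    for token in tokens:
        if token.type != "inline":
            continue  # block-level open/close tokens carry no text themselves
        active = []
        for child in token.children or []:
            if child.type == "strong_open":
                active.append("B")
            elif child.type == "em_open":
                active.append("I")
            elif child.type in ("strong_close", "em_close"):
                if active:
                    active.pop()  # markers are well nested, so pop the last one
            elif child.type == "text" and child.content:
                yield child.content, "".join(sorted(set(active)))
            elif child.type == "softbreak":
                yield " ", "".join(sorted(set(active)))


def demo(markdown_text):
    # Imported lazily: requires the third-party markdown-it-py package.
    from markdown_it import MarkdownIt

    return list(inline_runs(MarkdownIt().parse(markdown_text)))
```

A "Markdown2PDF" renderer would still need to handle block tokens (headings, lists, code blocks), which this sketch deliberately ignores.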
Sure, we could do that! But then we will basically have to maintain a new "Markdown2PDF" class 😅
I'm not opposed to this, if someone is willing to contribute / initiate such a converter for this project, and if it is mostly compatible with / does not break too many existing behaviours of fpdf2.
I've been looking into how to solve this. It seems that cells in tables rendered from HTML call FPDF.multi_cell().
https://github.com/PyFPDF/fpdf2/blob/54d2eb0266bd3b1ccbf4dc384ea46c9b0d6b718d/fpdf/table.py#L278-L293
As far as I can see, FPDF.multi_cell() has no ability to render text with mixed vpos. One idea is to expose something like _render_styled_text_line() on FPDF that takes a TextLine, which supports text fragments with different styling. Could that be a way forward?
As you have correctly recognized, this is a fundamental limitation of multi_cell().
For formatting changes within a paragraph, there is the alternative write(), but that currently has the disadvantage that it can only create left-aligned text.
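For reference, a minimal sketch of that write() alternative, assuming the char_vpos property described in the fpdf2 text styling documentation (values such as "SUP", "SUB" and "LINE"). `write_fragments` and `demo` are hypothetical helpers, not fpdf2 API, and the text is passed positionally so the sketch does not depend on the keyword name.

```python
def write_fragments(pdf, fragments, line_height=8):
    """Write (text, vpos) pairs one after another, switching char_vpos in between."""
    for text, vpos in fragments:
        pdf.char_vpos = vpos
        pdf.write(line_height, text)
    # Restore the normal baseline so later text is unaffected.
    pdf.char_vpos = "LINE"


def demo(path="vpos_demo.pdf"):
    # Imported lazily: requires the third-party fpdf2 package.
    from fpdf import FPDF

    pdf = FPDF()
    pdf.add_page()
    pdf.set_font("Helvetica", size=12)
    write_fragments(pdf, [("E = MC", "LINE"), ("2", "SUP")])
    pdf.output(path)
```

As noted above, this only produces left-aligned flowing text; it does not help inside multi_cell() or table cells.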
Fixing this cleanly requires some architectural changes to fpdf2. I have outlined a possible solution in #339, and have been working on-and-off on an actual implementation. I hope I'll find time again soon so I can actually show some more progress here.
Theoretically, write_html() could also get more low-level access to the fpdf.py internals as you suggest, but I think a more general high-level approach to text formatting is better in the long run. Several similar issues have been raised over the last year, which all correctly pointed at the same set of current limitations. I'm sorry to say that the necessary groundwork for a true and general solution will take a bit more time.
By the way, I think that this other, older issue is related: https://github.com/PyFPDF/fpdf2/issues/151
Regarding the initial question about Markdown, combining fpdf2 with mistletoe can be a good alternative approach: https://py-pdf.github.io/fpdf2/CombineWithMistletoeoToUseMarkdown.html
I renamed this issue to: write_html: support <sub> & <sup> tags inside <table> in order to clarify what the current feature request is 🙂
For clarity, just repeating the minimal code snippet that we are looking to support:
from fpdf import FPDF
pdf = FPDF()
pdf.set_font("Helvetica")
pdf.add_page()
pdf.write_html(
"""<table border="1"><thead><tr>
<th width="33%">Name</th>
<th width="66%">Formula</th>
</tr></thead><tbody><tr>
<td>Lucas-C</td><td>E = MC<sup>2</sup></td>
</tr></tbody></table>""")
pdf.output("issue_860.pdf")
Since PR #897 by @gmischler, HTML2FPDF is better architected and now uses .text_columns() & paragraphs to render text. This should make this feature easier to implement.
Hi,
Thanks for all the great work going into this project!
I wonder if you have considered supporting subscript/superscript in cell/multicell when styling text with markdown?
GitHub supports this in its markdown implementation using the HTML tags <sub>/<sup>. I imagine fpdf2 could do something similar.
If you think this is a good idea, I would be happy to take a crack at it. It seems that the machinery for this feature already is in place.