Closed Bibo-Joshi closed 8 months ago
This output can then be used to process the message further within Telegram or e.g. render the message in external programs.
It is very unlikely that you can use Markdown output in external programs. For internal usages in most cases it is much better to manually specify text entities instead of trying to construct corresponding Markdown/HTML markup. This is supported for more than 3 years now and should be used instead of text_markdown/text_html.
# These produce the same rendered result but different entities
Blockquotes and pre-formatted blocks should start on a new line and end before a new line. If they aren't then apps will still show them as if they are, but it is up to the app, how this is achieved. You may see a different number of empty lines in different places in different apps in the latter case.
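The rule above can be checked mechanically before sending. A minimal sketch, assuming an entity is given as a plain (offset, length) pair in UTF-16 code units, as the Bot API reports them, and that the offsets do not split a surrogate pair. The helper name `block_is_line_aligned` is hypothetical and not part of the Bot API or any library:

```python
def _utf16_split(text: str, offset: int, length: int) -> tuple[str, str, str]:
    """Split text into (before, inside, after) using UTF-16 code unit offsets."""
    raw = text.encode("utf-16-le")  # 2 bytes per UTF-16 code unit
    start, end = offset * 2, (offset + length) * 2
    return (
        raw[:start].decode("utf-16-le"),
        raw[start:end].decode("utf-16-le"),
        raw[end:].decode("utf-16-le"),
    )


def block_is_line_aligned(text: str, offset: int, length: int) -> bool:
    """Return True if the block starts on a new line and ends before a new line."""
    before, _inside, after = _utf16_split(text, offset, length)
    starts_on_new_line = before == "" or before.endswith("\n")
    ends_before_new_line = after == "" or after.startswith("\n")
    return starts_on_new_line and ends_before_new_line
```

A block for which this returns False will still be rendered as a block, but how apps pad it with line breaks is up to each app.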
Thanks for the swift reply.
It is very unlikely that you can use Markdown output in external programs.
Copy-pasting the code snippets from https://core.telegram.org/bots/api#formatting-options to both GitHub and StackOverflow, I can see that two widely-used interpreters can display most parts of TG's formatting options correctly without a need for changes.
In fact, the need to adapt/leave out some of TG's formatting options for processing in external programs actually highlights the use cases of methods like text_md/html. If you want to display TG messages in reStructuredText, AsciiDoc, or other formats, you'll have to translate the entities into markup symbols or at least into other entity-like data structures that the external program can reliably understand.
For internal usages in most cases it is much better to manually specify text entities instead of trying to construct corresponding Markdown/HTML markup. This is supported for more than 3 years now and should be used instead of text_markdown/text_html.
I see an "in most cases" there 😉 A use case that I have in mind is adding a prefix-text to an existing message, where both the prefix-text and the message contain formatting entities. Having to shift the entities in the messages by a computed offset and calculating offset+length for the prefix-text, all in UTF-16, is far more implementation effort than simply writing html/md_formatted_prefix + message.text_markdown/html.
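To illustrate why the UTF-16 requirement matters for the offset arithmetic: entity offsets and lengths count UTF-16 code units, which differ from Python's `len()` as soon as a character outside the Basic Multilingual Plane (such as an emoji) appears. A minimal sketch; the helper name `utf16_len` is made up for this example:

```python
def utf16_len(text: str) -> int:
    """Number of UTF-16 code units in text (the unit used for entity offsets)."""
    return len(text.encode("utf-16-le")) // 2


prefix = "Note 😉: "
# len() counts code points; the emoji needs two UTF-16 code units.
print(len(prefix), utf16_len(prefix))  # prints: 8 9
```

Entities of a message appended after this prefix would have to be shifted by 9, not 8.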
Nevertheless, if this is TG's official standpoint, we'll have to evaluate how much maintenance overhead text_html/markdown is worth for us.
inline fixed-width code
pre-formatted fixed-width code block
pre-formatted fixed-width code block written in the Python programming language
Block quotation started\nBlock quotation continued\nThe last line of the block quotation
---
Same on SO ![image](https://github.com/tdlib/telegram-bot-api/assets/22366557/f23a1c91-401f-4c8b-bd7c-aa53a2180938)
Blockquotes and pre-formatted blocks should start on a new line and end before a new line. If they aren't then apps will still show them as if they are, but it is up to the app, how this is achieved.
I would be very glad if this was documented at https://core.telegram.org/bots/api#formatting-options. This would alleviate at least some of the inconsistencies. Frankly, questions on the Bot API that arise from shifting responsibility to the client without documenting that in the API are establishing themselves as a pattern (see also #429 and #428) :/
can display most parts of TG's formatting options correctly without a need for changes.
But definitely not all of them. There will be multiple issues, for example, with bold/italic, blockquotes, or character escaping. Not to mention that classic Markdown and HTML are whitespace-agnostic and the Bot API isn't.
A use case that I have in mind is adding a prefix-text to an existing message
This can be easily done by shifting entities, and a library can provide a helper for that, which receives two texts with entities and returns their concatenation. Implementing such a function is much simpler than trying to revert Markdown formatting.
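A helper of the kind described here could look roughly like the following. This is only a sketch under stated assumptions: `Entity` is a hypothetical stand-in for a Bot API MessageEntity (reduced to type, offset, and length), and `concat_with_entities` is an invented name, not an existing function of python-telegram-bot or any other library.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Entity:
    """Hypothetical stand-in for a MessageEntity; offsets are UTF-16 code units."""
    type: str
    offset: int
    length: int


def utf16_len(text: str) -> int:
    """Number of UTF-16 code units in text."""
    return len(text.encode("utf-16-le")) // 2


def concat_with_entities(text_a, entities_a, text_b, entities_b):
    """Concatenate two texts and shift the second text's entities accordingly."""
    shift = utf16_len(text_a)
    shifted = [replace(e, offset=e.offset + shift) for e in entities_b]
    return text_a + text_b, list(entities_a) + shifted


text, entities = concat_with_entities(
    "Fwd 😉: ", [Entity("bold", 0, 3)],
    "hello world", [Entity("italic", 6, 5)],
)
```

The caller still has to build the entity objects for the prefix by hand, which is the trade-off against plain string concatenation discussed in this thread.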
I would be very glad if this was documented at https://core.telegram.org/bots/api#formatting-options.
This isn't a strong requirement, otherwise, it would be checked server-side. I also can't guarantee that non-official apps will correctly move the blocks to a new paragraph as intended.
But definitely not all of them. There will be multiple issues
I agree that displaying TG messages in external programs cannot reliably work without the need for some adaptation. But as explained above, this makes text_md/html more valuable, not less valuable.
library can provide a helper for that, which receives two texts with entities and returns their concatenation.
This still leaves the user with having to construct entity objects for the prefix and calling a method with 4 arguments, while string concatenation is a simpler operation.
This isn't a strong requirement, otherwise, it would be checked server-side. I also can't guarantee that non-official apps will correctly move the blocks to a new paragraph as intended.
If non-official clients are expected to have problems with block entities that do not start/end on a new line, wouldn't this be even more reason to document this, at least as a guideline?
In any case, it has become clear that you don't see a need to improve the inconsistencies and we'll have to live with that.
In any case, it has become clear that you don't see a need to improve the inconsistencies and we'll have to live with that.
There is no way to improve the way the resulting message may look, because this depends on the user's app. But consecutive block quotes without a separating blank line could be supported in Bot API.
The ability to create consecutive quotes using MarkdownV2 was added in Bot API 7.1 using a zero-length entity or separators between the entities. The corresponding examples were added to the documentation.
Hi there. First of all, let me say that I'm excited for the API 7.0 release! It brings functionality that's been eagerly awaited by the community. The team of python-telegram-bot.org is currently working on integrating these changes into the Python library. During this process, I noticed that the new block quote formatting options show several inconsistencies in the handling of line breaks. The key points are
The second point is unfortunate, but I can understand that HTML is in general just more flexible than MD. The first point may not seem too bad at first glance, because a user can still generate the rendered result they like. However, the inconsistencies make it rather difficult to work with updating messages, parsing & combining content of several messages and similar use cases. As a notable example, python-telegram-bot provides utility functionality that takes a Message object and computes text including formatting markers in the desired markup language, with the idea being that sending this text produces the same rendered result as visible for message itself. This output can then be used to process the message further within Telegram or e.g. render the message in external programs. The observed inconsistencies make the implementation hard for us (see also discussion in https://github.com/python-telegram-bot/python-telegram-bot/pull/4038).
Let me describe the observed inconsistencies in code. The reference implementation is written for python-telegram-bot version 20.7, but ofc the same results can be achieved with plain HTTP requests.
Output:
Screenshot from Telegram Desktop Windows (Version 4.14.3 x64)
While investigating these inconsistencies, I became aware that some of them in fact already apply to the pre-formatted code blocks, which so far we just hadn't noticed. Let me demonstrate this as well.
In the above reference implementation, replace the data with
Output:
Screenshot from Telegram Desktop Windows (Version 4.14.3 x64)
I'm aware that the formatting functionality can be viewed as "working as expected" and one can argue that parsing problems of the entities should be handled by any Bot API wrapper itself. Still, I want to point out these discrepancies to you and emphasize that IMO consistent parsing (same rendered results have same entities & text) would improve the usability of the Bot API.