Is your feature request related to a problem? Please describe.
My understanding is that we currently store each post twice in the database: once as the raw Markdown, and once as the HTML rendered from it. This works, but causes some inconveniences:
More space required for the database: The HTML version of every revision in the edit history is stored forever, even though only the most recent revision is used (and if needed in future it can easily be recreated from the raw Markdown).
Confusion for users: Post size limits need to take into account the size of the HTML, which is invisible to the user. As a result, the character count shown when editing a post does not match the restriction actually enforced (the editor may report 29,000 characters while the post is still rejected for exceeding 30,000). This is issue #958.
Limits need to be pessimistic: If the database field is limited to 65,536 characters, then the limit imposed on users needs to be significantly lower, because some Markdown produces much longer HTML (otherwise we get the confusing error message from the previous bullet point). Relevant to #582.
Describe the solution you'd like
Store only the raw Markdown in the database, and use a cache so that the server does not need to recalculate the HTML each time the page is served.
Less space required for the database.
No confusion for users: If the character count is less than 30,000, the user will see no error message.
Limits will be exact. If a community wants to allow longer posts, such as for reviewing a chapter of a novel or long piece of code, the number chosen is limited only by the database field, not by having to guess how long the Markdown can be before the HTML exceeds the database limit.
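To illustrate the "exact limits" point, here is a minimal sketch in Python (not the project's own code). The names `MAX_POST_CHARS` and `check_post_length` are hypothetical, and the 30,000 figure is taken from the example above. The only thing it shows is that the check depends solely on the text the user typed.

```python
# Hypothetical per-community limit; 30,000 is the figure used in this issue.
MAX_POST_CHARS = 30_000


def check_post_length(markdown: str, limit: int = MAX_POST_CHARS) -> bool:
    """Return True if the raw Markdown fits within the limit.

    The rendered HTML is never measured, so the result always agrees
    with the character count the user sees in the editor.
    """
    return len(markdown) <= limit
```

With a check like this, a community could raise `limit` up to whatever the Markdown column can hold, without guessing how much the HTML might expand.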
Any time that caches are emptied, the server will need to recalculate the HTML for each page served, but only the first time for each page. Is there any reason to think this would be too much load?
Unlike the current approach, this would allow discarding the cached HTML for posts that have not been viewed for a long time. Old posts would not need to have their HTML stored, but would still be viewable if necessary. Old revisions of a post that are never going to be viewed as HTML again would simply expire from the cache.
In most cases, the HTML would be calculated at the point of saving the post (as currently), so the only difference is where the data is stored.
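The caching behaviour described above could look roughly like the following Python sketch. It makes several assumptions: an in-process cache with least-recently-used eviction, and a key derived from a hash of the Markdown. `RenderedHtmlCache` and `placeholder_render` are hypothetical names, and `placeholder_render` is only a stand-in for whatever Markdown renderer the server really uses. A real deployment would more likely use a shared cache store; this only shows the shape of the idea.

```python
import hashlib
import html
from collections import OrderedDict
from typing import Callable


class RenderedHtmlCache:
    """In-memory LRU cache from Markdown source to rendered HTML (illustrative)."""

    def __init__(self, render: Callable[[str], str], max_entries: int = 1000):
        self._render = render
        self._max_entries = max_entries
        self._entries: "OrderedDict[str, str]" = OrderedDict()
        self.render_calls = 0  # number of cache misses, kept for illustration

    @staticmethod
    def _key(markdown: str) -> str:
        # Keying on content means an edited revision gets a new entry.
        return hashlib.sha256(markdown.encode("utf-8")).hexdigest()

    def get_html(self, markdown: str) -> str:
        key = self._key(markdown)
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as recently used
            return self._entries[key]
        # Cache miss: render once, then remember the result.
        self.render_calls += 1
        rendered = self._render(markdown)
        self._entries[key] = rendered
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
        return rendered

    def clear(self) -> None:
        """Empty the cache; later requests re-render on first view."""
        self._entries.clear()


def placeholder_render(markdown: str) -> str:
    # Stand-in only: NOT a Markdown renderer, it just escapes and wraps the text.
    return "<p>" + html.escape(markdown) + "</p>"
```

Calling `get_html` when a post is saved warms the cache, so the first reader does not pay the rendering cost. After `clear()`, or after an old revision has been evicted, the HTML is rebuilt from the stored Markdown on the next view.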