scripting / Scripting-News

I'm starting to use GitHub for work on my blog. Why not? It's got good communication and collaboration tools. Why not hook it up to a blog?

Markdown in feeds in WordPress #268

Open scripting opened 10 months ago

scripting commented 10 months ago

Back in July 2022, I wrote about supporting Markdown in RSS feeds. This is something FeedLand does, as a feed reader and editor. I would love to see all content editors support it, as well as feed readers. One product in particular I'd like to see support Markdown is WordPress. The challenge is to figure out how to convert the HTML in a post to Markdown and then to get a source:markdown element into the corresponding item in the feed. Here's a checklist that explains how it works. If you have any interest in working on a plugin that does this, please post something in this thread. I will host a WordPress site that runs the plugin as a proof-of-concept.

cjsparno commented 10 months ago

Dave, this may be a bit rudimentary for your needs, but I have used the Jetpack plugin for WordPress in the past (it has a lot of other nifty features as well, making it a good addition to your WordPress environment overall).

https://wordpress.org/plugins/jetpack/

Here is an article on enabling Markdown thru Jetpack.

https://wpengine.com/resources/using-markdown-wordpress/#:~:text=Switch%20over%20to%20the%20Writing,to%20add%20a%20Markdown%20block.

Thanks for all you do for the community, keep digging! -Chris

colin-walker commented 10 months ago

Gutenberg has the Markdown block so now natively supports Markdown without the need for Jetpack. So, at the most basic of levels I assume (I've not used WordPress in two and a half years) you could define a custom RSS feed template (I made one to factor for titleless posts) and use the Markdown source from the Gutenberg block — providing it's available (it would be surprising if it wasn't.)

If you're looking to go deeper then using a PHP library to convert the post HTML to Markdown would be the easiest way before inserting it in a custom feed template.

sbw commented 10 months ago

Sorry I don't have time to work on a plugin. But I agree it'd be a useful thing to do.

There are two parts to the plugin:

  1. Convert existing posts to Markdown.
  2. Add the Markdown for each post to the RSS.

I don't have any insight into number 2. But here are a few thoughts about number 1:

The following is based on my rather sketchy understanding of WordPress and the Block Editor, sorry. I've worked with WordPress a lot, but I've never had time to acquire any real WordPress development skill. I hope this is still useful.

WordPress sites used to store posts/pages in HTML. Now they are stored as a series of Blocks created by the block editor. At one time I found WordPress could convert old HTML posts/pages to Blocks, but I don't remember whether it converts them all automatically, individually, or on command. (And I think I saw some poor conversion results.) Regardless, I'll assume that the plugin supports only WordPress sites that have been upgraded to the Block Editor and have had all of the old HTML posts converted to Blocks.

Out of the box, the plugin converts all of the blocks delivered with WordPress to Markdown.

The question is how to handle conversion of blocks added to the WordPress installation by other plugins, themes, and so on. I wouldn't want the plugin to just ask each block to generate its HTML and then convert the HTML to Markdown. There are lots of HTML-to-Markdown converters out there (random example from a DuckDuckGo search: Turndown). Years ago, when I found only one, it wasn't very good. Maybe they're better now.

But I wonder whether the internal rendering of Blocks can be "hooked" by a plugin to produce Markdown rather than HTML? I sorta doubt it: Sure, there might be helper functions like "paragraph" and "link" that can be hooked, but I'll bet many Blocks generate detailed HTML directly.

Ideally, the developer of the Block Editor had as one of their requirements: Blocks can render any markup, not just HTML, and here's the documentation on how to build any Block renderer you might need. Does anyone know whether they had that vision?

Sorry to post lots of ideas without any specific answers. Dave's question just got me thinking, as usual.
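To make the block-by-block idea above concrete, here is a minimal sketch of converting parsed blocks straight to Markdown instead of going through rendered HTML. It assumes block arrays shaped like the ones WordPress's `parse_blocks()` returns (`blockName`, `attrs`, `innerHTML`). The helper name `smd_block_to_markdown` is hypothetical, only two core blocks are handled, and inline markup such as links is simply dropped, so treat it as an illustration rather than a working converter.

```php
<?php
// Sketch only: the sample arrays mimic the shape returned by WordPress's parse_blocks().

/**
 * Convert one parsed block to Markdown; returns null for unsupported blocks.
 * (Hypothetical helper: handles two core blocks and drops inline markup.)
 */
function smd_block_to_markdown( array $block ): ?string {
	$text = trim( html_entity_decode( strip_tags( $block['innerHTML'] ?? '' ) ) );

	switch ( $block['blockName'] ?? null ) {
		case 'core/paragraph':
			return $text;
		case 'core/heading':
			// The heading block defaults to level 2 when no level is stored.
			$level = (int) ( $block['attrs']['level'] ?? 2 );
			return str_repeat( '#', $level ) . ' ' . $text;
		default:
			// Unknown block: the caller could fall back to an HTML-to-Markdown converter.
			return null;
	}
}

// Hand-written sample data standing in for parse_blocks( $post->post_content ).
$blocks = array(
	array( 'blockName' => 'core/heading', 'attrs' => array( 'level' => 3 ), 'innerHTML' => '<h3>Hello</h3>' ),
	array( 'blockName' => 'core/paragraph', 'attrs' => array(), 'innerHTML' => '<p>First post.</p>' ),
);

// Convert each block, skip the unsupported ones, and join with blank lines.
$markdown = implode( "\n\n", array_filter( array_map( 'smd_block_to_markdown', $blocks ), 'is_string' ) );
echo $markdown . PHP_EOL;
```

Blocks from third-party plugins would hit the `default` branch, which is where the HTML-to-Markdown fallback discussed in this thread would have to take over.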

colin-walker commented 10 months ago

If using a library to convert HTML to Markdown, I think the simplest solution would be to grab the HTML that would be put into a normal RSS feed, convert it, then reinsert it, possibly as part of a custom feed template. The trick is in finding a decent HTML to Markdown converter.

I've got a test WP install hanging around so might have a look if I get time.

scripting commented 10 months ago

colin, that's my thinking too. i hope you do look into it. ;-)

fmfernandes commented 10 months ago

> The trick is in finding a decent HTML to Markdown converter.

There's PHP League's html-to-markdown which might be worth looking at as it supports the full Markdown syntax.

I've put together a simple plugin that uses that library and adds a <source:markdown> element to each item in a WordPress RSS feed, usually located at /feed.

This is mostly the source code:

```php
// ...
true ) );

	echo '<source:markdown>' . PHP_EOL;
	echo $converter->convert( $content );
	echo PHP_EOL . '</source:markdown>' . PHP_EOL;
}
```

Here's a zip file to download and install on any WordPress site (requires PHP 7.4+).

What the plugin does is hook into rss2_item and print the corresponding <source:markdown> tag for that item. The content of the post (get_the_content()) is then passed to the HtmlConverter class which returns the Markdown syntax for the post content.
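For anyone who wants to try the approach just described, here is a minimal sketch of such a plugin. It is not the exact source of the zip file. It assumes WordPress is loaded and that `league/html-to-markdown` is installed through Composer. The `smd_` function names are hypothetical, the `strip_tags` converter option is one possible choice rather than the plugin's confirmed setting, and the XML escaping and the `rss2_ns` namespace declaration are additions that the description above does not mention.

```php
<?php
// Sketch only: assumes WordPress is loaded and league/html-to-markdown is autoloaded.

use League\HTMLToMarkdown\HtmlConverter;

/**
 * Wrap Markdown text in a <source:markdown> element, XML-escaping it first.
 * (Hypothetical helper; the escaping is an added precaution, not part of the described plugin.)
 */
function smd_markdown_element( string $markdown ): string {
	$escaped = htmlspecialchars( $markdown, ENT_XML1 | ENT_NOQUOTES, 'UTF-8' );
	return '<source:markdown>' . $escaped . '</source:markdown>' . PHP_EOL;
}

/** Declare the source namespace on the <rss> element (assumed to be needed). */
function smd_feed_namespace(): void {
	echo 'xmlns:source="http://source.scripting.com/"' . PHP_EOL;
}

/** Print the Markdown version of the current post inside its <item>. */
function smd_feed_item(): void {
	// 'strip_tags' is a real converter option, but its use here is an assumption.
	$converter = new HtmlConverter( array( 'strip_tags' => true ) );
	echo smd_markdown_element( $converter->convert( get_the_content() ) );
}

// Register the hooks only when running inside WordPress.
if ( function_exists( 'add_action' ) ) {
	add_action( 'rss2_ns', 'smd_feed_namespace' );
	add_action( 'rss2_item', 'smd_feed_item' );
}
```

Keeping the element-building step in its own function means it can be checked outside WordPress, for example by passing it a string containing `&` or `<` and confirming the result is still well-formed XML.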

I didn't test how it behaves when the post content uses a Jetpack Markdown block, though. But it might be a good starting point. Feel free to host it, Dave.

colin-walker commented 10 months ago

That worked on a base WP install. Personally I would have gone with get_the_content_feed() but it doesn't seem to make much difference in my limited testing.

I tried it with a Jetpack md block and things went a little weird (the markdown wasn't converted in the content:encoded element and the source:markdown element had a preceding backslash).

After I disabled and re-enabled Jetpack and did a couple more tests it worked correctly. 🤷‍♂️

fmfernandes commented 10 months ago

> I tried it with a Jetpack md block and things went a little weird (the markdown wasn't converted in the content:encoded element and the source:markdown element had a preceding backslash).

That's interesting! content:encoded is handled by WordPress core itself, so I wonder if it's a bug. I'll do some tests with the Markdown block, too.

colin-walker commented 10 months ago

> That's interesting! content:encoded is handled by WordPress core itself, so I wonder if it's a bug. I'll do some tests with the Markdown block, too.

As I say, it was fine after deactivating/activating Jetpack again so not sure what was going on.

scripting commented 10 months ago

BTW, this is how Markdown flows through FeedLand.

https://github.com/scripting/Scripting-News/issues/269

The next step is to hook it up with Fernando's feed and make sure it works there too.