tumblr / docs

Tumblr's public platform documentation.
Apache License 2.0

Inconsistent handling/processing of special characters #121

Closed by polarstoat 6 months ago

polarstoat commented 9 months ago

When using the client.editLegacyPost method, editing a post whose body contains unescaped special characters (like é, ä, ½, ʻ, …, etc.) causes 400 Bad Request errors.

Error: API error: 400 Bad Request
    at IncomingMessage.<anonymous> (~/repository/node_modules/tumblr.js/lib/tumblr.js:339:13)
    at IncomingMessage.emit (node:events:529:35)
    at endReadableNT (node:internal/streams/readable:1368:12)
    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)

Node.js v18.18.2

If these non-ASCII characters are instead encoded as HTML4 references (&eacute;, &auml;, &frac12;, &#699;, &hellip;, etc.), the API accepts them. XML references also work as an alternative to HTML4, but for some reason HTML5 named references do not then render correctly on the Tumblr blog.
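A minimal sketch of the encoding workaround described above, assuming decimal numeric references (the `&#699;` form) are acceptable for every non-ASCII character. The helper name `encodeNonAscii` is hypothetical and is not part of tumblr.js.

```javascript
// Hypothetical helper (not part of tumblr.js): replace every non-ASCII
// code point with a decimal numeric character reference, e.g. "ʻ" -> "&#699;".
function encodeNonAscii(text) {
  let out = '';
  // for...of iterates by code point, so characters outside the BMP
  // are encoded as one reference rather than two surrogate halves.
  for (const ch of text) {
    const codePoint = ch.codePointAt(0);
    out += codePoint > 0x7f ? `&#${codePoint};` : ch;
  }
  return out;
}

console.log(encodeNonAscii('é, ä, ½, ʻ, …'));
```

Numeric references are used here instead of named ones because they need no lookup table and sidestep the HTML4 versus HTML5 naming differences mentioned above.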

However, when the post is then retrieved with the client.blogPosts method, passing the post ID and the option { filter: 'raw' }, the post body comes back with these special characters unescaped. Because the returned body no longer matches what was submitted, I cannot reliably detect changes or decide whether a Tumblr post needs updating to keep it synchronised with another data source.
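One way to work around the mismatch when detecting changes is to normalise both bodies before comparing them. The sketch below assumes only numeric references are in play (named references such as `&eacute;` would need a lookup table, which is omitted). Both `decodeNumericRefs` and `needsUpdate` are hypothetical names, not tumblr.js functions.

```javascript
// Hypothetical helper: turn decimal (&#233;) and hexadecimal (&#xE9;)
// numeric references back into the characters they stand for.
function decodeNumericRefs(text) {
  return text.replace(/&#([xX][0-9a-fA-F]+|\d+);/g, (match, ref) => {
    const isHex = ref[0] === 'x' || ref[0] === 'X';
    const codePoint = isHex ? parseInt(ref.slice(1), 16) : parseInt(ref, 10);
    // Leave references beyond the Unicode range untouched instead of throwing.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}

// Hypothetical helper: compare the body that was submitted (escaped) with
// the body returned by the API (unescaped) after normalising both sides.
function needsUpdate(localBody, remoteBody) {
  return decodeNumericRefs(localBody) !== decodeNumericRefs(remoteBody);
}

console.log(needsUpdate('caf&#233;', 'café')); // bodies differ only in escaping
```

Normalising both sides, rather than re-encoding the remote body, keeps the comparison independent of which reference style was used when the post was submitted.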

As per the docs, I'd expect using the raw filter to return the data "As entered by the user (no post-processing)".


This issue was originally opened in tumblr/tumblr.js#242, but I was asked to open it here instead.

polarstoat commented 6 months ago

It turns out this was an issue with the tumblr.js JavaScript client rather than with the API itself. It is fixed by tumblr/tumblr.js#299.