copyhackers / airstory-wp

Send your blog posts from Airstory writing software to WordPress for publication.
https://wordpress.org/plugins/airstory
MIT License

DOMDocument::loadHTML() error causing blank posts #48

Closed: stevegrunwell closed this issue 7 years ago

stevegrunwell commented 7 years ago

From the WordPress.org repo:

The Airstory plugin can create a new blog post entry on my website and will fill in the Title area (H1).

However, this is where it stops. It will not import the text and images from my Airstory tab.

As far as compatibility, everything looks fine:

[Screenshot of the Tools > Airstory page, showing all green]

There are the following errors in the error log though:

[Tue Jun 27 13:30:35.804470 2017] [lsapi:notice] [pid 10341:tid 139660225398528] [client 74.58.238.6:34740] [host http://www.buzzandtips.com] Backend log: PHP Warning: DOMDocument::loadHTML() expects parameter 2 to be integer, string given in /home/buzzandt/public_html/wp-content/plugins/airstory/includes/formatting.php on line 41\n, referer: https://app.airstory.co/projects/p3464e3cf-cc19-486a-9321-841e9337dd3d
[Tue Jun 27 13:31:46.928377 2017] [lsapi:notice] [pid 10341:tid 139651006424832] [client 74.58.238.6:42053] [host http://www.buzzandtips.com] Backend log: PHP Warning: DOMDocument::loadHTML() expects parameter 2 to be integer, string given in /home/buzzandt/public_html/wp-content/plugins/airstory/includes/formatting.php on line 175\n, referer: https://app.airstory.co/projects/p3464e3cf-cc19-486a-9321-841e9337dd3d

Looking at the issue, it seems like it could be one of two things:

  1. Malformed content is causing DOMDocument::loadHTML() to choke, causing it to misinterpret the arguments. This would be addressed by the code merged in #45.
  2. The LIBXML_HTML_NODEFDTD and/or LIBXML_HTML_NOIMPLIED constants not being defined.

The fact that only LIBXML_HTML_NODEFDTD appears on both of the lines referenced in the error log leads me to think that an outdated version of libxml may be the culprit, since PHP has this nasty habit of "oh, that constant's undefined so let's interpret it as a string literal" (which is consistent with the error messages).
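For illustration, here is a minimal sketch of one way to avoid passing a string as the options argument: build the bitmask only from constants that actually exist. The helper name `example_libxml_options()` is hypothetical and not part of the plugin, and this is not necessarily what the code merged in #45 does.

```php
<?php
/**
 * Build a libxml options bitmask from only those constants that exist.
 * Hypothetical helper for illustration; not part of the Airstory plugin.
 *
 * @param string[] $names Constant names, e.g. 'LIBXML_HTML_NODEFDTD'.
 * @return int Bitmask that is safe to pass as loadHTML()'s second argument.
 */
function example_libxml_options( array $names ) {
	$options = 0;

	foreach ( $names as $name ) {
		// defined() receives the name as a string, so a missing constant is
		// never evaluated as a bare word.
		if ( defined( $name ) ) {
			$options |= constant( $name );
		}
	}

	return $options;
}

$options = example_libxml_options( array( 'LIBXML_HTML_NODEFDTD', 'LIBXML_HTML_NOIMPLIED' ) );

// $options is always an integer, even when neither constant is available.
$doc = new DOMDocument();
$doc->loadHTML( '<p>Hello from Airstory</p>', $options );
```

Note that this only silences the warning. Without those flags, libxml will most likely add a doctype and wrapping html/body elements, so the surrounding code would still need to account for the extra markup.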

According to the PHP documentation, the LIBXML_HTML_NODEFDTD constant is only defined as of libxml 2.7.8, which was released in November of 2010. While I'd hope that the server isn't that far behind, it is worth adding a libxml version check to the compatibility checks within Airstory.
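A rough sketch of what such a check could look like, assuming we compare against PHP's LIBXML_DOTTED_VERSION constant. The function name `example_libxml_is_compatible()` is hypothetical, and this is not necessarily how the plugin's compatibility check ends up being implemented.

```php
<?php
/**
 * Whether a libxml version is new enough to define LIBXML_HTML_NODEFDTD.
 * Hypothetical helper; the 2.7.8 minimum comes from the PHP documentation.
 *
 * @param string $version Dotted libxml version, e.g. '2.7.6'.
 * @param string $minimum Minimum supported version.
 * @return bool True if $version is at least $minimum.
 */
function example_libxml_is_compatible( $version, $minimum = '2.7.8' ) {
	return version_compare( $version, $minimum, '>=' );
}

// LIBXML_DOTTED_VERSION is provided by PHP's libxml extension; fall back to
// '0' (always incompatible) if the extension is not loaded.
$installed  = defined( 'LIBXML_DOTTED_VERSION' ) ? LIBXML_DOTTED_VERSION : '0';
$compatible = example_libxml_is_compatible( $installed );
```

Checking the version rather than just `defined( 'LIBXML_HTML_NODEFDTD' )` also gives the Tools > Airstory page something concrete to show the user ("libxml 2.7.6 found, 2.7.8 or newer required").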

stevegrunwell commented 7 years ago

In the WordPress.org thread, the user confirmed that his site is running libxml 2.7.6, so it appears that the second condition is the case. I've advised he reach out to his host and inquire about an update, considering libxml 2.7.6 was released in October of 2009.

When I mentioned an older version of libxml being out in the wild, @jasondewitt pointed out that the server was likely running CentOS 6, which ships with an outdated version (specifically, 2.7.6) of libxml. There are ways to update it, but it will be dependent upon what the host is willing to do.

stevegrunwell commented 7 years ago

Heard back from Steve in the support thread, and it appears that a) CentOS 6 was indeed the culprit and b) his host was unable/unwilling to upgrade to CentOS 7, switch to a different distribution, and/or upgrade libxml. As a result, Steve will stick with exporting HTML from Airstory and pasting it into WordPress.

While this answers the question of why it wasn't working for him (outdated version of libxml), it raises the question of whether the plugin should work even on an outdated version of libxml.

Some high-level numbers, powered by W3Techs:

I haven't been able to find solid numbers regarding the percentage of CentOS 6 vs 7 in the wild (at least, during a cursory Google search), but considering CentOS 7 was released in July of 2014, I'd expect that a non-negligible chunk of that 4% of WordPress sites is running an up-to-date version of libxml.

If even 50% of CentOS-backed WordPress sites are still running CentOS 6, that's a maximum of 2% of WordPress sites, only a fraction of which are, or are likely to become, Airstory users. In those rare cases, like Mr. Williams's, users are still able to export from Airstory to plain HTML, then paste that content into WordPress. It's not as cool as "click a button and automatically send content from Airstory to WordPress," but users are still able to write in Airstory and publish in WordPress.

The plugin won't be useful for this small subset of users, but between the new compatibility check introduced in #49 (along with noting the requirement in the README files) and the fact that users are still able to manually move content between platforms, I recommend that we focus our attention elsewhere.