sebastiangreger / kirby3-commentions

A versatile comment system and integrated Webmention endpoint for Kirby 3.
MIT License
58 stars 7 forks source link

Make authorship algorithm more robust #25

Closed fabianmichael closed 12 months ago

fabianmichael commented 4 years ago

If you try to submit a webmention via the form and the entered URL does not contain an author, the cron job will fail with the following error:

Error: Undefined index: author

As the script aborts, the queue lock file is not cleared and the cronjob cannot be run again until this file has been deleted manually.

This is probably easily fixed by adding a simple array_key_exists() or similar, but I don’t know how this case should be handled. Should the mention be inserted with an Anonymous as author or be rejected?

sebastiangreger commented 4 years ago

@fabianmichael Happy to look into this and come up with a good fix; authorship discovery is rather complex, due to the variety in implementations of the microformats.

Would you be able to share the (markup of the) test file you used as the webmention source in your failed test? I tried with my test file and if I remove the h-card, the queue still processes it fine - the webmention is stored without an author (that should be the goal, I believe, to make it a robust mechanism).

As the script aborts, the queue lock file is not cleared and the cronjob cannot be run again until this file has been deleted manually.

The .commentions_queuelock file should only block the cron for 120 seconds, after which it is considered stale, ignored and deleted:

https://github.com/sebastiangreger/kirby3-commentions/blob/af748cb28c675cc34d3b1249de931f479eb1e0e4/classes/Cron.php#L58

It worked fine in my tests, but if there is indeed a problem with that, I'm happy to look into it? Maybe you just weren't aware of this detail, though?

fabianmichael commented 4 years ago

@sebastiangreger Okay, I was not aware that the lock file is cleared automatically after 120 seconds. Than it is nothing to worry about, because a single error cannot break the whole plugin, as I anticiated first.

The HTML payload of my test was is just:

<a href="http://commentions.test/rezepte/test">Eine coole Sache</a>

As soon as I added an h-card, using the example from the microformats wiki, it worked:

<p class="h-card">
  <img class="u-photo" src="http://example.org/photo.png" alt="" />
  <a class="p-name u-url" href="http://example.org">Joe Bloggs</a>
  <a class="u-email" href="mailto:joebloggs@example.com">joebloggs@example.com</a>, 
  <span class="p-street-address">17 Austerstræti</span>
  <span class="p-locality">Reykjavík</span>
  <span class="p-country-name">Iceland</span>
</p>

While having a page, that just contains a single link might not be a realistic real-world scenario, I can imagine a lot of situations, where author data is malformed or missing.

sebastiangreger commented 4 years ago

@fabianmichael hmm, a plain HTML file with nothing else (literally, not even <html><head> etc.) than

<a href="https://mytestdomain.tld/notes/exploring-the-universe">Eine coole Sache</a>

worked for me, testing between two live servers (not local). This could be related to you (apparently) testing on localhost? Or then I simply cannot replicate the error. I implemented your suggested array_key_exists check in a few places, just to be on the safe side (see my PR).

I tried the various markup samples from https://authorship.rocks/ and realized that 3/5 currently are not picked up by the plugin - so I'll look into that next, keeping this issue open as a reminder.

sebastiangreger commented 4 years ago

Ok, I added one more clause to deal with authorship indicated without h-card. This now passes tests 1-3 of https://authorship.rocks/ - the remaining two are about authorship info under a separate URL; that is a bigger task that I'd rather put on the roadmap (and would open a new "feature" issue for).