feediron / ttrss_plugin-feediron

Evolution of ttrss_plugin-af_feedmod
https://discourse.tt-rss.org/t/plugin-update-feediron-v1-2-0/2018
MIT License
204 stars 34 forks source link

Error: preg_match(): Passing null to parameter #2 ($subject) of type string is deprecated #182

Closed eric-pierce closed 2 years ago

eric-pierce commented 2 years ago

Seeing these errors in console in TTRSS:

preg_match(): Passing null to parameter #2 ($subject) of type string is deprecated
1. plugins.local/feediron/init.php(287): preg_match(/charset=(\S+)/)
2. plugins.local/feediron/init.php(219): getArticleContent(https://hackaday.com/2022/08/13/air-filter-drm-hacker-opts-out-with-nfc-sticker/, [{"type":"xpath","xpath":"article","cleanup":["ul[@class='sharing']","ul[@class='share-post']"]})
3. plugins.local/feediron/init.php(102): getNewContent(https://hackaday.com/2022/08/13/air-filter-drm-hacker-opts-out-with-nfc-sticker/, [{"type":"xpath","xpath":"article","cleanup":["ul[@class='sharing']","ul[@class='share-post']"]})
4. classes/pluginhost.php(354): hook_article_filter([{"owner_uid":2,"guid":"2,https:\/\/hackaday.com\/?p=547880","guid_hashed":"{\"ver\":2,\"uid\":2,\"hash\":\"SHA1:145d5887e7e58896d8fa0908636e31dad544e7c1\"}","title":"Air Filter DRM? Hacker Opts Out With NFC Sticker","content":"<div><img width=\"800\" heigh...)
5. classes/rssutils.php(826): chain_hooks_callback(hook_article_filter, {Closure}, [{"owner_uid":2,"guid":"2,https:\/\/hackaday.com\/?p=547880","guid_hashed":"{\"ver\":2,\"uid\":2,\"hash\":\"SHA1:145d5887e7e58896d8fa0908636e31dad544e7c1\"}","title":"Air Filter DRM? Hacker Opts Out With NFC Sticker","content":"<div><img width=\"800\" heigh...)
6. update.php(238): update_rss_feed()
{
    "hackaday.com": {
        "type": "xpath",
        "xpath": "article",
        "cleanup": [
            "ul[@class='sharing']",
            "ul[@class='share-post']"
        ]
    },
    "explosm.net": {
        "type": "xpath",
        "xpath": "div[contains(@id,'comic-wrap')]"
    }
}
dugite-code commented 2 years ago

I had a quick glance, I thought that preg_match was running on the $html for some reason it's using the empty $content_type variable. I'll have to give it a deeper look and figure out if that was deliberate or a mistake.

Boris-de commented 2 years ago

I just ran into the same warning. To me it looks like it wanted to parse the $content_type not the $html, otherwise it would not make sense to even return the content-type from get_content() because that function is only use in this single place, right? Also it was doing the opposite before 70de0d5.

Maybe it should be something like

if ($content_type) {
  // Match charset from content_type header
  preg_match('/charset=(\S+)/', $content_type, $matches);
  if (isset($matches[1]) && !empty($matches[1])) {
    $this->charset = str_replace('"', "", html_entity_decode($matches[1]));
    Feediron_Logger::get()->log(Feediron_Logger::LOG_TEST, "Matched charset:", $this->charset);
  }
}
if (!$this->charset) {
  // no encoding from content-type -> attempt to detect encoding of html directly
  ...
dugite-code commented 2 years ago

@Boris-de Yes you are correct, it was supposed to try pull the charset from the $content_type header first then fall back to using the mb_list_encodings function. Thanks for pointing that out.

eric-pierce commented 2 years ago

Thanks @dugite-code - upgraded my local install and confirming that I see no errors being thrown so far with several hackaday (source that was triggering issues) articles being published. Thanks for the input @Boris-de!