Rule Submission

Website: arstechnica.com

[x] I have made the rule as simple as possible ( K.I.S.S )
[x] I have run the rule myself for a period of time (1 month+) to spot any bugs

The regex is a bit ugly, but it does the job. Here's what an image gallery looks like in HTML (cleaned up a bit, with placeholder text in square brackets):

<ul>
    <li data-thumb="[tiny image url]" data-src="[fullsize image url]" data-responsive="[list of image urls followed by sizes]" data-sub-html="#caption-[caption id]">
        <figure style="height:[something]px;">
            <div class="image" style="background-image:url('[midsize image url]'); background-color:#000"></div>
            <figcaption id="caption-[caption id]">
                <span class="icon caption-arrow icon-drop-indicator"></span>
                <div class="caption">[some caption]</div>
                <div class="credit"><span class="icon icon-camera"></span>[some person]</div>
            </figcaption>
        </figure>
    </li>
    [many more <li></li>'s]
</ul>

The regex aims to pull out [fullsize image url] and [some caption] and convert them into the following format:

<figure><img src="[fullsize image url]"/><figcaption>[some caption]</figcaption></figure>

The regex explained:

<li.*? data-src="(.*?)".*?>             # match '<li [other attrs] data-src="url" [other attrs]>' and store the URL
\s*<figure.*?>.*?(?:<figcaption         # match the <figure><figcaption> tags
.*?<div class="caption">(.*?)</div>     # match the caption div and store the text inside it
.*?</figcaption>)?\s*</figure>\s*</li>  # match all the closing tags to reduce false positives
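
To try the pattern outside the plugin, here is a minimal Python sketch. It is not the plugin's own code. The names GALLERY_ITEM, REPLACEMENT, convert_gallery and SAMPLE, and the example.com URLs, are made up for illustration. It also assumes "." is allowed to match newlines (re.DOTALL here), which the indented sample markup above needs.

```python
import re

# The four annotated lines above, joined into one pattern.
# Assumption: "." must also match newlines, hence re.DOTALL.
GALLERY_ITEM = re.compile(
    r'<li.*? data-src="(.*?)".*?>'            # group 1: full-size image URL
    r'\s*<figure.*?>.*?(?:<figcaption'        # optional caption block starts here
    r'.*?<div class="caption">(.*?)</div>'    # group 2: caption text
    r'.*?</figcaption>)?\s*</figure>\s*</li>',
    re.DOTALL,
)

# Target format from the description; \1 is the URL, \2 the caption.
REPLACEMENT = r'<figure><img src="\1"/><figcaption>\2</figcaption></figure>'


def convert_gallery(html: str) -> str:
    """Rewrite every gallery <li> as a plain <figure> with an <img>."""
    return GALLERY_ITEM.sub(REPLACEMENT, html)


# Hypothetical sample: one item with a caption, one without.
SAMPLE = """<ul>
    <li data-thumb="https://example.com/t1.jpg" data-src="https://example.com/full1.jpg" data-sub-html="#caption-1">
        <figure style="height:100px;">
            <div class="image" style="background-image:url('https://example.com/mid1.jpg'); background-color:#000"></div>
            <figcaption id="caption-1">
                <span class="icon caption-arrow icon-drop-indicator"></span>
                <div class="caption">First caption</div>
                <div class="credit"><span class="icon icon-camera"></span>Someone</div>
            </figcaption>
        </figure>
    </li>
    <li data-thumb="https://example.com/t2.jpg" data-src="https://example.com/full2.jpg">
        <figure style="height:100px;">
            <div class="image"></div>
        </figure>
    </li>
</ul>"""

if __name__ == "__main__":
    print(convert_gallery(SAMPLE))
```

In Python 3.5 and later, a backreference to a group that did not take part in the match is replaced with an empty string, so the item without a caption ends up with an empty figcaption element.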

Notes:

The ? after a quantifier (as in .*?) makes it non-greedy, so the regex matches as little text as possible at that point instead of running on to the last possible match.
The (?:<figcaption>[other stuff]</figcaption>)? part is an optional, non-capturing group. Being optional means that if there is no caption the regex at least still matches the image url. Being non-capturing just means that the group itself doesn't get a number in the replace phase; the caption captured inside it is still available as the second group.
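
Both notes can be checked with a couple of throwaway patterns. This is only a sketch in Python; the markup strings and the simplified item_pattern are invented for the demonstration and are not part of the rule.

```python
import re

# Two hypothetical <li> tags on one line.
markup = '<li data-src="a.jpg" data-x="1"><li data-src="b.jpg" data-x="2">'

# Greedy: (.*) runs on to the last double quote in the string.
greedy = re.search(r'data-src="(.*)"', markup).group(1)

# Non-greedy: (.*?) stops at the first closing quote.
lazy = re.search(r'data-src="(.*?)"', markup).group(1)

# Optional, non-capturing wrapper around a capturing group.
# (?:...) gets no group number, so the caption is still group 2.
item_pattern = re.compile(r'<img src="(.*?)">(?:<p>(.*?)</p>)?')

with_caption = item_pattern.search('<img src="x.jpg"><p>hello</p>').groups()
without_caption = item_pattern.search('<img src="y.jpg">').groups()

if __name__ == "__main__":
    print(greedy)
    print(lazy)
    print(item_pattern.groups, with_caption, without_caption)
```

Without a caption the second group is simply unset (None in Python), while the URL in the first group is still captured.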

feediron / ttrss_plugin-feediron

Update Arstechnica recipe to include image galleries #135
