At the moment, the social image generation will fail with multiple image URLs in the original post, for example:
The current regex (below) will match from the beginning of the href attribute to the end of the src attribute, resulting in the unreadable meta tag:
https://github.com/v17development/flarum-seo/blob/d814c00d7c66cb00f8391e793597aed5947de0f9/src/Listeners/PageListener.php#L477
The above line can be changed to `/(?<=src=")((http.*?\.)(jpe?g|png|[tg]iff?|svg))(?=")/` to fix this.

The changes I've made are: adding a positive lookbehind for `src="` so that only images are matched, not URLs in general; adding a positive lookahead for `"` to find the end of the src attribute; and making the `.*` non-greedy, so it won't span multiple image tags on one line.

Tests: https://regexr.com/5701f
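
For anyone who wants to reproduce this locally, here is a minimal sketch in Python rather than PHP (the proposed pattern works unchanged in Python's `re` once the `/` delimiters are dropped). The sample HTML and the `greedy` pattern are assumptions for illustration: `greedy` is only a stand-in inferred from the description above (no lookarounds, greedy `.*`), not a copy of the line in `PageListener.php`.

```python
import re

# Hypothetical post HTML: a link followed by an embedded image on one line.
html = (
    '<a href="https://example.com/page">link</a> '
    '<img src="https://example.com/image.png" alt="">'
)

# Assumed stand-in for the current behaviour: greedy .* and no lookarounds.
greedy = re.compile(r'((http.*\.)(jpe?g|png|[tg]iff?|svg))')

# Pattern proposed in this issue, without the PHP delimiters.
proposed = re.compile(r'(?<=src=")((http.*?\.)(jpe?g|png|[tg]iff?|svg))(?=")')

# The greedy pattern starts at the href URL and runs through to the image extension.
greedy_match = greedy.search(html).group(1)

# The proposed pattern only picks up the value of the src attribute.
proposed_matches = [m.group(1) for m in proposed.finditer(html)]

# Two image tags on one line are matched separately thanks to the non-greedy .*?
two_images = (
    '<img src="https://a.example/one.jpg"> '
    '<img src="https://b.example/two.png">'
)
two_matches = [m.group(1) for m in proposed.finditer(two_images)]

print(greedy_match)
print(proposed_matches)
print(two_matches)
```

With this input, the stand-in greedy pattern returns one string that begins at the `href` URL and ends at `image.png`, while the proposed pattern returns just the image URL.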
Do note that this will no longer match images that are only linked to in Markdown (`[A super cool image](https://example.com/image.png)`).

@jaspervriends