Closed MaluNoPeleke closed 3 years ago
I tried it with $tweet['text'] = utf8_decode($tweet['text']) but no luck
One additional note: The text itself renders correct but the tags don't work the same and links are cut off when there are umlauts.
Maybe it could be achieved with the addslashes() and stripslashes() function for the link generation part in the Twitter.php src file!?
@CohortHS Your comment isn't visible here, even if I have received it via email.
@MaluNoPeleke would you mind pointing to a test repository / setup so I could reproduce this and see if I can provide a PR?
Thanks a lot for your offer, really appreciated. As this would require to expose my API key and secret I rather would not like to share it. If you have this plugin already running in your environment then you could test with this snippet:
<?= snippet('twitter/tweet', ['id' => '1403092300981489673', 'media' => true])?>
Does this help you?
I'd have to check what it takes to get one of those .. Twitter Account thingies x) but I'll manage 😀
That would be great, thanks a lot in advance!
I "applied" for developer access, which makes me think why you need a developer account "just" to display some tweets elsewhere - but I'm willing to dig into this rabbit hole.
In the meantime, may I suggest Mastodon - there it's as easy as "Add application" > name, scope > "there you go, here's your access token etc"
Seems to me that you might be re-encoding the string as it seems to be UTF-8 already .. since I cannot review your actual code, it's hard to say what went wrong - see above screenshot taken from a fresh plainkit.
From what I can see, this problems roots in the linkify()
department, and is not an issue of text encoding, see this test:
Conclusion: As this problem clearly doesn't add up to 'missing UTF-8 support in full_text
', I suggest renaming the issue or closing it and opening a new one, so people don't get confused :fox_face:
.. to something like 'linkify
mangles hashtags with umlauts' or something ..
Thanks a lot for your investigation. I changed the title of the issue. Do you have an idea how to fix it? It seems as this has already been fixed ten years ago: https://github.com/mozilla/bleach/commit/709376b4730d668ddc3ee26299b1721a1ab86f80
?!
what's mozilla/bleach
doing here?
That's where @HashandSalt (supposedly) got the function from:
https://gist.github.com/jasny/2000705#gistcomment-3888015
I commented there, we'll see what the regex voodoo people come up with :grin: In the meantime, I'll investigate further ..
Looks like someone else had the same issue four years ago and didn't receive an answer: https://gist.github.com/jasny/2000705#gistcomment-2123490
This linkify stuff is some crazy regex. Beyond me. Thanks @S1SYPHOS for look into this
this regex '~(?<!\w)[@#](\w++)~'
doesn't detect the ä
character ...
this does: '/#([\p{Pc}\p{N}\p{L}\p{Mn}]+)/u'
test it, and behold the UNICODE
@MaluNoPeleke
https://github.com/HashandSalt/kirby3-twitter/blob/main/src/classes/Twitter.php#L96
$value = preg_replace_callback('~(?<!\w)[@#](\w++)~'
becomes $value = preg_replace_callback('/#([\p{Pc}\p{N}\p{L}\p{Mn}]+)/u''
, please tell me if that's better for you
this regex
'~(?<!\w)[@#](\w++)~'
doesn't detect theä
character ...this does:
'/#([\p{Pc}\p{N}\p{L}\p{Mn}]+)/u'
I'm no regex voodoo person, but I know a thing or two - and I know how to duck-duck-GO
Need to test with tweets containing more than one hashtag tho
edit: yep, works fine
Thanks a lot but it fails for me:
syntax error, unexpected '', function ($match) use (&$li' (T_CONSTANT_ENCAPSED_STRING), expecting ')'
Maybe I copy pasted something wrong!?
@MaluNoPeleke I guess, what about the version @HashandSalt just merged?
I guess it'd be wise to publish a patch release, @HashandSalt what do you think?
It works for me as desired with composer require hashandsalt/kirby3-twitter:dev-main
and I think it is fine to release it "officially".
Thanks a lot @S1SYPHOS :)
@MaluNoPeleke Es war mir ein Vergnügen, Herr Niedernolte!
Translation for @HashandSalt, you didn't learn german in school (why would anyone ..): "You're welcome, Mr. Niedernolte"
it seems as if the full_text isn't encoded in UTF8:
Could you please add that to the plugin?