Closed MichaIng closed 2 years ago
What is not 100% clean is that post_text
is within <t>...</t>
tags, the post_subject
is not.
$description
variable then looks like the subject: <t>the body</t>
.<t>the subject: <t>the body</t></t>
.<t>...</t>
is assured to be present to correctly inject the subject.Hi :wave:
Thanks for your contribution, I'll review the changes and give you feed back.
However, probably there is a cleaner way to add the subject inside the tags, so that XML syntax remains valid.
It could be done using DOM, so the title could be inserted inside the <t>
and </t>
tag.
You mean at a later stage, when applying the strings to the final meta tags? The title is always the threads subject, in case prefixed with the post number, but not the posts subject. I though the posts subject would suite better since it can be freely chosen, hence can be completely different than the threads title and would then be more related to the posts body it prefixes directly. But I'm open to adjust the PR in any way. Major aim is to silence Bing webmaster SEO issues in case of short posts 🙂.
As alternative I can also try split $description
in a generic way after any leading tag, or using the regex ^<[rt][ >]
you use in clean_desription()
. I'm just not sure whether it's worth it, given that the result is correct already.
You mean at a later stage, when applying the strings to the final meta tags?
No, in the same way you are using it right now, however instead of concatenating a string (subject + description), it would be added directly inside the description XML using DOM.
The title is always the threads subject, in case prefixed with the post number, but not the posts subject.
My bad, I meant to say subject instead of title. Because I agree that the title has some text we don't need.
Major aim is to silence Bing webmaster SEO issues in case of short posts
And I like the idea, so this is definitely something I want to add.
As alternative I can also try split $description in a generic way after any leading tag, or using the regex ^<[rt][ >] you use in clean_desription().
Sure, I suggested DOM because, in my opinion, that's the way to manipulate XML. I'm trying to use regex as less as possible because it can get messy really fast.
However you have creative freedom to decide which method is best :+1:
I'm just not sure whether it's worth it, given that the result is correct already.
I would prefer to do it "the right way" to avoid problems with nested <t>
tags with other extension.
I'm aware of at least 2 extension (not counting my own) that extends this one, so I try to avoid creating a conflict somehow.
Ah okay, sorry for the misunderstanding, since I'm no professional PHP developer, I don't know yet how to "use DOM" here 😅, but it sounds like the best solution indeed. If you have a hint, let me know, in the meantime I'll try to find some docs to figure this out myself 🙂.
I think I found it: https://www.php.net/manual/en/domcharacterdata.insertdata.php
Will try it out locally and in case update the PR tomorrow.
I don't know yet how to "use DOM" here :sweat_smile:, but it sounds like the best solution indeed. If you have a hint, let me know
Oh.
A very easy way to do it is to create a new DOMText
node using DOMDocument::createTextNode() and then add it to the root node of the XML (either <r>
or <t>
) using the DOMNode::insertBefore() method.
You could do something like (not tested):
$description = $row['post_text'];
// Load the XML document
$dom = new \DOMDocument;
$dom->loadXML($description);
// Prepend the post subject in the XML of the post body
$dom->documentElement->insertBefore(
$dom->createTextNode($row['post_subject'] . ' '),
$dom->documentElement->firstChild
);
// Save modified XML
$description = $dom->saveXML($dom->documentElement);
Works very well. I implemented it as you suggested.
Sorry for the dealy, I've been dealing with some issues.
But such short post texts are very rare, and probably selecting both values marginally decreases performance (increases load) for all other post requests?
In my test the difference is insignificant (~ 0.012 ms on each), so don't worry about that.
Besides, if my guesses are correct, I might get a optimization validation note for that second SQL query.
Besides, if my guesses are correct, I might get a optimization validation note for that second SQL query.
You mean by this, a not from CI/CD pipeline to optimise this? Ready to be merged from my end btw, I have this deployed on our phpBB instance since the last push and reviewed a few short posts/threads where worked as intended while not touching longer ones.
You mean by this, a not from CI/CD pipeline to optimise this?
No, the validation team reviews the code and make notes on it.
Some notes might get the extension to be rejected, so that's why I'll change that second query.
Ready to be merged from my end btw
Ok.
I'll merge it as is and make some changes to it.
Thanks! :+1:
Ah wow, I didn't know about this. Nice on the one hand, but otherwise it may be annoying 😄. I was wondering why so many apps are not provided via official extensions database, and now I start to think that strict reviews may be it. On the plus side is that that way we can better trust apps in the extensions database and the same way may less trust apps not there, since probably they are not well coded and therefore were rejected 😛.
The post body text can be very short, in which case SEO recommendations may not be met, recommending description meta tag lengths of at least 25 characters. This change prefixes the description with the post subject, in case the body text is less than 25 characters long.