Closed bph closed 1 year ago
Thanks @bph !
The Dev Note has been published: Introducing the HTML API in WordPress 6.2
Will work on this in the coming week: goal is to have something together, a reasonable draft, by next Friday, April 21
After discussing this with @justintadlock, I am going to pick up this article.
Just noting that @dmsnell seems to be doing this for now.
I'd like to note one thing (if this isn't the right place for that, please excuse me): I always find the examples for the tag processor "confusing", because they use $p as a variable for the processor/HTML. Because of the variable's name, I always expect it to be modifying a paragraph <p> tag. It's definitely a minor thing, but I think using clear variable names would make the examples easier to follow.
So in my usage, I often use $html for the tag processor instance, and then when modifying specific tags I use the tag as a variable name, e.g. $a = $html->next_tag('a'); or $div = $html->next_tag('div');
Thanks @gaambo - we deliberated a lot when designing the examples and started with the more verbose $tag_processor name, but it got a bit heavy-feeling. We shortened it to $p for $processor.
I encourage folks to avoid storing the tag processor itself or the result of next_tag() in names relating to HTML tag names, because that's not what they give you. next_tag() only tells you if it found a given tag, so $div doesn't refer to any specific DIV element. Maybe $found_a_div = $html->next_tag('div') is more descriptive there.
Do you find $processor = new WP_HTML_Tag_Processor( $html ) clearer than $p? $p has a nice terse cadence to it, though I can see where it might be confusing if the assumption is that it's storing some kind of tag.
The same caution exists for calling it $html, because $html is both a string and the name of the input HTML argument to the processor. It's given HTML and creates a Processor.
Naming is hard 😄
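To make the naming point above concrete, here is a minimal sketch, assuming WordPress 6.2 or later is loaded so that WP_HTML_Tag_Processor is available. The input markup and the variable names ($input_html, $processor, $found_an_img) are illustrative and are not taken from the draft post.

```php
<?php
// Assumption: runs inside WordPress 6.2+, where WP_HTML_Tag_Processor exists.
$input_html = '<div><img src="photo.jpg"></div>';

// The processor is a cursor over the HTML, not a tag and not a string.
$processor = new WP_HTML_Tag_Processor( $input_html );

// next_tag() returns a boolean: true if it moved to a matching tag, false otherwise.
$found_an_img = $processor->next_tag( 'img' );

if ( $found_an_img ) {
	// Attribute changes apply to whichever tag the processor currently points at.
	$processor->set_attribute( 'loading', 'lazy' );
}

// The modified document comes back as a string.
$updated_html = $processor->get_updated_html();
```

Naming the boolean $found_an_img keeps it clear that all changes still go through $processor, which is the distinction drawn in the comment above.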
Thanks for the explanation - that makes sense, and in this case my examples were actually bad 😅
I, personally, think $processor is more descriptive and better. As I said, it's a minor thing, but I wanted to mention it because it really confuses me in examples 😅
Hello all, I've reached a first draft of a brand new version of the post.
Happy to have any review you might want to share. As a first draft, I could end up throwing everything out. That being said, the most valuable kinds of feedback will revolve around how the information is organized, things you think might be missing, things that are unclear, and things that you think should be communicated differently.
Thanks for all your collaboration on this!
@zzap @marybaum Would you have time to review @dmsnell's draft in the Google Doc?
Happy to! Can’t for about four hours though — D has a procedure this morning
Hi @dmsnell, thank you so much for writing about this. I'm both personally and professionally very interested in this topic and can't wait for the article to be published.
Overall, I love how it educates us on both the Tag Processor and regular expressions. I like the flow and structure.
I have a few suggestions:
- Is "HTML without Regular Expressions" the new title? If so, I would add "Tag Processor: HTML without Regular Expressions" or "HTML API:..." or something like that. My fear is that people will misjudge and skip it based on that title. For example, the first thought of HTML is not trying to match it with regular expressions. It's doing frontend. Missing the API name will leave people unaware of the article's topic.
- On "Let's consider a few goals that can easily be achieved with Tag Processor": in the previous paragraph, you talked about the reason why the API was introduced in WordPress, and now when I see "Goal", my first reaction is that this was the goal for getting the API in the core.
- This sentence is a bit long: "Taking a step back there’s another problem in view: even if in this one spot the regular expression pattern were complicated enough to understand all of the HTML syntax and all of the semantic processing rules, it would still be one of thousands of places where the code is unique and behaves differently than other HTML processing code and needs its own separate maintenance and bug tasks."
- $p being confusing.
- The render_block example is done with a closure, which is officially not recommended by wp.org because closures are impossible to remove. Is there any specific reason you used it that way, and if not, can you write it with a callback?
- "This is durable code because it’s expressing what it seeks to accomplish."
Tiny thingies:
- "use" twice. Is there a better way to say it?
- greedyand
- Plural should be singular in "even if in this one spot the regular expression pattern were complicated enough to understand"
As for use cases, I had the pleasure of seeing Tag Processor being used to remove nofollow attributes from anchors with specified domain names. It's pure joy, and we updated WordPress to 6.2 only to have that available.
Thank you again for writing this article :heart:
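The nofollow use case mentioned above could look roughly like the following. This is a minimal sketch, assuming WordPress 6.2 or later is loaded. The function name example_remove_nofollow() and the $trusted_hosts parameter are hypothetical, and the code is not the implementation the commenter saw.

```php
<?php
/**
 * Hypothetical helper: drop the "nofollow" token from the rel attribute
 * of links whose host is in a trusted list (given in lowercase).
 * Assumption: WordPress 6.2+ is loaded.
 */
function example_remove_nofollow( string $html, array $trusted_hosts ): string {
	$processor = new WP_HTML_Tag_Processor( $html );

	// Visit every A tag in the document, in order.
	while ( $processor->next_tag( 'a' ) ) {
		$href = $processor->get_attribute( 'href' );
		$rel  = $processor->get_attribute( 'rel' );

		// get_attribute() returns null for a missing attribute and true for a
		// valueless one, so only continue when both are real strings.
		if ( ! is_string( $href ) || ! is_string( $rel ) ) {
			continue;
		}

		$host = wp_parse_url( $href, PHP_URL_HOST );
		if ( ! is_string( $host ) || ! in_array( strtolower( $host ), $trusted_hosts, true ) ) {
			continue;
		}

		// rel is a whitespace-separated token list; keep everything except nofollow.
		$tokens = preg_split( '/\s+/', $rel, -1, PREG_SPLIT_NO_EMPTY );
		$kept   = array_filter(
			$tokens,
			function ( $token ) {
				return 'nofollow' !== strtolower( $token );
			}
		);

		if ( empty( $kept ) ) {
			$processor->remove_attribute( 'rel' );
		} else {
			$processor->set_attribute( 'rel', implode( ' ', $kept ) );
		}
	}

	return $processor->get_updated_html();
}
```

The regular expression here only splits an attribute value that the processor has already extracted. Finding the tags and attributes is left entirely to the Tag Processor.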
I had the pleasure of seeing Tag Processor being used to remove nofollow attributes from anchors with specified domain names. It's pure joy, and we updated WordPress to 6.2 only to have that available.
aw. thanks for melting my heart @zzap
Is HTML without Regular Expressions the new title?
Yes, and in fact I wanted to make this more about parsing HTML and Regular Expressions than about the Tag Processor. That is, I've been walking the line carefully, trying to avoid communicating that people ought to find a way to use the Tag Processor - I don't want to convey that through this post. I'd rather address the problem where it's at, "trying to do things in HTML and reaching for regular expressions or DOMDocument", and then show how this API can do it better. In other words, let the problem lead the solution instead of broadcasting this as "you should be using the HTML API."
Any thoughts on how to mix your feedback with this?
Let's consider a few goals that can easily be achieved with Tag Processor ... when I see Goal, my first reaction is that this was the goal for getting API in the core.
I'd be interested in hearing a re-wording of this. On one hand, it sounds like you appropriately understood the reason this API was introduced - mission accomplished 😆
This sentence is a bit long
Thanks, I'll work on that one.
Is there any specific reason you used it that way
Nope, just trying to avoid distracting from the topic at hand. Happy to change that.
Plural should be singular in "even if in this one spot the regular expression pattern were complicated enough to understand"
I need your help identifying the plural here. If it's the highlighted "were", that's not plural, it's the appropriate English subjunctive case, in contrast to the often misused "even if it was the case".
I'll take a deeper look at all this feedback after I get back from some vacation. Thanks for reviewing the doc!
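For the closure question raised above, a named callback version of a render_block filter might look like this. It is a minimal sketch, assuming WordPress 6.2 or later. The function name, the core/image check, and the example-image class are all hypothetical and are not the example from the draft post.

```php
<?php
/**
 * Hypothetical render_block callback: add a class to the first IMG
 * tag of each core/image block. Assumption: WordPress 6.2+ is loaded.
 */
function example_add_class_to_image_block( $block_content, $block ) {
	// Leave every other block type untouched.
	if ( ! isset( $block['blockName'] ) || 'core/image' !== $block['blockName'] ) {
		return $block_content;
	}

	$processor = new WP_HTML_Tag_Processor( $block_content );

	// next_tag() returns false when the block contains no IMG tag.
	if ( $processor->next_tag( 'img' ) ) {
		$processor->add_class( 'example-image' );
	}

	return $processor->get_updated_html();
}

// Registering by name lets other code unhook the filter later with:
// remove_filter( 'render_block', 'example_add_class_to_image_block', 10 );
add_filter( 'render_block', 'example_add_class_to_image_block', 10, 2 );
```

Because the callback has a name, a plugin or theme that disagrees with the change can call remove_filter() with that name, which is the concern behind the review comment.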
I need your help identifying the plural here. If it's the highlighted were that's not plural, it's the appropriate English subjunctive case, in contrast to the often misused even if it was the case.
Ahh, it was my bad English. Thank you for explaining it :heart:
Any thoughts on how to mix your feedback with this?
OK, I hear your point. In that case, I'd add the word "parsing" somewhere. E.g. "Parsing HTML? Forget the Regular Expression"
I'd be interested in hearing a re-wording of this. On one hand, it sounds like you appropriately understood the reason this API was introduced - mission accomplished 😆
Well, I would just add some small intro for examples there, as a bridge between the article intro and actual examples. Because the first goal wasn't the only goal for getting API into the core. It's nothing major, just feels a bit disconnected, like two independent pieces of content were put one after the other. I don't know if I'm explaining this properly.
Thanks again @zzap - made another round of edits and updated the document. In places where you mentioned it was hard to read I changed the sentences around to avoid the problems.
the first goal wasn't the only goal for getting API into the core
Maybe it's good to talk about this more here together. The two examples in the document are basically the leading motivations for the introduction of the HTML API. Even though many more applications are possible now, its inception is quite modest.
It's completely possible I'm being quite ignorant here though and am overlooking something obvious.
In any case, at this point in time, my primary hope is that people at large only start adopting the HTML API for very limited cases like the examples. I want to see continued development and use of the HTML API, but in a way that fosters communication back to its development and informs its needs.
@dmsnell - Can you change the share settings for the doc so that "Anyone with the link" can comment?
@justintadlock done.
Thanks. I did a full read through it and really enjoyed what you came up with. I left a few minor in-doc comments. Other than those, it looks good to me.
Thanks @justintadlock - done.
@dmsnell it reads really well. I did another sweep of commas in and out, and other minor suggestions.
In your email space for your .org account you should find an invitation to the developer news blog with a request to accept. Once you feel you are ready, move your text there, and share the Public preview link in a comment below.
Here are two checklists:
Some of it is self-explanatory, some is not. I would be happy to walk you through it. It's your first post and I can't be more excited for you to be a part of the Developer blog crew :-)
A general guide:
most commas belong in pairs, unless they’re setting off a clause.
Also, here are two proper ways to identify a person or a group:
The FSE outreach director, Anne McCarthy, said the group…
FSE Outreach Director Anne McCarthy said the group…
If you use a comma, you need the phrase to start with an article (a, an, or the).
If you don’t use an article, then lose the comma.
Thanks @marybaum for all your comments. I'd love for you to get credit for all that writing. Would you like to create a draft post so you can be the author? I'll copy the contents from the Google Doc into that if you do.
@marybaum I've adopted all your suggestions in the document but left a few questions where it seems like we've intentionally changed what's being said in a way that's factually inaccurate.
would love to see you get the credit for this writing. @bph apart from copying the contents from the Google Doc into a draft post do you need me for anything else on this? I'm extremely excited to put this behind me.
It would be great if you could include the pre-publish checklist when you copy/paste, as much as you can. But that would be it. I will push it over the finish line 🥰
I’m so flattered by all this! I only got about halfway through, and I’m afk today and tomorrow. We’re in Colorado for some mountain tennis and pickleball, but will have some downtime while Dick drives us to some mountain destinations the next few days after that.
@dmsnell Let me know how I can assist getting this over the finish line, maybe even this week?
@bph let's get that post created with @marybaum as the author. I'm happy to do all the work copying the content into it and reformatting. I'd do this but I don't think I can change the author if I create the post.
Ah! Tonight’s activities!
We’re headed to another national park today, then I’ll do the undone half of that at the hotel after.
Meanwhile the photos await…
On Jul 18, 2023, at 9:48 AM, Dennis Snell wrote:
@bph let's get that post created with @marybaum as the author. I'm happy to do all the work copying the content into it and reformatting. I'd do this but I don't think I can change the author if I create the post.
Finished editing the second half. I'll pull it into the p2 tomorrow.
I lied. I'll do that Thursday or Friday.
Or two months later. Oops.
Welp—it is in the p2. https://developer.wordpress.org/news/?p=2023&preview=1&_ppp=756268466a
Thank you so much @marybaum 🙌
@dmsnell could you go over this public preview one last time? It's going to be published on Wednesday morning.
I went over the pre-publish list:
- [ ] added a "Resources to learn more" section.
Both of you @marybaum & @dmsnell
What do you think about the excerpt: "All by itself, the HTML Tag processor is better than regular expressions. It's convenient, reliable, fast—and You. Can. Read. It. This article shows you in two examples how to get started using the HTML Tag processor."
Do you think we need a Table of Contents?
I think that excerpt is great! I have also added a featured image.
@bph looks like the post has a number of formatting typos in it still; if it's needed I can go fix those formatting issues. I think there might be some conflicts that arose when copying from the Google document; some things are a bit confusing and don't make sense to me when reading them.
Apart from that I think the author of the post is me, but that's not right. I'd prefer we give credit where it's due 😉.
The way I see it, I'm happy to continue to provide some technical review/feedback, but this has transformed into a post that I'm not really an author or editor of, and I'm happy with that. I think I've provided all the technical review already in the Google document that is informative, so beyond what I've already shared there may not be much new I have to offer - I may just be repeating myself if I try to offer another round of review.
Noted that this is lacking from the pre-publish list, but is valuable anyway:
I'll take a run through it for the backticks.
I can understand your discomfort with something this florid. You should hear my internal dialogue when I'm writing my own CSS and JSON files, since I am absolutely irrational about things nobody else notices, like typography.
We went through the pre-publish checklist before... When you are ready, @marybaum 🚢 it :-)
And here is your post-publish checklist:
Adding a new item to the pre-publish checklist:
Note that I switched the social image to the default template. The chosen template + featured image made the post title really hard to read.
Discussed in https://github.com/WordPress/developer-blog-content/discussions/75