Closed vaurdan closed 5 years ago
Thanks for the review! I'm going to start addressing these issues.
@mjangda I have addressed most of the issues on your feedback.
Regarding the SQL pagination, after some testing, I noticed that it is slower and more unreliable than fetching all the posts at once. Sometimes queries would fail silently, fooling the script to think that the replacement is done.
Also, given that I need to get the total number of posts (for progress report), instead of using SQL_CALC_FOUND_ROWS
and multiple queries for every 100 posts, I believe that one single query returning all the posts will be better. So far, on every occasion this script was used, it was fast and flawless, despite the limitless query. If we start seeing issues with this query, we can obviously iterate and add some pagination. What are your thoughts on that?
@vaurdan I think this makes sense. It's possible that we could run out of memory or hit a killed query, but at least this command should be able to restart where it left off when restarted.
Should we add a dry run, so that we can see if the primary SQL query will die? Maybe we can test a dry run on a giant db?
In order to avoid running duplicative migration steps many times (every time we refresh content), I'm thinking we might be able to use the shortcode when we're pre-launch and run this as cleanup either during the launch or even post-launch.
In that case, we'd have the protected embeds table local to the site. So we could pull embeds directly from that table instead of generating a CSV.
We would probably want an additional script to remove the protected embeds table after we've verified that the script finished successfully.
Thoughts?
Also: What happens if someone edits a post with an iframe/script/etc who doesn't have proper permissions? I would have thought any "unsafe" embeds would get removed.
We would probably want an additional script to remove the protected embeds table after we've verified that the script finished successfully.
Maybe we could add this as a flag to this CLI? --remove-table-if-empty
?
What happens if someone edits a post with an iframe/script/etc who doesn't have proper permissions? I would have thought any "unsafe" embeds would get removed.
I think that's a valid concern. One idea I've had splashing around in my head was to convert these to a Gutenberg embed block, and offer proper editing that way? I'm not sure of another good option that would save embeds for users without the proper capability.
Maybe we could add this as a flag to this CLI?
The main thing I like about having two commands is that we can run the first one, then do some kind of verification (diff the generated HTML maybe?), and then run the table cleanup.
I think that's a valid concern. One idea I've had splashing around in my head was to convert these to a Gutenberg embed block, and offer proper editing that way? I'm not sure of another good option that would save embeds for users without the proper capability.
I think before we spend too much more time on converting these, we need to come up with a way to ensure old posts aren't going to break if they're updated by someone who doesn't have the unfiltered_html
cap.
I think before we spend too much more time on converting these, we need to come up with a way to ensure old posts aren't going to break if they're updated by someone who doesn't have the
unfiltered_html
cap.
Could we wrap the reversed embeds in a HTML comment or something, and capture it via the pre_kses
filter? Just thinking out loud without testing yet.
Closing this per our call in favor of #23 and #24.
This command will parse and render every single
protected-iframe
shortcode present on the site.It requires a CSV file, that can be generated on the original site, with all the Protected Embeds.