pods-framework / pods

The Pods Framework is a Content Development Framework for WordPress - It lets you create and extend content types that can be used for any project. Add fields of various types we've built in, or add your own with custom inputs, you have total control.
https://pods.io/
GNU General Public License v2.0
Field data truncated when using < #6414

Open mircobabini opened 2 years ago

mircobabini commented 2 years ago

Description

On a plain text meta field, saving a value like "abcd<123" drops everything from the < onward, so the stored value is "abcd".

The < symbol is what triggers this.

Version

2.8.8.1

Testing Instructions

  1. Create a plain text meta field
  2. Enter the demo text ("abcd<123") and save

Screenshots / Screencast

No response

Possible Workaround

No response

Site Health Information

WP latest, only Pods.

Pods Package

No response

sc0ttkclark commented 2 years ago

From the PHP documentation on removing HTML tags at https://www.php.net/manual/en/function.strip-tags.php

Warning: Because strip_tags() does not actually validate the HTML, partial or broken tags can result in the removal of more text/data than expected.

This is the function we use for removing HTML if HTML is not allowed for the text field (there's an option for that).

Unfortunately this will remove malformed-looking HTML tags like <123 too.
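
A minimal sketch of the behaviour described above, using plain PHP only (no Pods or WordPress code involved). The sample strings are illustrative; the second one is an assumption added to show the contrast with a < that is followed by whitespace.

```php
<?php
// A "<" followed directly by a non-whitespace character is treated as the
// start of a tag. With no closing ">", strip_tags() drops the rest of the string.
$truncated = strip_tags( 'abcd<123' );

// A "<" followed by whitespace is not treated as a tag opener and is kept.
$kept = strip_tags( 'abcd < 123' );

var_dump( $truncated ); // string(4) "abcd"
var_dump( $kept );      // string(10) "abcd < 123"
```

So the data loss depends on what follows the <, which is why values like "abcd<123" are affected while "abcd < 123" is not.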

There's not a great workaround for this at the moment unless you enable HTML for the field and then esc_html( $value ) before outputting that value onto your page.
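
A rough sketch of that workaround at output time. It assumes the field has HTML enabled, so the value was stored untouched. The `$value` variable is a hypothetical stand-in for whatever the field returns, and the `esc_html()` fallback is only there so the snippet runs outside WordPress (it approximates the real function with `htmlspecialchars()`).

```php
<?php
// Stand-in so the sketch runs outside WordPress; inside WordPress,
// esc_html() already exists and this definition is skipped.
if ( ! function_exists( 'esc_html' ) ) {
    function esc_html( $text ) {
        return htmlspecialchars( (string) $text, ENT_QUOTES, 'UTF-8' );
    }
}

// Hypothetical stored value from a text field with HTML allowed,
// so nothing was stripped on save.
$value = 'abcd<123';

// Escape at output time so the "<" is rendered as text, not parsed as markup.
$safe = esc_html( $value );

echo $safe; // abcd&lt;123
```

The trade-off is that every place the value is output has to remember to escape it, since the stored data may now contain real markup.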

I'll continue to look into this but if anyone has any other ideas on a solution here, I'm open to them.

cc @JoryHogeveen @pdclark

mircobabini commented 2 years ago

Thanks for the workaround (enabling HTML for that field).

From my perspective, strip_tags should not behave the way it does; it's buggy. That said, it is intended to be used with HTML, and an unencoded "<" in HTML is essentially wrong: it should be "&lt;".

Using strip_tags to remove tags in a scope where HTML is not enabled is probably not the right approach, because "<" is a valid character in our context.

What about this:

```php
function strip_valid_tags( $str ) {
    // Encode a "<" that is followed by another "<" or by the end of the
    // string before any ">", so strip_tags() leaves it alone.
    $str = preg_replace( '/<([^>]*(<|$))/', '&lt;$1', $str );

    return html_entity_decode(
        strip_tags( $str ),
        ENT_NOQUOTES,
        'UTF-8'
    );
}

$str = "<p>I am currently <30 years old.</p>";
echo strip_valid_tags( $str );
// => I am currently <30 years old.
```

Credits: https://stackoverflow.com/a/38022499/1160173

It doesn't solve the issue completely, though: a case like "asd<1.2.3.>dfg" will still become "asddfg".

JoryHogeveen commented 2 years ago

Related: #6107

Same issue reported here: https://wordpress.org/support/topic/caracter-cut-metadata/

sc0ttkclark commented 2 years ago

html_entity_decode() in this use case will expose other HTML that was submitted as encoded already. This one is going to take more thought and review on how best to address these cases.
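
To illustrate the concern, here is a small sketch that runs the helper proposed earlier in this thread against input that is already entity-encoded. The `$submitted` string is a hypothetical example, not something taken from a real report.

```php
<?php
// Copy of the strip_valid_tags() helper proposed earlier in this thread.
function strip_valid_tags( $str ) {
    $str = preg_replace( '/<([^>]*(<|$))/', '&lt;$1', $str );

    return html_entity_decode( strip_tags( $str ), ENT_NOQUOTES, 'UTF-8' );
}

// Hypothetical submission in which the markup arrives already entity-encoded.
// It contains no literal "<", so neither the regex nor strip_tags() touches it.
$submitted = '&lt;script&gt;alert(1)&lt;/script&gt;';

// html_entity_decode() then turns the entities back into real markup.
$result = strip_valid_tags( $submitted );

var_dump( $result ); // string(25) "<script>alert(1)</script>"
```

In other words, a field with HTML disabled could end up storing a literal tag, which is the opposite of what the strip step is meant to guarantee.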

The suggested code also only covers cases where the lone < is followed by another < (such as a later HTML tag) or by the end of the string before any >, so it wouldn't be a complete solution.