pnp / pnpframework

PnP Framework is a .NET library targeting Microsoft 365 containing the PnP Provisioning engine and a ton of other useful extensions
https://pnp.github.io/pnpframework/
MIT License
211 stars 145 forks source link

Bug: Token Regex for fileuniqueid* matches too broadly #751

Open heinrich-ulbricht opened 2 years ago

heinrich-ulbricht commented 2 years ago

This Regex matches more than just fileuniqueid* tokens:

https://github.com/pnp/pnpframework/blob/83eaf1f88370b8fd35245a3ce2a01272178f4869/src/lib/PnP.Framework/Provisioning/ObjectHandlers/Utilities/ListItemUtilities.cs#L283

image

This will be handled OK since exceptions are swallowed, but calls are made that are not necessary (for every field content containing curly braces). So this Regex could be improved to not trigger those redundant calls.

Alternative easy improvement: after this line: https://github.com/pnp/pnpframework/blob/83eaf1f88370b8fd35245a3ce2a01272178f4869/src/lib/PnP.Framework/Provisioning/ObjectHandlers/Utilities/ListItemUtilities.cs#L308

Add:

if (tokenParts.Length < 2)
{
    continue;
}

This catches most mismatches and prevents further redundant calls.

heinrich-ulbricht commented 2 years ago

Ramping this up to bug as I got a case where some content was matched by the RegEx and tokenParts[1] was an empty string. This causes the ID of a folder to be loaded here:

https://github.com/pnp/pnpframework/blob/ab5fa531b0afc30e280f3d653e12f44c29b4073d/src/lib/PnP.Framework/Provisioning/ObjectHandlers/Utilities/ListItemUtilities.cs#L323-L325

But the invalid tokenParts then throw here with invalid pattern exception:

https://github.com/pnp/pnpframework/blob/ab5fa531b0afc30e280f3d653e12f44c29b4073d/src/lib/PnP.Framework/Provisioning/ObjectHandlers/Utilities/ListItemUtilities.cs#L340

Proposed fix:

https://github.com/WikiTransformationProject/pnpframework/commit/277dd273ccfe43f7632ef0764697337bc16f371d

carlosmiguelsns commented 3 weeks ago

I noticed that there is a RegEx responsible for capturing tokens in the UpdateListItem process. However, if the list item contains a JSON, the existing RegEx does not validate correctly.

There could probably be a rewrite of the regex here or in the next process of the condition to exclude tokens that start with a quotation mark or single quote, because it identifies a JSON property.

However, I've made an improvement to the RegEx:

Regex fileUniqueIdToken = new Regex("(?<token>\\{\\{(?!\\\")[^{}]+?:[^{}]+?\\}\\}|(?<!\\{)\\{(?!\\\")[^{}]+?:[^{}]+?\\}(?!\\}))", RegexOptions.IgnoreCase | RegexOptions.Multiline | RegexOptions.Compiled);

Matching - all tokens that containst at list :

I'm not specifying fileuniqueid or fileuniqueidencoded because it's implicit in the RegEx.

Can someone please validate if there is a match that is not being adressed?

If this is the right solution, I can do a PR of my code