mapbox / secret-shield

Easily find secrets in files, directories, and repositories. Stop leaking secrets using git hooks.
MIT License
35 stars 13 forks source link

Fix code scanning alert - Useless regular-expression character escape #20

Closed IamGreut closed 2 years ago

IamGreut commented 2 years ago

Tracking issue for: utils/readfiles.js:26:

IamGreut commented 2 years ago

In a very long winded effort to understand the code, I've extracted the relevant code with my own comments to help follow the logic Disclaimer: it's not pretty, but explains each step out loud. The code is difficult to read this way so reading this side by side with the non-annotated code could be useful

// Build a filter that is an array 
function buildFilter(filters) {
  var filters = (filters instanceof Array) ? filters.slice() : [filters];
  var filterArray = [];

// Check filter is not an empty array
  if (filters.length === 0) return null;

// When filter has one or more indices follow the logic
  while(filters.length > 0) {

// Removes the first element from the array and returns that removed element
// This method changes the length of the array
    var filter = filters.shift();

/* 
filterArray.push adds one or more elements to the end of an array and returns the new length of the array
'\\/?' cast from string becomes regex / \/? /
which matches the / character while the ? character causes a match of the previous token (in this case /) between zero and one times, as many times as possible
filter.replace first regex argument matches the . character globally aka all instances
'\\.' cast from string becomes regex / \. / which matches any single character 

CONFUSION POINT - ?????? So that means the replace function is replacing all . characters with a . character ????? 

*/
    filterArray.push('\\/?' + filter.replace(/\./g, '\\.')
/*
1st Capturing Group (\*?) 
\* matches the character *
? matches the previous token between zero and one times, as many times as possible
2nd Capturing Group (\*)
\* matches the character * 
3rd Group - Negative Lookahead (?!\*)
Assert that the Regex does NOT match
the Regex being \* matching the character *

function arguments are match and prefix
CONFUSION POINT - ????? how are these arguments being passed in ????
*/
      .replace(/(\*?)(\*)(?!\*)/g, function(match, prefix) {

        if(prefix == '*') {
          return match;
        }
/*
Match a single character not present in the list below [^\\/]*
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
\\ matches the character \ 
/ matches the character / 
*/
        return '[^\\/]*';
      })
/*
\? matches the character ? globally aka all instances

Match a single character not present in the list below [^\\/]?
? matches the previous token between zero and one times, as many times as possible, giving back as needed (greedy)
\\ matches the character \ 
/ matches the character / 
*/
      .replace(/\?/g, '[^\\/]?')
/*
\* matches the character *
\* matches the character * globally aka all instances

\. matches the character . 
* matches the previous token between zero and unlimited times, as many times as possible, giving back as needed (greedy)
/*
      .replace(/\*\*/g, '\.*')
/*
1st Capturing Group ([\-\+\|])
Match a single character present in the list below [\-\+\|]
\- matches the character - 
\+ matches the character + 
\| matches the character | 

\\ matches the character \ 
$ asserts position at the end of the string
1 matches the character 1
*/
      .replace(/([\-\+\|])/g, '\\$1')
    );
  }
  return new RegExp('^' + filterArray.join('|') + '$', 'i');
}
IamGreut commented 2 years ago

Closed alert as False Positive This code is a fork of https://github.com/guatedude2/node-readfiles/ Submitted PR to request review of what we believe is just an extra backslash character that is unnecessary, but does not constitute a security issue.

Update: The author has accepted the changes and merged!

ThibaudLopez commented 2 years ago

We don't know the intent of the author, whether they meant zero or two backslashes; one backslash is certainly useless.

Their documentation:

filter: a string, or an array of strings of path expression that match the files being read (defaults to '**')

  • ? matches one character
  • * matches zero or more characters
  • ** matches zero or more 'directories' in a path

Based on that documentation, and from reading their source code, we think it should be zero backslashes, not two:

// likely intent
new RegExp('.*')
/.*/

// unlikely intent
new RegExp('\\.*')
/\.*/

So I agree with closing this as false positive.

ThibaudLopez commented 2 years ago

Fixed by author https://github.com/guatedude2/node-readfiles/pull/7