openlab-at-city-tech / webworkqa

WeBWorK integration for WordPress and BuddyPress
GNU General Public License v2.0
4 stars 2 forks source link

Verify that notifyAddresses is not over-sanitized #196

Closed boonebgorges closed 3 years ago

boonebgorges commented 3 years ago

During the round of security fixes requested by wordpress.org, I added sanitize_text_field() processing when getting the notifyAddresses from WW. Because the address could be formatted Boone Gorges <boone@example.com>, the part in brackets may be getting stripped because it looks like an HTML tag to sanitize_text_field(). This needs confirmation and fixing.

boonebgorges commented 3 years ago

@drdrew42 Has there been a change in the way that the notifyAddresses key is built? When I submit from http://146.111.135.122/webwork2/WW-Dev-Fresh/Functions_-_Difference_Quotient/2/?effectiveUser=studentB, I see the following POST payload:

Array
(
    [user] => studentB
    [effectiveUser] => studentB
    [key] => sLa8XspgFThSps5rW84nD6qdaiLTxxlc
    [courseId] => WW-Dev-Fresh
    [set] => Functions_-_Difference_Quotient
    [studentName] => Student B
    [notifyAddresses] => =?UTF-8?B?QWRtaW5pc3RyYXRvcg==?= ;=?UTF-8?B?UHJvZi4gSW5zdHJ1Y3Rvcg==?= ;=?UTF-8?B?VG9kZCBTdGFtYmF1Z2g=?= ;=?UTF-8?B?Qm9vbmUgR29yZ2Vz?= 
    [pg_object] => [...]
    [module] => WeBWorK::ContentGenerator::Problem
    [showHints] => 1
    [displayMode] => MathJax
    [showSolutions] => 1
    [randomSeed] => 893
    [showOldAnswers] => 1
    [emailableURL] => http://146.111.135.122/webwork2/WW-Dev-Fresh/Functions_-_Difference_Quotient/2/?effectiveUser=studentB
    [problem] => 2
    [showCorrectAnswers] => 0
    [problemPath] => CUNY/CityTech/Precalculus/setDifference_Quotient/diff-quot-quad.pg
    [feedbackForm] => Ask For Help
)
drdrew42 commented 3 years ago

I see the following comment -- though git blame says this change was made 2 years ago...

# 2019 rfc822_mailbox was modified for UTF-8 support:
#   If the full_name is set it will use the RFC 2047 "MIME-Header" encoding
#   for the full_name, so that UTF-8 characters can be "sent" via the
#   permitted ASCII encoding.
boonebgorges commented 3 years ago

Does this means that the address is encoded via some reversible algorithm? Can you link me to the changeset or to the relevant part of the WW codebase? I'll need to detect decoded addresses and then decode them, and it will help me if I can see exactly how the encoding is taking place.

drdrew42 commented 3 years ago

Yes, the comments indicate that emails are now encoded using the RFC2047 standard.

e.g. I found https://mail-mime-parser.org/ which will recognize the encoding and parse it for you.

drdrew42 commented 3 years ago

Maybe this is more appropriate? https://www.php.net/manual/de/function.iconv-mime-decode.php

boonebgorges commented 3 years ago

Thanks @drdrew42. I've put some changes in place that decodes the notifyAddresses value.

The value that I get from http://146.111.135.122/webwork2/WW-Dev-Fresh/Functions_-_Difference_Quotient/2/?effectiveUser=studentB doesn't actually have any email addresses - the decoded value is:

Administrator ;Prof. Instructor ;Todd Stambaugh ;Boone Gorges 

Is that particular instance of WW set up in such a way that the users don't have associated email addresses? Or is something happening elsewhere in the system that I should be debugging?

drdrew42 commented 3 years ago

Let me dig in, I think the issue must be on the ww2 end -- it looks like the email addresses themselves are not being encoded. They are definitely present in the database for each of these users...

drdrew42 commented 3 years ago

@boonebgorges are you sure that perhaps the <address@host.com> isn't being stripped during parsing? It does look like an HTML tag... I'm seeing that the address is included in the raw POST data:

array(19) {
...
  | ["notifyAddresses"]=>
  | string(224) "=?UTF-8?B?QWRtaW5pc3RyYXRvcg==?= <xxx@xxx.cuny.edu>;=?UTF-8?B?UHJvZi4gSW5zdHJ1Y3Rvcg==?= <prof@instructor.edu>;=?UTF-8?B?VG9kZCBTdGFtYmF1Z2g=?= <xxx@xxx.cuny.edu>;=?UTF-8?B?Qm9vbmUgR29yZ2Vz?= <boone@xxx>"
  | ["set"]=>
  | string(22) "Polynomials_-_Division"
  | ["courseId"]=>
  | string(6) "WW-Dev"
  | ["problem"]=>
  | string(1) "1"
  | ["feedbackForm"]=>
  | string(12) "Ask For Help"
  | }
boonebgorges commented 3 years ago

It's not being stripped at the WP application level - I've tried dumping the POST data immediately when the URL is hit, bypassing all of WordPress, and I'm seeing the same thing:

...
   [set] => Functions_-_Difference_Quotient
    [showSolutions] => 1
    [notifyAddresses] => =?UTF-8?B?QWRtaW5pc3RyYXRvcg==?= ;=?UTF-8?B?UHJvZi4gSW5zdHJ1Y3Rvcg==?= ;=?UTF-8?B?VG9kZCBTdGFtYmF1Z2g=?= ;=?UTF-8?B?Qm9vbmUgR29yZ2Vz?= 
    [feedbackForm] => Ask For Help

Are we certain that we're testing with the very same WeBWorK version? I'm coming from this URL, whose 'set' doesn't appear to match yours: http://146.111.135.122/webwork2/WW-Dev-Fresh/Functions_-_Difference_Quotient/2/

drdrew42 commented 3 years ago

Switch over to http://146.111.135.122/webwork2/WW-Dev

I have redirected the 'ask for help' (for WW-Dev course only, not -Fresh or -Deux) to a php script that dumps the request data. Click 'ask for help' and inspect the page source to see that the addresses are present.

boonebgorges commented 3 years ago

From http://146.111.135.122/webwork2/WW-Dev/Functions_-_Difference_Quotient/4/?effectiveUser=studentB, I see:

 ["showSolutions"]=>
  string(1) "1"
  ["notifyAddresses"]=>
  string(224) "=?UTF-8?B?QWRtaW5pc3RyYXRvcg==?= ;=?UTF-8?B?UHJvZi4gSW5zdHJ1Y3Rvcg==?= ;=?UTF-8?B?VG9kZCBTdGFtYmF1Z2g=?= ;=?UTF-8?B?Qm9vbmUgR29yZ2Vz?= "
  ["showOldAnswers"]=>
  string(1) "1"

Looks like the addresses aren't here either.

drdrew42 commented 3 years ago

Did you inspect the source? When rendered in the browser, the <...> tags aren't visible.

boonebgorges commented 3 years ago

Ah, you are right. Sorry about that. I'll continue to investigate on my end.

boonebgorges commented 3 years ago

I've run another set of tests and can confirm that email addresses are no longer being stripped. See bb504f8.