Fix parsing of incomplete/sparse RFC 2231 attributes

mplscorwin commented 4 years ago

Hi!

Would it make sense to open an issue explaining the problem that this is solving?

I wasn't able to understand the addition of sort for example.

Thanks much!

pali commented 4 years ago

I wasn't able to understand the addition of sort for example.

Hash is an unordered set. To reconstruct original value, it is needed to join hash keys in correct order. Therefore sorting keys of hash is required. Prior this patch, plain array was used (where are values always in order).

mplscorwin commented 4 years ago

I apologize it has taken me a few days to get back to you on this. I think you have responded to my evidence (of not understanding your motivation) without quite dignifying my query:

Can you please explain the purpose of this change, such as to justify moving away from the implicitly sorted structure?

mplscorwin commented 4 years ago

Chatted about this with simcop on IRC who first pointed to (2) of https://tools.ietf.org/html/rfc2231#section-3 and then suggested:

  the ordering is specified as part of the header attributes 
  [...] without the sort then an attribute could be reconstructed
  incorrectly [...] the test there [may not] correctly check for that

Adding,

  i'd update the test to check the one from the rfc there, and
  maybe also misorder it since it's technically allowed for that to happen
  according to an earlier part of the standard.  which is why you have to
  add numbers to the partial attributes like that

That makes sense to me, but it would be a few weeks before I could look at doing that. Dropping this back to (a) confirm we are on the same page and (b) see if you'd agree those tests could smooth the way for this PR and improve things more generally.

pali commented 3 years ago

Ok, this pull request is open for more than month without response, so here are all details:

This patch is fixing Denial of Service security issue in Email-MIME-ContentType library. Without this patch, it is possible to create input for parser which cause allocation of too much memory and crash of whole perl process as it would be killed by OOM killer. Example input which cause this issue is in test. You can that test patching library and you would see what happen, test would try to allocate too much memory and later would be either stopped or killed by kernel OOM killer.

I hope that distributions would include this patch to fix above security issue as this library may be already used for parsing emails and emails can contain any data...

pali commented 3 years ago

@mplscorwin: You are right that parts needs to be reassembled into correct order. Previously they were put into array, part X was put into index X, so independently of how they were ordered in email, they were correctly reassembled. Now this patch is changing usage of array to hash (due to above security issue) and has is unordered. If you are using array, you can read members in order defined by index. But if you are using hash you would get members in "random" order (not exaclty random, but in order defined by hash function). So to reconstruct correct order defined by hash keys, it is required to sort values by keys manually. That is reason for manual call of sort. @{$hashref}{(list of keys)} is special perl syntax how to return hash values in order defined by specified keys. So to have correct order of values, it is required to provide into this syntax sorted names of keys. And that perl code sort { $a <=> $b } keys %{$hashref} is doing it. Just to compare, perl syntax @{$arrayref} returns values of array in order 0, 1, 2, ..., N. So basically both constuctions are doing same thing, just one is working with array data structure and another with hash data structure.

pali commented 3 years ago

And about proposed test case where attributes are stored in email not in ascending order... that test make sense, I agree it can be usefull, but is not fully related to this issue. I can create a separate pull request for that test case.

rjbs commented 3 years ago

I have made a trial release and will make a stable release in two weeks.

rjbs commented 3 years ago

…and then I mis-operated git, had to do a second release, and made a stable release. I'm sure it'll be fine.

rjbs / Email-MIME-ContentType

Fix parsing of incomplete/sparse RFC 2231 attributes #13