SeasideSt / Seaside

The framework for developing sophisticated web applications in Smalltalk.
MIT License
519 stars 71 forks source link

#readHexFrom:errorDetail: should do an #asUppercase #1283

Closed marianopeck closed 3 years ago

marianopeck commented 3 years ago

If you look at the 3 senders of readHexFrom:errorDetail: you can notice that before Seaside 3.4.5 they were doing something like this:

    firstByte :=  input atEnd ifTrue: [ WAInvalidUrlSyntaxError signal: aString ] ifFalse: [ input next asUppercase digitValue ].
    secondByte :=  input atEnd ifTrue: [ WAInvalidUrlSyntaxError signal: aString ] ifFalse: [ input next asUppercase digitValue ].

(notice the asUppercase).

However, starting in Seaside 3.4.5 they now look like this:

    firstByte :=  self readHexFrom: input errorDetail: aString.
    secondByte :=  self readHexFrom: input errorDetail: aString.

However, if you look at the method:

readHexFrom: aStream errorDetail: aString
    | hexChar |
    aStream atEnd ifTrue: [
        WAInvalidUrlSyntaxError signal: aString ].
    hexChar := aStream next.
    ((hexChar between: $0 and: $9)
        or: [ (hexChar between: $a and: $f)
        or: [ hexChar between: $A and: $F ] ]) ifFalse: [ WAInvalidUrlSyntaxError signal: aString ].
    ^ hexChar digitValue

It's doing hexChar digitValue instead of hexChar asUppercase digitValue. This causes testDecodedWith to fail on VAST. While adding the asUppercase make it pass.

theseion commented 3 years ago

In Pharo #digitValue answers the same number, regardless of casing. This is based on a pre-constructed table:

initializeDigitValues
    "Initialize the well known digit value of ascii characters.
    Note that the DigitValues table is 1-based while ascii values are 0-based, thus the offset+1."
    DigitValues := Array new: 256 withAll: -1.
    "the digits"
    0 to: 9 do: [:i | DigitValues at: 48 + i + 1 put: i].
     "the uppercase letters"
    10 to: 35 do: [:i | DigitValues at: 55 + i + 1 put: i].
    "the lowercase letters"
    10 to: 35 do: [:i | DigitValues at: 87 + i + 1 put: i]

What does #digitValue in VAST look like?

marianopeck commented 3 years ago

Hi @theseion In VAST it's case specific:

digitValue

    "Answer an Integer corresponding to the numerical radix of
     the receiver. Return 0-9 if the receiver is $0-$9, and
     10-35 if it is $A-$Z; otherwise return -1.
     DigitValues is initialized to contain the character value + 1
     of the character if the character is a digit and 0 otherwise. "

    | value |
    ((value := self value) between: 1 and: 255) ifTrue: [
        ^(DigitValues at: value) - 1].
    ^-1
initialize

    "DigitValues is a byteArray at which the digitValue + 1
    for the character is stored at the same index as the character value.
    Non-digits have zero stored at the character value."

    DigitValues := ByteArray new: 256.
    '0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ' doWithIndex: [ :char :index |  "$NON-NLS$"
        DigitValues at: char value put: index]. 

So...for example, $a digitValue answers -1 while $A digitValue answers 10.

theseion commented 3 years ago

Looks to me like we should move the logic for this to Grease. What do you think @jbrichau?

jbrichau commented 3 years ago

I'm inclined to accept @marianopeck 's pull request since the implementation was doing asUppercase before the fix for https://github.com/SeasideSt/Seaside/issues/1157 and it's not breaking any tests.

Moving it to Grease would assume we make it uniform across all platforms, probably by doing an asUppercase. We also have a test in Grease that only tests for the uppercase characters: https://github.com/SeasideSt/Grease/blob/master/repository/Grease-Tests-Core.package/GRPlatformTest.class/instance/testDigitValue.st

Am I missing something?

theseion commented 3 years ago

You're right, it's probably a good idea to accept the PR. I don't like the use of asUpperCase because the casing should have nothing to do with the value. In addition, it would save a couple of instructions to perform a lookup instead of doing a conversion first, but the gains are probably negligible in most cases.

Since sending asUpperCase is at least encapsulated in a method I think we can accept the PR.

jbrichau commented 3 years ago

Again, thanks @marianopeck !

marianopeck commented 3 years ago

Thanks to you for taking a look a merging!