SeasideSt / Seaside

The framework for developing sophisticated web applications in Smalltalk.
MIT License
519 stars 71 forks source link

WAFileMetadataLibrary subclasses assumes ASCII files when importing CSS/JS #1273

Closed astares closed 3 years ago

astares commented 3 years ago

Sanjay Minni reports on Pharo Discord #seaside channel that he downloaded the 5.1 bundle of Bootstrap (including CSS and JS files) and while importing on a subclass of WAFileMetadataLibrary he received an encoding error.

image

To reproduce extract https://github.com/twbs/bootstrap/releases/download/v5.1.0/bootstrap-5.1.0-dist.zip into a folder. Create a new subclass of WAFileMetadataLibrary :

WAFileMetadataLibrary subclass: #MyFileLibrary
    instanceVariableNames: ''
    classVariableNames: ''
    package: 'Sample-Package'

and import using

MyFileLibrary recursivelyAddAllFilesIn: 'C:\temp\bootstrap-5.1.0-dist'

This will crash at position 9035 of the file stream with a character value 226

image

astares commented 3 years ago

I guess the problem is that the platform files assume that the encoding of the files is ASCII while todays CSS and JS files are typically written as UTF-8 encoded files.

image

but Seaside assumes ASCII encoding

image

If the method GRPharoPlatform>>#readFileStreamOn:do:binary: is temporarily patched as

readFileStreamOn: aString do: aBlock binary: aBoolean

    ^ aBoolean
        ifTrue: [ aString asFileReference binaryReadStreamDo: aBlock ]
        ifFalse: [ aString asFileReference readStreamEncoded: 'utf-8' do: aBlock ]

it works without a problem.

jbrichau commented 3 years ago

Thanks for reporting. I guess it's safe to classify ccs files as utf8. This is not by patching that method but by asserting that utf8 is the encoding for the css mimetype. I'm currently starting a holiday so I don't know when I will be able to get to this, but that's the way to go.

jbrichau commented 3 years ago

On second thought, because utf-8 is actually backwards compatible with ascii, it does make sense to just replace the 'ascii' encoding with 'utf-8' encoding in that method.

Transferred the issue to Grease: https://github.com/SeasideSt/Grease/issues/122

jbrichau commented 3 years ago

So, I discovered that although the code in Grease has always stipulated 'ascii' encoding, the underlying implementation in Pharo prior to Grease 1.6.0 did use utf-8 encoding for text files. So, this is a regression since Grease 1.6.0 where we moved to the new filestream api in Pharo.

See https://github.com/SeasideSt/Grease/issues/122

A new Grease bugfix release is almost ready to be published.

jbrichau commented 3 years ago

Fixed in Grease 1.7.3 See https://github.com/SeasideSt/Grease/releases/tag/v1.7.3