Open Ducasse opened 4 years ago
running this test on mac, linux on travis is passing
testCharLf
self assert: Character lf asString equals: (String streamContents: [:s | s lf ])
running this test on mac, linux on travis is failing
testCharLf2 "this test is special because it embedds a string that contains a lf"
self assert: Character lf asString equals: "(String streamContents: [:s | s lf ])" '
'
S
No, new lines in source code are CRs, not LFs, this has always been the case.
' ' first = Character cr
=> true
On 8 Dec 2019, at 10:30, StéphaneDucasse notifications@github.com wrote:
running this test on mac, linux on travis is passing
testCharLf
self assert: Character lf asString equals: (String streamContents: [:s | s lf ])
running this test on mac, linux on travis is failing
testCharLf2 "this test is special because it embedds a string that contains a lf"
self assert: Character lf asString equals: "(String streamContents: [:s | s lf ])" '
'
S
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub, or unsubscribe.
Please read what I wrote. Of course the editor/compiler encodes Enter as Character cr but this is clearly not what I meant.
as the second test is showing it if we programmatically have a lf in a string because for example you want to test that you generated correctly lf in the resulting string then only on travis (and not locally on my machine) this test is failing.
The second test is not produced with an enter key in the text editor but with (String streamContents: [:s | s lf ]) printString.
So do you see now the difference between the two tests? a Character lf produced but not stored in a string is handled well - first test. and if we convert by printing the string then it is not working. When you inspect that string it contains lf!
testCharLf2 "this test is special because it embedds a string that contains a lf"
| buggyButcontainingLF |
"(String streamContents: [:s | s lf ]) to produce the string buggyButcontainingLF below"
buggyButcontainingLF := '
'. self assert: (buggyButcontainingLF at: 1) equals: Character lf. self assert: Character lf asString equals: ' '
https://travis-ci.com/Ducasse/Descent/jobs/264819527
Travis CI enables your team to test and ship your apps with confidence. Easily sync your projects with Travis CI and you'll be testing your code in minutes.
On my machine this test passes perfectly but it fails on travis. May be this is an unzip problem but this is scary and strange.
it is still not 100% clear to me how you produce the problematic string.
you say 'do x to produce the string', but where ? in a browser on your machine ? how does is end up in travis CI ? via committed code ? Tonel code ?
If you enter an LF in source code, you get a 'method source contains linefeed' code critic, because this is a bad idea.
I assume that Tonel turns all line endings into CR when exporting.
See TonelWriter>>#splitMethodSource:into:
at the end it says
source upToEnd withLineEndings: self newLine
which basically does a conversion (but to a platform dependent one, CR on mac).
I can't immediately find what Tonel does when reading, but by then you already lost your LF
Now, the critic rule is there for a reason, it is a bad idea to depend on specific line endings in literal strings.
When I edited the code while writing my test I just printed the expression and use the result. Explain me then how I can test a pretty printer that produces lf and not cr. Yes I can use Character lf asString and concatenate little strings. I can also recode the pretty printer manually. But to me this is bug! the end of line inside strings should not changed magically. I should be able to code towards a specific platform. Then there is still a problem because my "bad" code is working on my machine when I load my code on Mac OS X but it is not working on Mac OS X on travis.
And here I'm not talking about code of the method but encoding inside strings inside a method. So to me encoding inside strings should not be modified.
Guille pablo I would like to discuss with you next time we see each others.
Normally I loaded these tests (because I'm working on another project) so I do not see the difference but I'm trying again with a fresjh 8
But I was wrong. Apparently testLf2 is wrong on my machine and I prefer this. So indeed Tonel is probably changing the code and to me this is not good.
On 8 Dec 2019, at 12:41, StéphaneDucasse notifications@github.com wrote:
Explain me then how I can test a pretty printer that produces lf and not cr.
You failed to mention this.
Have a look at STONWriter>>#newLine: and its senders in STON tests.
It is perfectly possible to generate textual output that uses different line end conventions and still write tests for them. I do this for NeoCSV, NeoJSON and others.
I don't see the issue: you should not depend on newlines being preserved inside strings in source code - formatters can/will change them, Tonel does, maybe it happens in other places. And one day, we will probably have to replace CR in the whole image anyway (because it is no longer the default, even on macOS where it came from).
So basically, from everything you wrote, I see no issue. Though I admit is might be surprising, but there is the code critic rule to help you.
BTW, here is another point:
You can use #lines to parse any string into lines, regardless of the line end convention used.
So basically you could do something like
String lf join: (ZnClient>>#execute) sourceCode lines.
this gives you LF source code instead of CR, regardless of what was used originally.
HTH
I understand your point. In general this is better to rely on explicit behavior versus implicit encoding inside a string. I changed my tests already to handle this. What was super strange is that I loaded my code and my tests where green, then I modified them to get explicit and to build hypotheses on why it was breaking on travis. Now I do not get why my tests did not break the first time. I will reload an old version to check. Because loading the specific tests I wrote to identify the problem break (hopefully) on my machine.
I checked and loading an old version is breaking too so this is good. Now while I understand that explicit is better. I would prefer that the code encodings and in particular the string encodings is not changed during fileout. I prefer garbage in, garbage out. But may be I wrong. For pillar we were discussing with damien that you may want to validate the output of windows encodings but running on a unix system.
I read
testConvertingNewLines | input result output | input := '''line ending with CR', String cr, 'line ending with LF', String lf, 'line ending with CRLF', String crlf, ''''. output := 'line ending with CR', String crlf, 'line ending with LF', String crlf, 'line ending with CRLF', String crlf. result := (STON reader on: input readStream) newLine: String crlf; convertNewLines: true; next. self assert: result equals: output. output := 'line ending with CR', String cr, 'line ending with LF', String cr, 'line ending with CRLF', String cr. result := (STON reader on: input readStream) newLine: String cr; convertNewLines: true; next. self assert: result equals: output
and yes I was doing the same once I understood the problem.
String lf join: (ZnClient>>#execute) sourceCode lines. is nice now since I'm explicitly using lf in my pretty printer, I'm just emitting lf
splitMethodSource: aMethodDefinition into: aBlock | keywords source declaration |
keywords := aMethodDefinition selector keywords.
source := aMethodDefinition source readStream.
"Skip spaces"
(source peek isSeparator) ifTrue: [ self skipSeparators: source ].
"Skip comments"
(source peek = $") ifTrue: [ self skipComment: source ].
"Parse declaration"
declaration := String new writeStream.
[ (self selectorIsComplete: keywords in: declaration originalContents) not
or: [ ':+-/\*~<>=@,%|&?!' includes: declaration contents trimRight last ] ]
whileTrue: [
"stop infinite loop if no match was found"
source atEnd ifTrue: [ TonelWriteError signal: 'Cannot find selector in source for ', aMethodDefinition asString ].
"get separators"
[ source atEnd not and: [ source peek isSeparator ] ]
whileTrue: [ declaration nextPut: source next ].
"take next word until we find a separator or a dot"
[ source atEnd not and: [ source peek ~= $. and: [ source peek isSeparator not ] ] ]
whileTrue: [ declaration nextPut: source next ] ].
aBlock
value: (declaration contents trimLeft withLineEndings: self newLine)
value: (source upToEnd withLineEndings: self newLine)
I do not see where strings are converted too. But now I understand that this is an corner case and I prefer that. I should have reloaded my code first.
Sometimes I would really like to have a mode where we can see invisible character in strings and in the code. It would make things simpler to understand.
Anyway thanks for your time.
There is even a rule in the code smell detector which will warn against line feed inside the code (RBMethodSourceContainsLinefeedsRule
). From the above discussion it would seem to be obsolete.
I do not think. This is not because there is a rule that we should not think that it is strange to change string (and not source code) encodings.
This issue has either a default title or empty body. We would appreciate it if you could provide more information. Note: I am not a very intelligent bot, I can only react to new comments. Please add a comment for me if you update the body or title.