Closed dram closed 2 years ago
The problem is caused by CompiledMethod>>#getPreambleFrom:at:
, which does not support non-ASCII characters, as already stated in the comment.
Submit a patch to The Inbox: https://lists.squeakfoundation.org/pipermail/squeak-dev/2022-May/220279.html
BTW, as in Squeak 6.0 sources and changes files are always UTF-8 encoded, it seems not necessary to append language tags to chunks anymore, e.g.:
!测试类 methodsFor: 'as yet unclassified' stamp: 'xw 5/4/2022 09:08'!]lang[(3 61)6,0!
Seems that there is some problem about subscribing to squeak-dev mailing list, I will reply to the message[1] here:
Merged. Please add commentary to magic numbers such as "2" the next time.
Thanks! Will do the next time.
I also noticed that @LinqLover reported an issue[2] which is related to CompiledMethod>>#timeStamp
. It may be caused by my modification to CompiledMethod>>#getPreambleFrom:at:
, but currently I have no clue about this problem, will do some more investigation.
Regarding to the magic number 2
, endPosition
argument passed into CompiledMethod>>#getPreambleFrom:at:
is pointed to position before the last character of the preamble, so in order to make use of PositionableStream>>#backChunk
, this position is advanced in two characters, right after the !
character.
[1] https://lists.squeakfoundation.org/pipermail/squeak-dev/2022-May/220281.html [2] https://lists.squeakfoundation.org/pipermail/squeak-dev/2022-May/220311.html
The root cause of CompressedSources
's incomplete preamble issue is that implementation of CompressedSourceStream
is not compatible with PositionableStream>>#skip:
.
Recently commited new implementation of CompiledMethod>>#getPreambleFrom:at:
relies on PositionableStream>>#backChunk
, which indirectly relies on PositionableStream>>#skip:
.
There are two possible solutions, 1. update PositionableStream>>#skip:
, 2. add override method CompressedSourceStream>>#skip:
. The code is same, instead of direct access of instance variable position
, accessor method is used instead, i.e.:
skip: anInteger
self position: self position + anInteger
Submit a patch to The Inbox https://lists.squeakfoundation.org/pipermail/squeak-dev/2022-May/220335.html
Genious! Thank you, Xin, I confirm that with Compression-xw.63, my image can be updated again with cached source files. :-)
Submit another patch related to timestamp: https://lists.squeakfoundation.org/pipermail/squeak-dev/2022-May/220396.html
Merged. Thank you!
BTW, as in Squeak 6.0 sources and changes files are always UTF-8 encoded, it seems not necessary to append language tags to chunks anymore, e.g.:
Language tags allow for language-specific text composition where Unicode's Han Unification is too generic. Yes, I think we might drop the ]lang[
tag and the concept of having a leadingChar
in the future. As long as we have leadingChar
in the image, it makes no sense to drop the support of ]lang[
tags.
See the class comment of Character
.
Thanks for pointing to comment of Character
, very detailed!
I find leadingChar
quite confusing in some scenarios, e.g.:
For the first example, one of those two 测试
strings is input from keyboard, and the other pasted from clipboard, they seems to be equal, but result is false
, because of leading char.
The second example, I try to fetch some text from Web, and search for wide string patterns, also one from keyboard and the other from clipboard, which will lead to different results.
It is quite frustrated if strings in code contains leading chars, and they are used to comparing outside texts, even find ...
in text editor will not work, not to mention that it's even worse if wide characters are contained in class and method names.
It seems that there are not enough time for considering remove of leadingChar
in Squeak 6.0, but I still hope that at least we can disable code related to leading char in InputInterpreter
.
I write a proposal regarding to InputInterpreter: https://github.com/squeak-smalltalk/squeak-object-memory/issues/18
In Squeak 6.0 alpha, For classes which have unicode names, browser will be failed to get timestamp for their methods, only
no timeStamp
is displayed, e.g.: