Closed ZornsLemma closed 6 years ago
I'm actually torn about this. Personally, I'm Ok with it as an optimization artifact. In fact, there is another subtle issue when using byte values as FOR/NEXT iterators:
byte i
for i = 0 to 1000
puti(i)
next
actually iterates the proper number of times. The reason is that the iteration count is left on the stack during the FOR/NEXT and is tested against the termination value, rather than the value of i in memory.
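To make that concrete, here is a rough Python model of the behaviour described above. It is only a sketch under assumptions, not the VM's actual code: the loop counter is assumed to live in a 16-bit evaluation stack slot, the byte variable is assumed to receive just the low 8 bits on each store, and `run_for_loop` is a made-up name.

```python
def run_for_loop(first, last):
    """Model a FOR/NEXT loop whose counter lives in a 16-bit stack slot.

    Assumption: the termination test uses the full stack value, while the
    byte variable in memory only ever receives the low 8 bits.
    """
    seen = []                         # values a body reading 'i' from memory would see
    iterations = 0
    counter = first                   # 16-bit value kept on the evaluation stack
    while counter <= last:            # test uses the stack value, not memory
        memory_i = counter & 0xFF     # byte store keeps only the low byte
        seen.append(memory_i)
        iterations += 1
        counter = (counter + 1) & 0xFFFF
    return iterations, seen


iterations, seen = run_for_loop(0, 1000)
print(iterations)       # 1001
print(seen[255:258])    # [255, 0, 1]
```

In this model the loop runs 1001 times even though the byte in memory wraps back to 0 after 255.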
Instead of disabling it altogether, it could be one of the optimizations that gets a documented caveat. Perhaps making it part of the try_dupify() optimizations when -O2 is passed in.
I'm pretty torn myself. It seems a shame to leave this trap lying around for future developers, but if it's documented it's probably not such a huge problem - I was pretty surprised by this and was wondering why my code wasn't working, but I hadn't had any warning from the documentation. It also seems a shame to lose the optimisation or to force DLB to clear the high byte - the 'for' loop example is both interesting and useful, if it's documented as behaviour which can be relied on.
I've been hoping I'd come up with a fourth option but so far it has eluded me...
(I think I'd missed the fact that the compiler now has multiple optimisation levels - is this just the self-hosted compiler? I guess the cross-compiler could be modified to have this too. That does make it feel a bit better to include a 'risky' optimisation.)
We can try clearing the MSB of the eval stack for DAB/DLB. I never documented the FOR/NEXT behavior of byte values. The more I think about it, the more Ok I am with clearing the MSB.
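As a rough illustration of what clearing the MSB would change, here is a small Python model. It assumes DLB behaves as "keep the value on the eval stack and store its low byte to a local"; the function `dlb`, its parameters and the `clear_msb` switch are hypothetical names for this sketch, not the VM's real interface.

```python
def dlb(stack, frame, offset, clear_msb):
    """Hypothetical model of a 'duplicate and store local byte' operation."""
    value = stack[-1] & 0xFFFF        # eval stack entries are 16 bits wide
    frame[offset] = value & 0xFF      # the byte variable gets the low byte
    if clear_msb:
        stack[-1] = value & 0x00FF    # proposed fix: stack copy matches memory
    return stack[-1]


frame = bytearray(1)

stack = [255 + 1]
left_on_stack_old = dlb(stack, frame, 0, clear_msb=False)

stack = [255 + 1]
left_on_stack_new = dlb(stack, frame, 0, clear_msb=True)

print(left_on_stack_old)   # 256
print(left_on_stack_new)   # 0
print(frame[0])            # 0
```

Without the clear, later code that reuses the stack copy sees 256 while the byte in memory holds 0; with the clear, both agree.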
Oh yeah, I forgot I only added the -O2 to the self hosted compiler. It includes the try_dupify() in the optimization passes. I can add it to the cross-compiler too.
If you're not worried about losing the FOR/NEXT behaviour then I think clearing the MSB in DAB/DLB is a good solution. The only thing that slightly concerns me - and I haven't tried to test this - is what would happen with a loop like:
byte i
for i = 0 to 255
// do stuff
next
Would that fail due to wrapping with the proposed truncation, maybe turning into an infinite loop? I think a case could be made that this isn't unreasonable behaviour anyway; it's at least a bit more obvious than the case where the optimiser happens to preserve the MSB for you unexpectedly.
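The wrapping concern can be sketched in Python. This does not claim to show how the VM actually tests for termination; it just compares two hypothetical strategies once the counter is truncated to 8 bits, with an iteration cap standing in for "never terminates". All names here are invented for the sketch.

```python
def iterations_with_truncation(last, test, cap=1000):
    """Count loop iterations for 'for i = 0 to last' with an 8-bit counter.

    'test' selects a hypothetical termination strategy. Returns None if
    the cap is reached, which this sketch treats as an infinite loop.
    """
    counter = 0
    count = 0
    while count < cap:
        count += 1                            # loop body runs once
        if test == "equal_before_increment":
            if counter == last:               # stop once the limit has been used
                return count
            counter = (counter + 1) & 0xFF
        else:                                 # "greater_after_increment"
            counter = (counter + 1) & 0xFF    # 255 + 1 wraps to 0
            if counter > last:
                return count
    return None


print(iterations_with_truncation(255, "equal_before_increment"))    # 256
print(iterations_with_truncation(255, "greater_after_increment"))   # None
print(iterations_with_truncation(100, "greater_after_increment"))   # 101
```

So whether the 0 to 255 loop survives truncation depends on where the comparison happens relative to the increment.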
I'm fairly sure that will work. I'll make the change to the devel branch (I'm quite sure it will work there) and look for any anomalies.
I added the clear MSB for DLB/DAB in the devel branch and only found one instance of bad code - in ROGUE of all places. A FOR/NEXT that used a byte variable but iterated through negative values.
In master, I found some problems with FOR x = 100 DOWNTO 0 cases. That seems like a useful instance. Should I check in the fix to master? I'm thinking we just fix this in the devel branch.
Thanks Dave, I'm quite happy for this to just go to devel. I've moved my own development work over to a branch off your devel branch, and I don't think the Acorn port has (m)any actual users so I'm not planning to push out an update of the previous release with this change in. It also seems safer to make a change to the VM behaviour only in the new 2.0 version.
Ok, I'm closing this out as fixed in the devel branch.
If I tweak hello.pla to look like this:
and then run it using the portable VM:
note that it outputs 256, not 0, even though foo is a byte variable. The generated code looks like this:
I am not sure what's the right fix:
Let me know what you think; if you'd rather change the optimiser I'm happy to try preparing a pull request for that.