dansanderson / picotool

Tools and Python libraries for manipulating Pico-8 game files. http://www.lexaloffle.com/pico-8.php
MIT License
372 stars · 45 forks

Python34 #12

Closed · juanitogan closed this 7 years ago

juanitogan commented 7 years ago

First task was to get this running on Cygwin (and Babun), which is currently limited to Python 3.4. Thus, bytes().hex() had to be replaced with something equivalent. This is the source of issue https://github.com/dansanderson/picotool/issues/9 . That issue can be closed if you either state the requirement as Python 3.5+ or adopt my alternative (which might also allow for earlier than 3.4, but not Python 2.x -- you are way beyond Python 2).
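For reference, a 3.4-compatible replacement can be as small as this sketch, using the stdlib binascii module (the helper name is mine, not picotool's):

```python
import binascii

def bytes_to_hex(data):
    """Equivalent of bytes.hex() (added in Python 3.5) for Python 3.4."""
    return binascii.hexlify(data).decode('ascii')
```

binascii.hexlify has been available since well before 3.4, so this also covers older 3.x interpreters.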

The rest is pretty well explained by the code comments.

Just today I made some extra effort to get the glyph handling compatible with all scenarios -- which is more than I needed before, when it only had to fit my own files and some variants. Chasing out the right settings without switching to binary mode was pretty tough. Python needs a raw 8-bit ASCII+ text mode. raw_unicode_escape almost worked, but it bombed on a "...\U..." I happened to have in a comment (a good thing, too, because I almost went with it).
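For what it's worth, the difference can be demonstrated in a few lines (a sketch, with latin-1 standing in for the missing "raw 8-bit text mode"; the sample bytes are made up):

```python
# iso-8859-1 / latin-1 maps every byte 0x00-0xff to the same code point,
# so it round-trips any file and behaves like a raw 8-bit text mode:
data = bytes(range(256))
assert data.decode('latin-1').encode('latin-1') == data

# raw_unicode_escape, by contrast, interprets \uXXXX / \UXXXXXXXX escapes
# on decode, so a literal "\U" in a comment blows up:
try:
    b'-- see C:\\Users\\me'.decode('raw_unicode_escape')
except UnicodeDecodeError:
    pass  # \U was treated as a truncated \UXXXXXXXX escape
```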

Regarding the edge cases with minifying arithmetic assignments, this from my notes might help:

```lua
a={1}b=a[1]b=3           -- works
a={1}b=a[1]b+=3          -- boom!  unexpected symbol near '='
a={1}b=a[1] b+=3         -- works
a={1}b=(a[1])b+=3        -- works
a={1}b+=a[1] b+=3        -- boom!  unexpected symbol near '='
a={1}b+=(a[1])b+=3       -- boom!  unexpected symbol near '='
a={1} b+=(a[1])b+=3      -- boom!  unexpected symbol near '='
a={1} b+=(a[1]) b+=3     -- works
a={1} b+=a[1]b=3         -- boom!  unexpected symbol near '='
```
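Reading those rows, the failures cluster around statements that use `+=`: gluing a compound assignment to its neighbors can make PICO-8's shorthand expansion mis-scan the operands, while plain `=` statements abut freely. A conservative writer rule that is consistent with all nine rows -- always keep a space on both sides of a statement containing a compound assignment -- could be sketched like this (names are mine, not picotool's; it may emit a space where glue would also have worked, e.g. the `b=(a[1])b+=3` row, but that is the safe direction):

```python
COMPOUND_OPS = ('+=', '-=', '*=', '/=', '%=')

def uses_compound(stmt):
    """True if the minified statement contains a compound assignment."""
    return any(op in stmt for op in COMPOUND_OPS)

def join_statements(stmts):
    """Glue minified statements together, keeping a space on either
    side of any compound-assignment statement."""
    out = []
    for stmt in stmts:
        if out and (uses_compound(stmt) or uses_compound(out[-1])):
            out.append(' ')
        out.append(stmt)
    return ''.join(out)

# join_statements(['a={1}', 'b+=(a[1])', 'b+=3']) -> 'a={1} b+=(a[1]) b+=3'
# join_statements(['a={1}', 'b=a[1]', 'b=3'])     -> 'a={1}b=a[1]b=3'
```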

I know you can't merge the whole shebang because I didn't write proper optional handling for the features I added. I only quick-modded to get luamin and build --lua working with .p8 files, for automating builds from Sublime Text.

dansanderson commented 7 years ago

Thanks very much for this juanitogan. I've since made the following changes related to this PR:

I'm not going to merge the minification changes because I'd rather do that in a different place in the code. I do think you're right that the LuaMinifyTokenWriter is not structured correctly to encapsulate the solution to this problem, but I'd like to find a way to fix it near the AST traversal code.

As for const interpolation, kudos to you for your solution. I think for picotool's base impl I'm going to pursue the technique I described in the forum to leverage the AST for identification of consts, substitution of values, and compile-time evaluation of literal + const expressions.

It'd be cool to structure the tool API so you can implement your changes as plugins. I'll give that some thought.

Thanks! -- Dan

juanitogan commented 7 years ago

Thanks for looking over my work. Your response is as expected. This was more of an easy way to share code ideas than an actual pull request. Sometimes code tells more than trying to talk it through.

I, too, had considered rewriting the P8SCII file handling to binary and/or byte strings. Semantically, however, I think p8 files really should be classified as P8SCII text files so I enjoyed the challenge of trying to solve it within Python text file handling. Of course, the absolute correct answer here would be to write a P8SCII codec for Python like what others have done with Commodore PETSCII, for example (https://github.com/gitjeff2/legacycodecs).
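The registration side of such a codec is small; the real work is the mapping table. A minimal sketch using Python's codecs machinery, with an identity table standing in for the (hypothetical) real P8SCII mapping:

```python
import codecs

# Identity table as a stand-in; a real P8SCII codec would map bytes
# 0x10-0xff to the PICO-8 glyph code points instead.
_decoding_table = ''.join(chr(i) for i in range(256))
_encoding_table = codecs.charmap_build(_decoding_table)

def _search(name):
    """Codec search function: answer only for the 'p8scii' name."""
    if name != 'p8scii':
        return None
    return codecs.CodecInfo(
        encode=lambda text, errors='strict':
            codecs.charmap_encode(text, errors, _encoding_table),
        decode=lambda data, errors='strict':
            codecs.charmap_decode(data, errors, _decoding_table),
        name='p8scii')

codecs.register(_search)

# After registration the name works anywhere a codec name is accepted:
#   open('cart.p8', encoding='p8scii'), s.encode('p8scii'), etc.
```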

The more I thought about a custom codec, however, the more I realized it would act a lot like what I was already doing with the iso-8859-1 or ascii codecs -- especially once the escape-character handling is taken out of the picture. That led to thinking more about 8859 and wondering whether I had really closed the book on testing it. When I first tested 8859 it was early on, before I had nailed down all the other settings and fixes -- or my testing, for that matter. So I can't quite remember why I didn't like its behavior back then. Anyhow, I went back and retested 8859 with default error handling (strict) and it works fine. Ack. My test files are pretty thorough right now and they like it. Just to be sure, I turned errors='backslashreplace' back on and turned _process_token's \0?? and \x00 processing off to see if the codec was replacing any characters on read. It wasn't.

Thus, it seems to me, iso-8859-1 should be good to go for P8SCII processing without any special consideration, should you want to revert to text handling. It is just up to you how to handle embedded extended P8SCII chars (❤️) vs "\135" vs "\x87" -- probably a user switch to set which of the three forms everything gets processed into.
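Such a switch could normalize embedded extended chars into whichever spelling the user picks. A sketch (the function name and `form` values are mine; a complete converter would also parse existing "\135"/"\x87" escapes back out, which this one doesn't):

```python
import re

def normalize_extended(source, form='decimal'):
    """Rewrite embedded extended P8SCII chars (code points 128-255, as
    read via iso-8859-1) into one chosen spelling: Lua decimal escape
    ('decimal'), hex escape ('hex'), or left embedded ('raw')."""
    def repl(match):
        code = ord(match.group(0))
        if form == 'decimal':
            return '\\%d' % code      # e.g. \135
        if form == 'hex':
            return '\\x%02x' % code   # e.g. \x87
        return match.group(0)         # 'raw': keep the embedded char
    return re.sub(r'[\x80-\xff]', repl, source)

# normalize_extended('print"\x87"', 'decimal') -> 'print"\\135"'
# normalize_extended('print"\x87"', 'hex')     -> 'print"\\x87"'
```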

Anyhow, thanks again for this tool and I look forward to seeing your future changes with it. Good luck with the AST work that's ahead of you. Fixing that would be way over my head right now.