MCUdude / MegaCoreX

An Arduino hardware package for ATmega4809, ATmega4808, ATmega3209, ATmega3208, ATmega1609, ATmega1608, ATmega809 and ATmega808
GNU Lesser General Public License v2.1
239 stars 47 forks source link

Prevent corruption of disassembler listing command for paths w/ spaces #159

Closed per1234 closed 1 year ago

per1234 commented 1 year ago

MegaCoreX uses a build hook to save a disassembler listing to the build folder when compiling sketches.

Due to the use of a redirection operator in the command, it was necessary to execute it via the Windows command interpreter by passing it as an argument to cmd /c.

Although paths with spaces can be used without difficulty in pattern generated commands executed directly by the Arduino framework, things are more complicated when it comes to executing commands indirectly via cmd /c.

The cmd /c command has an odd handling of quotes:

https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/cmd#remarks

Under certain conditions, the first and last quotation mark of the command passed via the argument to /c are stripped.

These conditions occur for the command generated via the platform's recipe.hooks.objcopy.postobjcopy.1.pattern.windows property when the both the path to avr-objdump and the temporary build path contain a space.

The removal of the essential quotes from the start and end of the command executed by cmd /c causes an error. for example, if the avr-objdump path was:

C:\Users\username with spaces\AppData\Local\\Arduino15\packages\arduino\tools\avr-gcc\7.3.0-atmel3.6.1-arduino7\bin\avr-objdump

The command will fail with the error:

'C:\Users\username' is not recognized as an internal or external command, operable program or batch file.

One of the required conditions for the unwanted quote stripping is that the first character of the argument to cmd /c is a double quote. So this unwanted behavior can be avoided by ensuring the first character is not a double quote. That is accomplished here by adding a dummy command (adds a blank line to the output, but the most suitable "no operation" candidate I found) that does not require quoting at the start of the argument.

Demo of bug

$ ./arduino-cli version
arduino-cli.exe  Version: 0.27.1 Commit: a900cfb2 Date: 2022-09-18T15:57:51Z

$ export ARDUINO_DIRECTORIES_DATA="/tmp/arduino-cli-directories/has space/data" # Cause avr-objdump to be installed under a path with spaces

$ ./arduino-cli core install MegaCoreX:megaavr --additional-urls https://mcudude.github.io/MegaCoreX/package_MCUdude_MegaCoreX_index.json

[...]

$ ./arduino-cli sketch new "/tmp/FooSketch"

Sketch created in: C:\Users\per\AppData\Local\Temp\FooSketch

$ ./arduino-cli compile --fqbn MegaCoreX:megaavr:4809 --verbose --build-path "/tmp/arduino-cli-directories/has space/build" "/tmp/FooSketch"

[...]

cmd /C "C:\\Users\\per\\AppData\\Local\\Temp\\arduino-cli-directories\\has space\\data\\packages\\arduino\\tools\\avr-gcc\\7.3.0-atmel3.6.1-arduino7/bin/avr-objdump" --disassemble --source --line-numbers --demangle --section=.text "C:\\Users\\per\\AppData\\Local\\Temp\\arduino-cli-directories\\has space\\build/FooSketch.ino.elf" > "C:\\Users\\per\\AppData\\Local\\Temp\\arduino-cli-directories\\has space\\build/FooSketch.ino_atmega4809_16000000L.lst"
'C:\Users\per\AppData\Local\Temp\arduino-cli-directories\has' is not recognized as an internal or external command,
operable program or batch file.

[...]

Additional information

Alternatives considered

It might seem that a less hacky approach would be to simply wrap the entire argument to cmd /c in an extra layer of sacrificial quotes, leaving the essential quotes intact in the command that is finally executed. While that does indeed work when running the command directly from the command line, it does not work when used in a pattern in the Arduino platform configuration framework. The problem is that the string generated from the platform pattern can not be executed as is. It must be split into arguments. The list of arguments space separated, with quoting where an argument contains a space.

These arguments may contain quote characters, which must be escaped. This escaping is where the simple approach of extra quotes becomes unusable. The os/exec Go package used to handle the escaping does that in a way that is compatible with directly executing a command, but is not compatible with the legacy escaping system used by cmd:

https://pkg.go.dev/os/exec#Command

On Windows, processes receive the whole command line as a single string and do their own parsing. Command combines and quotes Args into a command line string with an algorithm compatible with applications using CommandLineToArgvW (which is the most common way). Notable exceptions are msiexec.exe and cmd.exe (and thus, all batch files), which have a different unquoting algorithm. In these or other similar cases, you can do the quoting yourself and provide the full command line in SysProcAttr.CmdLine, leaving Args empty.

Scope of bug

It seems this problem of the command having paths with spaces did not occur in Arduino IDE 1.x due to it using a Windows "8.3" format for the default build path, which does not contain spaces. Arduino CLI (and thus Arduino IDE 2.x) uses the full path format instead. However, the bug could also occur in Arduino IDE 1.x if the user sets a custom build.path with spaces.

recipe.hooks.savehex.presavehex.2.pattern.windows also uses cmd /C. However, this command is not subject to the quote stripping because it does not ever have a double quote as the first character of the argument.

Related

Originally reported at https://forum.arduino.cc/t/failure-with-ide-2-0-and-megacorex/1032918

MCUdude commented 1 year ago

Thanks for the PR and detailed explanation! Will this PR cause any issues with IDE 1.8.x? I'm not using a Windows computer myself, so It's difficult for me to properly test this.

per1234 commented 1 year ago

Will this PR cause any issues with IDE 1.8.x?

No. Here are some successful compilations for a board of the patched version of MegaCoreX with Arduino IDE 1.8.19:

No spaces in paths:

cmd /C echo. && "C:\\Users\\per\\AppData\\Local\\Arduino15\\packages\\arduino\\tools\\avr-gcc\\7.3.0-atmel3.6.1-arduino7/bin/avr-objdump" --disassemble --source --line-numbers --demangle --section=.text "C:\\Users\\per\\AppData\\Local\\Temp\\arduino_build_812948/sketch_sep26a.ino.elf" > "C:\\Users\\per\\AppData\\Local\\Temp\\arduino_build_812948/sketch_sep26a.ino_atmega4809_16000000L.lst"

"C:\\Users\\per\\AppData\\Local\\Arduino15\\packages\\arduino\\tools\\avr-gcc\\7.3.0-atmel3.6.1-arduino7/bin/avr-size" -A "C:\\Users\\per\\AppData\\Local\\Temp\\arduino_build_812948/sketch_sep26a.ino.elf"
Sketch uses 470 bytes (0%) of program storage space. Maximum is 49152 bytes.
Global variables use 4 bytes (0%) of dynamic memory, leaving 6140 bytes for local variables. Maximum is 6144 bytes.

Space in compiler.path:

cmd /C echo. && "E:\\arduino\\ide\\arduino 1.8.19\\portable\\packages\\arduino\\tools\\avr-gcc\\7.3.0-atmel3.6.1-arduino7/bin/avr-objdump" --disassemble --source --line-numbers --demangle --section=.text "C:\\Users\\per\\AppData\\Local\\Temp\\arduino_build_432591/sketch_sep26a.ino.elf" > "C:\\Users\\per\\AppData\\Local\\Temp\\arduino_build_432591/sketch_sep26a.ino_atmega4809_16000000L.lst"

"E:\\arduino\\ide\\arduino 1.8.19\\portable\\packages\\arduino\\tools\\avr-gcc\\7.3.0-atmel3.6.1-arduino7/bin/avr-size" -A "C:\\Users\\per\\AppData\\Local\\Temp\\arduino_build_432591/sketch_sep26a.ino.elf"
Sketch uses 470 bytes (0%) of program storage space. Maximum is 49152 bytes.
Global variables use 4 bytes (0%) of dynamic memory, leaving 6140 bytes for local variables. Maximum is 6144 bytes.

Spaces in compiler.path and build.path

cmd /C echo. && "E:\\arduino\\ide\\arduino 1.8.19\\portable\\packages\\arduino\\tools\\avr-gcc\\7.3.0-atmel3.6.1-arduino7/bin/avr-objdump" --disassemble --source --line-numbers --demangle --section=.text "e:\\deleteme\\build path/sketch_sep26a.ino.elf" > "e:\\deleteme\\build path/sketch_sep26a.ino_atmega4809_16000000L.lst"

"E:\\arduino\\ide\\arduino 1.8.19\\portable\\packages\\arduino\\tools\\avr-gcc\\7.3.0-atmel3.6.1-arduino7/bin/avr-size" -A "e:\\deleteme\\build path/sketch_sep26a.ino.elf"
Sketch uses 470 bytes (0%) of program storage space. Maximum is 49152 bytes.
Global variables use 4 bytes (0%) of dynamic memory, leaving 6140 bytes for local variables. Maximum is 6144 bytes.

As I mentioned in the PR description, the bug did not occur when using Arduino IDE 1.x because it seems it uses the 8.3 format for the path of the system temporary folder in the default build.path, as you can see in this forum post (note the RICHAR~1):

cmd /C "C:\\Users\\Richard Marantz\\AppData\\Local\\Arduino15\\packages\\arduino\\tools\\avr-gcc\\7.3.0-atmel3.6.1-arduino7/bin/avr-objdump" --disassemble --source --line-numbers --demangle --section=.text "C:\\Users\\RICHAR~1\\AppData\\Local\\Temp\\arduino_build_534053/sketch_sep17b.ino.elf" > "C:\\Users\\RICHAR~1\\AppData\\Local\\Temp\\arduino_build_534053/sketch_sep17b.ino_atmega4809_16000000L.lst"

However, you can see that it is even possible to have spaces in build.path when the user sets a custom value either via preferences.txt or the Arduino IDE 1.x command line, as I did in the last output I shared above. If I compile for a board of the unpatched MegaCoreX board with Arduino IDE 1.8.19 in that environment the same error occurs:

cmd /C "E:\\arduino\\ide\\arduino 1.8.19\\portable\\packages\\arduino\\tools\\avr-gcc\\7.3.0-atmel3.6.1-arduino7/bin/avr-objdump" --disassemble --source --line-numbers --demangle --section=.text "e:\\deleteme\\build path/sketch_sep26a.ino.elf" > "e:\\deleteme\\build path/sketch_sep26a.ino_atmega4809_16000000L.lst"
'E:\arduino\ide\arduino' is not recognized as an internal or external command,
operable program or batch file.
exit status 1
Error compiling for board ATmega4809.

So even though it is more likely to occur with Arduino CLI or Arduino IDE 2.x, it is not specific to those tools. They are both using the same Windows command interpreter that is the origin of the problem, and even the inapplicability of the potential alternative solution I mentioned remains the same because the quoting is handled by the same Go package even in Arduino IDE 1.x (via arduino-builder). I chose Arduino CLI for the demo only because it makes it very simple to produce a command that contains spaces in the paths even for those of use who aren't so brave as to use spaces in our Windows user names.

MCUdude commented 1 year ago

Thanks for your detailed description and contribution! I'll assume this affects my other cores as well?

per1234 commented 1 year ago

You are welcome.

this affects my other cores as well?

That is correct. It also affects the @SpenceKonde cores.

Would you like me to submit the same fix for the others? I figured it was best to start with a single PR so any discussion and review could be focused in one place, and picked this one since it was the origin of the report.

MCUdude commented 1 year ago

Would you like me to submit the same fix for the others?

If you have time, that would be great! I'm a bit busy at the moment, so PRs for the other cores are more than welcome!

SpenceKonde commented 1 year ago

So what do I need to do to fix this?

It sounds like the fix is that everywhere I do cmd /C in platform.txt followed by a quoted string, I need to change that to cmd /C echo. && - correct? (I see 4 instances of it - once each to save the assembly listing and memory map, and then again once for each to copy the two files into the sketch folder, with the menu settings encoded in the file name (that way, you can compare assembly listings for two different parts or submenu settings without having to rename one of the files, which gets very awkward if you're switching back and forth tracking down a bug), and also makes it easier to spot user error (where they have a menu option set that has consequences they don't expect)

per1234 commented 1 year ago

everywhere I do cmd /C in platform.txt followed by a quoted string, I need to change that to cmd /C echo. && - correct?

That is correct. Even though the mechanism behind the bug is fairly complex (at least to me) and obscure, fortunately the fix/workaround is quite simple and innocuous.

Since the reason for the echo. && is not at all clear, it might be worth also adding an explanatory comment as I did in this PR.