raspberrypi / pico-sdk

BSD 3-Clause "New" or "Revised" License
3.62k stars 902 forks source link

pico-examples pico_w httpd build fails on Windows -- makefsdata.py is generating invalid variable name #1793

Open ndabas opened 1 month ago

ndabas commented 1 month ago

When trying to do a full build of pico-examples on Windows, it fails on the httpd example with:

In file included from C:/Program Files/Raspberry Pi/Pico SDK v2.0.0/pico-sdk/lib/lwip/src/apps/http/fs.c:39:
P:/pico-examples/build/pico_w/wifi/httpd/generated/pico_fsdata.inc:199:36: error: stray '\' in program
  199 | static const unsigned char data_img\rpi_png[] = {
      |                                    ^

So there's a \ in the variable name.

The Python script that's generating this file is assuming UNIX-style path separators.

The fix would be to simply use os.sep instead of "/". Let me know if this sounds reasonable and I can open a PR for it.

peterharperuk commented 4 weeks ago

The code tries to replace invalid characters with "_". I guess it needs to do this for backslash as well as forward slash?

    # make a variable name
    var_name = str(file.relative_to(input_dir))
    var_name = var_name.replace(".", "_")
    var_name = var_name.replace("/", "_")
    data_var = f"data_{var_name}"
    file_var = f"file_{var_name}"

Apologies, I thought I tested this on Windows. PR's are welcome.

ndabas commented 4 weeks ago

Thanks for looking into this.

I guess it needs to do this for backslash as well as forward slash?

Yes, like I mentioned in my comment, the easy way would be to just use os.sep as the search token. However, since these are supposed to be valid variable names, there are lots of other characters which are valid in filenames but not in variable names. So maybe we can make it a bit more foolproof by substituting something like r"[^a-zA-Z_\d]+" in the string with _ to get valid characters?

In addition -- after replacement, in some cases, we might end up with the same variable names, so we'll need to track and maybe add a suffix there? For example, if you have files named a.txt and a_txt, the current code will generate duplicate variable names like this.

(There are other cases too, like variable names cannot start with numbers, or reserved words cannot be used -- but those are already mitigated by prefixing data_/file_ to the var names.)

Let me know if this idea sounds fine and I will create a PR for it.

lurch commented 4 weeks ago

So maybe we can make it a bit more foolproof by substituting something like r"[^a-zA-Z_\d]+" in the string with _ to get valid characters?

...which is exactly equivalent to r"\W" ? https://docs.python.org/3/library/re.html#regular-expression-syntax :wink:

ndabas commented 4 weeks ago

...which is exactly equivalent to r"\W" ? https://docs.python.org/3/library/re.html#regular-expression-syntax 😉

Yep, that would be a good idea, I personally stay away from character classes which morph based on the locale in use -- but I guess r"\W+" with the re.ASCII flag will do nicely in this case.