olofk / fusesoc

Package manager and build abstraction tool for FPGA/ASIC development
BSD 2-Clause "Simplified" License
1.17k stars 242 forks source link

Allow Python to choose generator interpreter using shutil parameter t… #647

Closed cr1901 closed 1 year ago

cr1901 commented 1 year ago

…o generators.

Rationale is that I was running fusesoc from a virtual environment on Windows, and needed fusesoc to call python to generate a core for me.

Unfortunately, due to some wonderful design choices in Windows that I don't really understand, even with the PATH correctly set in the virtual environment, fusesoc was invoking my system python, instead of the virtual environment one. This caused my generators using a new amaranth feature to fail because I don't have the corresponding commits on the amaranth installed on my system (and checking out that version and pip install -e . just papers over what I think is a path search issue).

shutil is also apparently the appropriate way to work around path issues like mine. See the warning in the Popen constructor.

olofk commented 1 year ago

Interesting design choice indeed. This looks like a great improvement. One question though. Do we ever need the old behavior or can we just skip the shutil parameter and always use the new mode?

cr1901 commented 1 year ago

Do we ever need the old behavior or can we just skip the shutil parameter and always use the new mode?

I would be quite happy with skipping the shutil parameter and always using the new mode if you're okay with it. I just suspect that changing the path search algorithm from the system to Python is technically a breaking change. But if you're fine w/ that I will update the PR accordingly.

Also: Is a change breaking if no one depends on the behavior? It is probably fine to change the behavior and make a note that Python's shutil.which may differ from the system in obscure ways. FWIW, SearchPathW does return my system Python even within a virtual environment w/ the PATH correctly set.

olofk commented 1 year ago

Could you just quickly explain what would be the difference so that I don't misunderstand? If I understood correctly it would only affect Windows users, and would use the binary found with where before it search in current directory, windows system dirs, etc. If so, then I think we can count it as a bug fix. Even if it's technically breaks backwards-incompatiblity, the old behavior was never the intended one. And I actually wonder if this could explain some other weird bug reports I have seen over the years.

cr1901 commented 1 year ago

Yea, I kinda minddumped just to make sure I didn't forget to write stuff down. I've rearranged it into sections to hopefully make it more readable.

General Problem

Could you just quickly explain what would be the difference so that I don't misunderstand? If I understood correctly it would only affect Windows users, and would use the binary found with shutil.which before it search in current directory, windows system dirs, etc.

This is correct, w/ my bold tweaks. I want to clarify that shutil.which uses different, Python-specific logic from the system to find where a binary lives. The system/OS decides what binary to find/use when Popen is called without a path, and shutil.which will provide an absolute path. On Windows, the decision on which binary the kernel decides to use can be queried using SearchPathW. I don't know how to query the same information on POSIX, so I don't know for sure whether shutil.which and a *nix kernel binary search can mismatch or not. But... see "Bug Fix" section; shutil.which can't hurt.

My Case

On my Windows system, the kernel's decision for which python binary does not match shutil.which("python")'s decision. This includes inside a virtual environment where my PATH variable is correctly set to favor the virtual environment python. But if you asked me to explain why the don't match, I don't have a great answer, and I've kinda lost interest :P. I defer to the issue I linked for future me to figure it out.

Let's Call This A Bug Fix

If so, then I think we can count it as a bug fix. Even if it's technically breaks backwards-incompatiblity, the old behavior was never the intended one. And I actually wonder if this could explain some other weird bug reports I have seen over the years.

In both Windows and POSIX cases, the Popen constructor recommends documentation explicitly suggests using shutil.which to avoid the mismatches I describe above. So yea, I can justify "this was never intended to work in the first place" as a reason to (minimally :D) break back-compat.

olofk commented 1 year ago

All good then. Just switch over unconditionally to the shutil behavior and I'll merge it. It's a bug fix!

cr1901 commented 1 year ago

Force-pushed (and tested on my end). Idk what the lint/pre-commit failure is from. Maybe it's transient? RtD build I assume you already know about and it'll be fixed when you get the chance :P.

olofk commented 1 year ago

All good. Thanks for the fix. I found the cause of the linting failure. It's because I added a core description file with a yaml syntax error to check that FuseSoC handles that correctly and pre-commit picks it up as a real error. Doh!

olofk commented 1 year ago

Yep. Fixed the lint issue. Thank you for your contributions. Picked and pushed!