dlang / dub

Package and build management system for D
MIT License
673 stars 230 forks source link

Properly fix issue #2691 by reverting to the original scanning behavior #2865

Closed s-ludwig closed 7 months ago

s-ludwig commented 7 months ago

Reverts to the scanning/warning behavior that was in place before the warnings for the new package directory structure were implemented.

This reverts #2780.

github-actions[bot] commented 7 months ago

✅ PR OK, no changes in deprecations or warnings

Total deprecations: 0

Total warnings: 0

Build statistics:

 statistics (-before, +after)
 executable size=5368480 bin/dub
 rough build time=62s
Full build output ``` DUB version 1.35.1, built on Jan 6 2024 LDC - the LLVM D compiler (1.36.0): based on DMD v2.106.1 and LLVM 17.0.6 built with LDC - the LLVM D compiler (1.36.0) Default target: x86_64-unknown-linux-gnu Host CPU: znver3 http://dlang.org - http://wiki.dlang.org/LDC Registered Targets: aarch64 - AArch64 (little endian) aarch64_32 - AArch64 (little endian ILP32) aarch64_be - AArch64 (big endian) amdgcn - AMD GCN GPUs arm - ARM arm64 - ARM64 (little endian) arm64_32 - ARM64 (little endian ILP32) armeb - ARM (big endian) avr - Atmel AVR Microcontroller bpf - BPF (host endian) bpfeb - BPF (big endian) bpfel - BPF (little endian) hexagon - Hexagon lanai - Lanai loongarch32 - 32-bit LoongArch loongarch64 - 64-bit LoongArch mips - MIPS (32-bit big endian) mips64 - MIPS (64-bit big endian) mips64el - MIPS (64-bit little endian) mipsel - MIPS (32-bit little endian) msp430 - MSP430 [experimental] nvptx - NVIDIA PTX 32-bit nvptx64 - NVIDIA PTX 64-bit ppc32 - PowerPC 32 ppc32le - PowerPC 32 LE ppc64 - PowerPC 64 ppc64le - PowerPC 64 LE r600 - AMD GPUs HD2XXX-HD6XXX riscv32 - 32-bit RISC-V riscv64 - 64-bit RISC-V sparc - Sparc sparcel - Sparc LE sparcv9 - Sparc V9 spirv32 - SPIR-V 32-bit spirv64 - SPIR-V 64-bit systemz - SystemZ thumb - Thumb thumbeb - Thumb (big endian) ve - VE wasm32 - WebAssembly 32-bit wasm64 - WebAssembly 64-bit x86 - 32-bit X86: Pentium-Pro and above x86-64 - 64-bit X86: EM64T and AMD64 xcore - XCore Upgrading project in /home/runner/work/dub/dub/ Starting Performing "release" build using /opt/hostedtoolcache/dc/ldc2-1.36.0/x64/ldc2-1.36.0-linux-x86_64/bin/ldc2 for x86_64. Building dub 1.36.0+commit.102.g0a1b8db9: building configuration [application] Linking dub STAT:statistics (-before, +after) STAT:executable size=5368480 bin/dub STAT:rough build time=62s ```
s-ludwig commented 7 months ago

Maybe this should go into stable?

Geod24 commented 7 months ago

@s-ludwig : If we go with the add-path description you put here: https://github.com/dlang/dub/pull/2780#issuecomment-1949205527

It is meant to pick up your cloned working copies of projects that are for example in your "dev" folder.

Why would one use add-path for this as opposed to path dependencies ?

Also add-path is a major footgun. If you are using it with version/branch dependencies, you need to know that you have to add the undocumented version field in your package recipe for it to be picked up. If you don't, it will default to master, so your project versions needs to specify master in which case you would be better off with a path dependency.

Another approach we could take is that packages in add-path actually match any version. We have a use-case for this at Symmetry. But having it work just like another location has too many gotchas.

s-ludwig commented 7 months ago

@s-ludwig : If we go with the add-path description you put here: #2780 (comment)

It is meant to pick up your cloned working copies of projects that are for example in your "dev" folder.

Why would one use add-path for this as opposed to path dependencies ?

Because add-path doesn't require you to make any changes to the packages, no matter if you are working with cached packages, checked out versions, or where exactly you want to check them out.

Also add-path is a major footgun. If you are using it with version/branch dependencies, you need to know that you have to add the undocumented version field in your package recipe for it to be picked up. If you don't, it will default to master, so your project versions needs to specify master in which case you would be better off with a path dependency.

No, that's wrong. It will query the git branch and latest tag to determine a version. Since add-path directories have precedence over cached packages, this allows you to work seamlessly on a version branch, add new commits, while dependent packages still pick it up when building.

Another approach we could take is that packages in add-path actually match any version. We have a use-case for this at Symmetry. But having it work just like another location has too many gotchas.

I'm using the existing approach (as well as anyone I'm working together with) since a long time and it's absolutely vital to my workflow. It simply works as expected.

By the way, if someone doesn't use add-path, there is no reason why that would have to impact package lookup performance, right? As far as I see it, this feature is really a completely isolated use case that got conflated with the cache directory purely coincidentally, based on the fact that it got implemented with the same logic. In perspective, the code for handling cached packages should simply be moved out to a different class/struct as soon as the forward probing for the new structure gets implemented.

Geod24 commented 7 months ago

It will query the git branch and latest tag to determine a version. Since add-path directories have precedence over cached packages, this allows you to work seamlessly on a version branch, add new commits, while dependent packages still pick it up when building.

Only because add-path-sourced package have their own matching scheme that is different (laxer) than the regular matching scheme. And that only works if the repository is under git version control. Someone that expects add-path to "just add a path" that would behave as their $HOME/.dub/packages/ would hit some surprising behaviors.

This doesn't mean the feature is not valuable, obviously you're obviously a prime user of it, but it has its limitations. Another one that comes to mind immediately is that you would have to ensure, as a user, that your package is checked out under the right directory name if you're working on packages that leak outside their directory (arsd, ae). It would work, but that's another potential issue you have to be aware of.

By the way, if someone doesn't use add-path, there is no reason why that would have to impact package lookup performance, right? As far as I see it, this feature is really a completely isolated use case that got conflated with the cache directory purely coincidentally, based on the fact that it got implemented with the same logic. In perspective, the code for handling cached packages should simply be moved out to a different class/struct as soon as the forward probing for the new structure gets implemented.

Correct, I was actually looking at that this week.

s-ludwig commented 7 months ago

Only because add-path-sourced package have their own matching scheme that is different (laxer) than the regular matching scheme. And that only works if the repository is under git version control. Someone that expects add-path to "just add a path" that would behave as their $HOME/.dub/packages/ would hit some surprising behaviors.

This is literally the way this is meant to be used. Requiring the use of SCM checkouts could be a change that might make sense, since I can't imaging that anyone uses it with plain copies of packages (since, as you mentioned, that doesn't work anyway due to the missing version field).

But please recognize that independent of the functional arguments, we are talking about a change that was introduced on the stable branch, on a stable version track, modifying test cases, without proper discussion and without proper review. IMHO, it doesn't fit a project at this stage and exposure for things to be handled this way.

Another one that comes to mind immediately is that you would have to ensure, as a user, that your package is checked out under the right directory name if you're working on packages that leak outside their directory (arsd, ae). It would work, but that's another potential issue you have to be aware of.

That is a problem with the package setup and has nothing to do with this feature, you'll have to make sure to get that right no matter how you make it visible to DUB.

Geod24 commented 7 months ago

This is literally the way this is meant to be used.

It makes sense with your explanation, this was not obvious to me before.

since I can't imaging that anyone uses it with plain copies of packages

If you look at the tests - this is exactly what they do! I think that was an abuse of the feature used at the time the tests were written, but that is part of why I got the impression that this was only about adding "yet another path". Also note that I hadn't changed all the tests using this feature - there are too many, and I want to move some / most of them to use the new test framework (which creates a virtual filesystem in memory, making writing tests and running them much easier).

But please recognize that independent of the functional arguments, we are talking about a change that was introduced on the stable branch, on a stable version track, modifying test cases, without proper discussion and without proper review. IMHO, it doesn't fit a project at this stage and exposure for things to be handled this way.

I want to highlight that the change didn't remove any functionality. Using the age-tested structure still works but serves you a deprecation message / warning. All it did was make it work with the new structure. The test would pass before or after the change, I just wanted to reduce the number of deprecations the test suite was throwing.

That is a problem with the package setup and has nothing to do with this feature, you'll have to make sure to get that right no matter how you make it visible to DUB.

Even so, it's a footgun. The more baked-in assumption we make in how a feature is used, the most likely it is to be misused.

Note that my messages were not intended to challenge the need for a change in the direction in that PR. I just wanted to build a better understanding so I know what are the requirements around this feature in the future. I will try to review this ASAP.

s-ludwig commented 7 months ago

Fair enough, I do recognize that this feature is not well communicated as it stands and I should probably have done something about that a long time ago.

I want to highlight that the change didn't remove any functionality. Using the age-tested structure still works but serves you a deprecation message / warning. All it did was make it work with the new structure. The test would pass before or after the change, I just wanted to reduce the number of deprecations the test suite was throwing.

My biggest problem is a) that changing the tests like that manifests the wrong approach, making it difficult to recognize and correct later, and b) that now a harmless bug (spurious warning) is turned into either a regression (when it gets reverted later) or introduces a use(case)less (and error prone*) feature that needs to stay supported for a long time. But mainly all I'm saying is that the way this got introduced into the code base is rather unfortunate, even though I'm generally all for reducing unneeded friction in the development process.

Even so, it's a footgun. The more baked-in assumption we make in how a feature is used, the most likely it is to be misused.

As far as I see it, the fact that an import path is set outside of the repository/package directory and the reliance on a particular name for folder where the code resides is the real issue/footgun here. I can understand how this is both a historical issue and how it is a nice setup for the respective authors of those packages. However, for a package that is originally intended for public consumption (which was not the case here, AFAIK), where it is uncommon to integrate packages using git submodule, it would be a rather strange/problematic choice. For that reason I'm certainly a little biased against giving this use case more priority than necessary - its more legacy support than an endorsed use case. For internal use of such packages, going the git submodule route sounds like a completely viable solution, though.

By the way, if my wording came across as harsh at some point, that was not my intention. This particular feature just happens to repeatedly fall victim to collateral damage, which admittedly is quite annoying and should ideally be prohibited by proper test cases. I will put writing better documentation and test cases on my TODO list, to at least have done my part.

* On my setup, I had a bunch of errors being printed, because some repositories have sub folders that match the NAME/VERSION/NAME pattern and it tries to load those sub folders as packages. So this part also actually is an additional regression over the previous behavior.

WebFreak001 commented 7 months ago

anything to improve on the documentation from this PR?

WebFreak001 commented 7 months ago

e.g. in

s-ludwig commented 7 months ago

@WebFreak001 https://dub.pm/dub-reference/dependencies/#dub-add-path looks like a good place for the detailed description, and I'd probably put a link to that page in the CLI documentation of dub add-path (plus remove the mention of the "version" field.