oss-review-toolkit / ort

A suite of tools to automate software compliance checks.
https://oss-review-toolkit.org
Apache License 2.0

Analyzer does not allow multiple independent projects with the same type / name / version #8498

Open sschuberth opened 4 months ago

sschuberth commented 4 months ago

See

https://github.com/oss-review-toolkit/ort/blob/6fca2678dc7bb9ddcd93244a46e3655b24ed519b/analyzer/src/main/kotlin/AnalyzerResultBuilder.kt#L57-L59

The same occurs when analyzing e.g. https://github.com/aws/glide-for-redis.git, as it contains multiple (independent) Cargo.toml files with the same content, like

$ head -5 go/Cargo.toml
[package]
name = "glide-rs"
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"

$ head -5 java/Cargo.toml
[package]
name = "glide-rs"
version = "0.1.0"
edition = "2021"
license = "Apache-2.0"

sschuberth commented 4 months ago

@oss-review-toolkit/core-devs, how about if we simply add parent directory names as suffixes to the project name until the name is unique?
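To make the proposal concrete, here is a minimal Python sketch of the suffixing idea. It is not ORT code: the function name `disambiguate`, the input shape (a mapping from definition file path to declared project name), the dash separator, and the choice to append the innermost directory first are all assumptions made for illustration.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def disambiguate(projects):
    """Return a mapping from definition file path to a unique project name.

    `projects` maps a definition file path (relative to the repository root)
    to the project name declared in that file. Hypothetical helper, not ORT API.
    """
    # Parent directories of each definition file, innermost first.
    parents = {
        path: list(reversed(PurePosixPath(path).parts[:-1])) for path in projects
    }
    # Number of parent directories currently appended to each name.
    depth = {path: 0 for path in projects}

    def name_at(path):
        return "-".join([projects[path], *parents[path][:depth[path]]])

    while True:
        groups = defaultdict(list)
        for path in projects:
            groups[name_at(path)].append(path)

        # Only colliding names that still have an unused parent directory grow.
        growable = [
            path
            for paths in groups.values()
            if len(paths) > 1
            for path in paths
            if depth[path] < len(parents[path])
        ]
        if not growable:
            break
        for path in growable:
            depth[path] += 1

    return {path: name_at(path) for path in projects}


names = disambiguate({"go/Cargo.toml": "glide-rs", "java/Cargo.toml": "glide-rs"})
print(names)
```

Names that are already unique are left untouched in this sketch, and a name only grows by one directory per round, so suffixes stay as short as the collisions allow.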

mnonnenmacher commented 4 months ago

A couple of questions:

Do you propose this as a general solution or specific to Cargo?

Why add the directory names as suffixes and not prefixes? That seems unintuitive. I would rather prefix them and always take the full path, as it could otherwise be confusing. So for the example above use the names java/glide-rs and csharp/lib/glide-rs.

Should this always happen or only if there are conflicting names?
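For comparison, the prefix variant from the second question could be sketched as follows. This is only an illustration under assumptions: `prefix_with_path` and its `only_on_conflict` flag are hypothetical names, and the input is assumed to be a mapping from definition file path to declared project name. The flag corresponds to the third question, whether to rename always or only when names collide.

```python
from collections import Counter
from pathlib import PurePosixPath


def prefix_with_path(projects, only_on_conflict=False):
    """Prefix project names with the full directory of their definition file.

    Hypothetical helper, not ORT API. `projects` maps a definition file path
    (relative to the repository root) to the declared project name.
    """
    counts = Counter(projects.values())
    result = {}
    for path, name in projects.items():
        directory = PurePosixPath(path).parent
        needs_prefix = not only_on_conflict or counts[name] > 1
        # A definition file in the repository root has "." as its parent,
        # so there is nothing to prefix in that case.
        if needs_prefix and directory != PurePosixPath("."):
            result[path] = f"{directory}/{name}"
        else:
            result[path] = name
    return result


print(prefix_with_path({
    "java/Cargo.toml": "glide-rs",
    "csharp/lib/Cargo.toml": "glide-rs",
}))
```

With `only_on_conflict=True`, a project whose name is already unique keeps its declared name, which avoids renaming projects that other projects may reference.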

sschuberth commented 4 months ago

Do you propose this as a general solution or specific to Cargo?

As a general solution, see also the PIP case mentioned in the quoted TODO.

Why add the directory names as suffixes and not prefixes?

Because in the Cargo example, I find glide-rs-go / glide-rs-java to read nicer than go-glide-rs / java-glide-rs. (I probably should have said that I envisioned dashes instead of slashes as separators.)

Should this always happen or only if there are conflicting names?

Probably only if there are conflicts, as otherwise names could get unnecessarily complicated.

mnonnenmacher commented 4 months ago

I kind of like this approach, however I'm not sure about the details. For example, this approach could be difficult for package managers that support project dependencies (e.g. Maven), because those references might break if we rename projects. Could you maybe collect some more examples to show how the naming algorithm would work for repositories that are affected by this issue? That would be good input to further refine the idea.

heliocastro commented 4 months ago

Some thoughts here: are we aiming for a common global identification? Or, if this is too much, maybe instead of dashes we could use something like Gradle-style representations:

glide-for-redis.go.glide-rs:0.1.0

Isn't this a little more logical, considering that it gives us better tracking of the exact folder?
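A minimal sketch of how such a dotted identifier could be derived, assuming the repository name and the directories leading to the definition file form the namespace. The function `dotted_id` and its parameters are hypothetical and only illustrate the format shown above.

```python
from pathlib import PurePosixPath


def dotted_id(repository, definition_file, name, version):
    """Build an identifier like "<repository>.<dirs...>.<name>:<version>".

    Hypothetical helper, not ORT API. `definition_file` is the path of the
    definition file relative to the repository root.
    """
    directories = PurePosixPath(definition_file).parts[:-1]
    namespace = ".".join([repository, *directories])
    return f"{namespace}.{name}:{version}"


print(dotted_id("glide-for-redis", "go/Cargo.toml", "glide-rs", "0.1.0"))
```

Because the full folder path is part of the identifier, two definition files in different directories always get different identifiers in this scheme, at the cost of longer names.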

fviernau commented 4 months ago

I kind of like this approach, however I'm not sure about the details. For example, this approach could be difficult for package managers that support project dependencies (e.g. Maven), because those references might break if we rename projects.

IIRC, it could be analogous in GoMod.