wildfly-extras / prospero

Prospero is a tool combining Galleon feature packs and wildfly-channels to provision and update Wildfly server.
Apache License 2.0
9 stars 24 forks source link

JBEAP-27450 Normalize URL notation via URI.toURL().toExternalForm() #737

Closed TomasHofman closed 4 months ago

TomasHofman commented 4 months ago

This went bit farther than just calling URL.toExternalForm(). I tried to simplify the RepositoryDefinition class a bit.

TomasHofman commented 4 months ago

Cc @spyrkob

TomasHofman commented 4 months ago

Thanks @spyrkob for reviewing, I didn't thought of the Windows inputs. (And the Windows jobs did fail, which is good to see that we have coverage!)

TomasHofman commented 4 months ago

@spyrkob I believe I managed to sort out the Windows paths.

I think that we allow a weird mixture of things, while disallowing different things of the same kind.

For instance we have test cases where we test that pure paths (not URLs) are accepted (in this case absolute paths):

https://github.com/wildfly-extras/prospero/blob/9906b3505ebe2ae2a6b6541615b1c0bc4b7c01a6/prospero-cli/src/test/java/org/wildfly/prospero/cli/RepositoryDefinitionTest.java#L145

Different test checks that a non-URL string is refused based on having invalid format:

https://github.com/wildfly-extras/prospero/blob/9906b3505ebe2ae2a6b6541615b1c0bc4b7c01a6/prospero-cli/src/test/java/org/wildfly/prospero/cli/RepositoryDefinitionTest.java#L89

On yet different place we test that the same kind of non-URL string is refused based on the path not being an existing file:

https://github.com/wildfly-extras/prospero/blob/9906b3505ebe2ae2a6b6541615b1c0bc4b7c01a6/prospero-cli/src/test/java/org/wildfly/prospero/cli/RepositoryDefinitionTest.java#L172

From a little different angle, we are inconsistent in that we allow relative paths in URL form ("file:some/path"), but don't allow relative paths in path form ("some/path"), yet we allow absolute paths in path form ("/some/path").

I'm not sure though if we should change that due to backward compatibility. (At least we probably shouldn't make it stricter.)

TomasHofman commented 4 months ago

Updated comments in the code.

spyrkob commented 4 months ago

I think that we allow a weird mixture of things, while disallowing different things of the same kind.

@TomasHofman I agree :( Originally this only allowed URLs, later we got a request to support local files as well, hence this logic changed and got overcomplicated over time. The relative URL itself is an unexpected effect of maven library allowing it, by the time I realised this is possible, we couldn't block it easily due to backwards compatibility (at least in 1.1.x branch).

From a little different angle, we are inconsistent in that we allow relative paths in URL form ("file:some/path"), but don't allow relative paths in path form ("some/path"), yet we allow absolute paths in path form ("/some/path").

We should allow relative paths some/path as long as the file exists, this sounds like a bug

Different test checks that a non-URL string is refused based on having invalid format:

That's an incorrect test - probably hasn't been removed when support of local files has been added. Also that shows that the check in the test is not enough, the test should check the reason why the value is removed

The summary of allowed values is:

  1. A http/https URL
  2. An absolute, local file URL pointing to an existing folder
  3. A relative, local file URL pointing to an existing folder (probably can deprecate and drop this in future)
  4. An absolute local path, pointing to an existing folder
  5. A relative, local path pointing to an existing folder

On top of that, each of this can have an ID suffix <ID>:: - though I'd like to deprecate and drop that in future.

TomasHofman commented 4 months ago

I think I still need to handle a:foo/bar case.

spyrkob commented 4 months ago

I think I still need to handle a:foo/bar case.

yes of course, sorry 🤦

TomasHofman commented 4 months ago

When you do Path.of("a:foo/bar") it gives you relative path on Linux, but a valid absolute path on Windows, so it's gonna be platform specific. Check the test case. (It's bit weird but the test has to be tailored for the platform too.)

I tried to do a comparison to the original impl and I do get mostly the same results, except that the current impl allows some notations that threw exceptions before, like:

They do not seem to be very sensible, so we could just explicitly forbid any remote URLs (two slashes) in the file schema. We could also enforce only http and https for the remote URLs.

Then we have the option to throw all this away and keep the original impl, just do the bare minimum to prefer the one slash notation over the three slashes notation...

spyrkob commented 4 months ago

Thanks for the detailed summary @TomasHofman!

They do not seem to be very sensible, so we could just explicitly forbid any remote URLs (two slashes) in the file schema. We could also enforce only http and https for the remote URLs.

+1 to forbidding the remote file URLs especially if we already blocked that before. I'm thinking more and more that we should have supported http/https URLs and local file paths (without file:schema) only.

TomasHofman commented 4 months ago

Added new commit to try to enforce http / https / file schemas, and file only for local URLs (no host).