quickwit-oss / quickwit

Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
https://quickwit.io
Other
7.8k stars 315 forks source link

Usage of Path::join creates non-conform index on windows #118

Open evanxg852000 opened 3 years ago

evanxg852000 commented 3 years ago

Describe the bug On MS Windows 10, since the standard path separator is \, usage of Path::join creates indices that cannot be found on other nix* platforms. Step to reproduce:

Steps to reproduce (if applicable) Steps to reproduce the behavior:

  1. Run cargo run -- new --index-uri s3://quickwit-dev/evance/data --no-timestamp-field --doc-mapper-type wikipedia
  2. An index is create on s3 at: s3://quickwit-dev/evance\data\quickwit.json
  3. Trying to access this index via any other command on macos or ubuntu will yield IndexNotFound error.

Expected behavior Index manipulation should be interoperable across all platforms we support

System configuration: Windows 10, Rustup 1.24.3 31/05/2021 Rustc 1.52.1 09/05/2021

evanxg852000 commented 3 years ago

@fmassot @fulmicoton given that windows support / as an alternative directory separator, I think we can have a fix like this https://github.com/quickwit-inc/quickwit/pull/117/files#diff-cc59bc2ec317c814911b2794a6d2f8131d0b3c01943d3a122feac27c84c53b78R42-R45

What do you think? or do you see something that will hunt us for this decision?

fmassot commented 3 years ago

With your solution, you solve the meta path generated on windows for a S3 metastore. What will happen for a meta path generated on windows for a file metastore ?

evanxg852000 commented 3 years ago

Not sure what you mean by path generated. Do we generate the metastore path outside of the cli? The idea is to not rely on the Path.join implementation since that one gives a different separator based on the platform. instead, we use a single know separator. That is / because both platforms support it. This crate also does the same with it's own version of Path and PathBuf. https://crates.io/crates/relative-path.

fmassot commented 3 years ago

Ok I did not know that windows supports / too! Using relative-path seems a good idea but I don't know if it worth it.

for the meta_path function your modification should be ok. But for S3CompatibleObjectStorage.key it take a path in argument and the relative_path.display can break also.

I would remove these modifications from your PR "Testing on other OS & fixing bugs [W.I.P]" so that we can merge it and treat the windows issue later as it is not a priority right now.

evanxg852000 commented 3 years ago

Ok I will do that then

fulmicoton commented 3 years ago

We should probably stop using Path too. Or introduce our own path.

evanxg852000 commented 3 years ago

@fmassot @fulmicoton we need to resolve this as it's currently preventing us from having a Windows build. We could get rid of Path but let's agree since this is part of its public API.

fmassot commented 3 years ago

Remove this issue from Quickwit 0.1 as we won't support windows for this release.

gedw99 commented 2 years ago

I also need Windows btw.

@fmassot FYI - I tend to avoid using relative paths in general - almost an anti pattern, but that me. Maybe specify that only absolute paths are supported in the CLI help.

evanxg852000 commented 2 years ago

@fmassot @fulmicoton So what's the consensus on this? As it stands my guess is we now have more places to cover for this. I think we should switch to a custom Path which is a std::Path wrapper with a custom join method.

guilload commented 2 years ago

Using Path::join is fine, right?

There are two flavors of Path: posix::Path, for UNIX-like systems, and windows::Path, for Windows. The prelude exports the appropriate platform-specific Path variant.

For our case, I believe the issue is using Path::join for URIs which creates URIs like s3://bucket\key. What if we add a Uri::join method that does the right thing depending on the URI protocol and use that instead of Path in the metastore (Metastore::resolve) and storage (Storage::get/put/delete) layers?

fmassot commented 2 years ago

Yes! Now that we have the Uri struct, your solution works perfectly.