quickwit-oss / quickwit

Cloud-native search engine for observability. An open-source alternative to Datadog, Elasticsearch, Loki, and Tempo.
https://quickwit.io

Publish to cargo #296

Open fulmicoton opened 3 years ago

fulmicoton commented 3 years ago

We need to publish quickwit to crates.io.

Now that artillery-core has been published, the only roadblock is our tantivy fork. We can publish it as tantivy-quickwit if necessary, or use a git submodule.

However, that will require publishing all of the new subcrates as well (fastfield_codec, for instance). It would probably be nicer to prefix all of the tantivy-specific crates with "tantivy-".

@PSeitz what do you think?

PSeitz commented 3 years ago

Is anything else blocking the use of the original tantivy, apart from the blocked term dictionary?

Ideally, if the fork is just temporary, we would bundle tantivy-quickwit into quickwit without publishing anything to crates.io, but I don't think that's supported?

> However, that will require publishing all of the new subcrates as well (fastfield_codec, for instance). It would probably be nicer to prefix all of the tantivy-specific crates with "tantivy-".

Yes, a prefix would be good. They would also have their own release cycle in quickwit, which isn't great.

Do we currently need to publish to crates.io, in addition to the installer script?

fulmicoton commented 3 years ago

As far as I know, it is not supported if it is a separate crate.

> Do we currently need to publish to crates.io, in addition to the installer script?

It's always nice for Rustaceans to be able to use cargo install quickwit.

PSeitz commented 3 years ago

> It's always nice for Rustaceans to be able to use cargo install quickwit.

That's true, but in that case I would prefer to delay until we use the original tantivy, since that is a more efficient use of our time and the installer should also work fine.

fmassot commented 3 years ago

I agree with @PSeitz, but I don't really know how much work it would take to use the original tantivy. @PSeitz, do you have an idea?

PSeitz commented 3 years ago

Supporting the blocked term dictionary would mean moving the sstable into tantivy and adding a layer that can address both implementations. Fuzzy queries requiring a full scan should probably be disabled. I'm not sure whether merging is implemented for the current sstable, or whether there are more differences.
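
To make the "layer to address both implementations" idea concrete, here is a minimal sketch, assuming a shared trait with one implementation per dictionary format. Every name in it (`TermDictionary`, `FstTermDict`, `SSTableTermDict`, `check_fuzzy_query`) is hypothetical and not taken from tantivy or quickwit, and both dictionaries are in-memory stand-ins rather than the real on-disk formats.

```rust
use std::collections::BTreeMap;

/// Hypothetical common interface over both term dictionary implementations.
pub trait TermDictionary {
    /// Returns the ordinal of `term` if it is present in the dictionary.
    fn term_ord(&self, term: &[u8]) -> Option<u64>;
    /// Whether automaton-driven queries (e.g. fuzzy) can run without a full scan.
    fn supports_automaton_search(&self) -> bool;
}

/// Stand-in for the existing dictionary (assumption: a BTreeMap replaces the real FST).
pub struct FstTermDict {
    terms: BTreeMap<Vec<u8>, u64>,
}

impl FstTermDict {
    /// Expects `terms` to be sorted and free of duplicates.
    pub fn from_sorted_terms(terms: &[&str]) -> Self {
        let terms = terms
            .iter()
            .enumerate()
            .map(|(ord, term)| (term.as_bytes().to_vec(), ord as u64))
            .collect();
        Self { terms }
    }
}

impl TermDictionary for FstTermDict {
    fn term_ord(&self, term: &[u8]) -> Option<u64> {
        self.terms.get(term).copied()
    }
    fn supports_automaton_search(&self) -> bool {
        true
    }
}

/// Stand-in for a blocked sstable: sorted terms split into fixed-size blocks.
pub struct SSTableTermDict {
    blocks: Vec<Vec<Vec<u8>>>,
    block_len: usize,
}

impl SSTableTermDict {
    /// Expects `terms` to be sorted and free of duplicates.
    pub fn from_sorted_terms(terms: &[&str], block_len: usize) -> Self {
        assert!(block_len > 0, "block_len must be positive");
        let blocks = terms
            .chunks(block_len)
            .map(|chunk| chunk.iter().map(|t| t.as_bytes().to_vec()).collect())
            .collect();
        Self { blocks, block_len }
    }
}

impl TermDictionary for SSTableTermDict {
    fn term_ord(&self, term: &[u8]) -> Option<u64> {
        // Binary search for the first block whose last term is >= `term`.
        let block_idx = self
            .blocks
            .partition_point(|block| block.last().map_or(false, |last| last.as_slice() < term));
        let block = self.blocks.get(block_idx)?;
        // Linear scan inside the selected block only.
        let pos = block.iter().position(|t| t.as_slice() == term)?;
        Some((block_idx * self.block_len + pos) as u64)
    }
    fn supports_automaton_search(&self) -> bool {
        false
    }
}

#[derive(Debug, PartialEq)]
pub enum QueryError {
    FuzzyUnsupported,
}

/// Rejects fuzzy queries on dictionaries that would need a full scan to serve them.
pub fn check_fuzzy_query(dict: &dyn TermDictionary) -> Result<(), QueryError> {
    if dict.supports_automaton_search() {
        Ok(())
    } else {
        Err(QueryError::FuzzyUnsupported)
    }
}
```

With a layer like this, callers only depend on the trait, and the decision to disable fuzzy queries becomes a capability check instead of a hard-coded branch on the dictionary type.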

fmassot commented 3 years ago

You can look at the diff here: https://github.com/tantivy-search/tantivy/pull/1115/files

The async part should be easy to backport, so the main work concerns the sstable. As for fuzzy queries, we are currently using the tantivy grammar, which does not include this type of query; the only query type we needed to disable was RangeQuery.
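
One way to picture disabling a single query type is a validation pass over the parsed query, run before execution. The sketch below assumes a simplified, hypothetical query tree (`QueryAst`, `UnsupportedQuery`, and `validate` are illustrative names, not tantivy's or quickwit's actual types) and rejects any tree containing a range clause.

```rust
/// Hypothetical, simplified query tree; not the real tantivy grammar output.
#[derive(Debug, Clone, PartialEq)]
pub enum QueryAst {
    Term { field: String, value: String },
    Range { field: String, lower: String, upper: String },
    And(Vec<QueryAst>),
    Or(Vec<QueryAst>),
}

/// Error naming the kind of query that was rejected.
#[derive(Debug, PartialEq)]
pub struct UnsupportedQuery(pub &'static str);

/// Walks the tree and fails on the first range clause it finds.
pub fn validate(query: &QueryAst) -> Result<(), UnsupportedQuery> {
    match query {
        QueryAst::Term { .. } => Ok(()),
        QueryAst::Range { .. } => Err(UnsupportedQuery("range")),
        QueryAst::And(children) | QueryAst::Or(children) => {
            children.iter().try_for_each(|child| validate(child))
        }
    }
}
```

Keeping the check separate from parsing means the restriction can be lifted later by deleting one match arm, without touching the grammar.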