Open lintool opened 2 months ago
I initially named the route to match the PrebuiltIndexHandler since all the indexes were from there so in that case option (2) changed to "index" would be an improvement (I'm not sure why I named it "collection" at the start). I also agree that (1) is cleaner, but I'm not sure if it works for BEIR (and correct me if I'm wrong) since I don't think .flat is a model
Using the first option, you can maintain clarity and separation between the corpus and the model, making it easier to add or change models in the future. It also simplifies API usage, as users will understand that the primary resource is the collection, and the model is a configurable parameter. So I side with it, but agreed too that it is a bit of work.
I also agree that (1) is cleaner, but I'm not sure if it works for BEIR (and correct me if I'm wrong) since I don't think .flat is a model
Yea, you'll need another layer of indirection.
I'll make an executive decision of going with (2) for now, with perhaps moving to (1) further down the road... it's probably too much work right now.
@16BitNarwhal can you rename /api/collection/
to /api/index/
and let's wrap this up for now?
https://restfulapi.net/resource-naming/ https://stackoverflow.blog/2020/03/02/best-practices-for-rest-api-design/ Based on the above resources, here are some considerations for renaming the api routes:
/index/{index_name}
but we should change it to /indexes/{index_name}
, since this implies there a collection/group of indexes and we are specifically picking one (this is also a breaking change so ragnarok which relies on the api would also have to change their call)/indexes
should be used to list out all the index information. it maintains consistency as these are the indexes we are querying from /indexes/{index_name}/status
should be use to gauge the status of the index (in this case, it would be a json with the property cached)/api/v1/...
, so breaking changes to the route or api could be versioned and whoever is using the route can continue to use the old ver. or upgrade at their own paceAdding /api/v1/...
is a breaking change, but I like the idea. Let's do v1.0
instead of just v1
.
I'm okay with all the other aspects of your proposal. Let's do it!
Sounds good, will draft and work on #2525
BTW, I don't think we need to preserve the old API, so a breaking change here is fine.
We have different corpus/model combinations. For example, we have
msmarco-v1-passage
,msmarco-v1-passage.splade-pp-ed
, andmsmarco-v1-passage.cos-dpr-distil
. They are all on MS MARCO V1 Passage, but different models - the first for BM25, SPLADE, and cosDPR.For querying different corpus/model combinations, how do we want to configure the REST API route?
Options I see are:
(1) corpus in route, model as query parameter, e.g.,
/api/collection/msmarco-v1-passage/search?query={query}&model=bm25
/api/collection/msmarco-v1-passage/search?query={query}&model=splade-pp-ed
/api/collection/msmarco-v1-passage/search?query={query}&model=cos-dpr-distil
(2) corpus/model (=index) combination directly in route, e.g.,
/api/collection/msmarco-v1-passage/search?query={query}&model=bm25
/api/collection/msmarco-v1-passage.splade-pp-ed/search?query={query}
/api/collection/msmarco-v1-passage.cos-dpr-distil/search?query={query}
The first feels conceptually cleaner, but requires an extra layer of indirection to map into our final indexes.
If we go with option (2), then "collection" is misnamed, but we can change to "index"?