beir-cellar / beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
http://beir.ai
Apache License 2.0

fix: typing issue raising warnings in mypy and pylance #118

Open qherreros opened 1 year ago

qherreros commented 1 year ago

Hello,

Currently, static type checkers raise multiple warnings because Type[] is used in some places in the code. My understanding is that Type[] refers to the type of the type and not the type of the object, which explains the warnings.

a = 3         # Has type 'int'
b = int       # Has type 'Type[int]'
c = type(a)   # Also has type 'Type[int]'

It's a very small problem that doesn't impact anything apart from type checkers. Thanks for your work.
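To make the distinction concrete, here is a minimal sketch. The `Model` class and both functions are hypothetical names made up for illustration and are not taken from the beir code base; they only show how an instance annotation differs from a `Type[...]` annotation.

```python
from typing import Type


class Model:
    """Hypothetical stand-in for a model class (not part of beir)."""

    def name(self) -> str:
        return "model"


def describe_instance(model: Model) -> str:
    # Annotated with the class itself: expects an *instance* of Model.
    return model.name()


def build_from_class(model_cls: Type[Model]) -> Model:
    # Annotated with Type[Model]: expects the *class object*, which is instantiated here.
    return model_cls()


instance_result = describe_instance(Model())  # pass an instance
built = build_from_class(Model)               # pass the class itself

# A static checker such as mypy would be expected to flag the call below,
# because Model() is an instance, not a Type[Model]:
# build_from_class(Model())
```

If a parameter annotated as `Type[Model]` is actually given instances at every call site, changing the annotation to `Model` is usually what silences the warning.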

nathan-chappell commented 1 year ago

There are a number of type-related issues. For instance, while the annotation for DenseRetrievalExactSearch.search calls for top_k: List[int], this will fail if you try to pass it a list of ints. In fact, the EvaluateRetrieval class will convert your List[int] into an int for you.
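A rough sketch of this kind of mismatch, using hypothetical functions rather than the actual beir code (the names `top_scores_mismatched` and `top_scores` are assumptions for illustration only): the annotation promises a list, but the body uses the value as a single integer.

```python
from typing import List


def top_scores_mismatched(scores: List[float], top_k: List[int]) -> List[float]:
    # Hypothetical: the annotation says List[int], but the body treats top_k as an int.
    return sorted(scores, reverse=True)[:top_k]  # a type checker should complain here


def top_scores(scores: List[float], top_k: int) -> List[float]:
    # The annotation now matches how the value is actually used.
    return sorted(scores, reverse=True)[:top_k]


scores = [0.2, 0.9, 0.5]
best_two = top_scores(scores, 2)

try:
    # Following the (wrong) annotation literally fails at runtime.
    top_scores_mismatched(scores, [1, 2])
    mismatch_failed = False
except TypeError:
    # Slice bounds must be integers (or None), so a list raises TypeError.
    mismatch_failed = True
```

Here the caller who trusts the annotation gets a runtime `TypeError`, while the caller who ignores it and passes an int works fine, which is the situation described above.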

All the issues I've encountered while using beir are the kind that standard tools such as mypy or pylance make immediately obvious.

Is this a library that the original developers wish to maintain and keep active, or is this more of an exposition of the original paper? A tool like this could be quite useful in general, but it would need to be properly maintained. I don't necessarily mind putting in some effort to make fixes, but not if the repo owners don't care that much.

thakur-nandan commented 1 year ago

Hi @nathan-chappell,

It has been a busy past few months in my PhD and my part-time internship, so sadly I haven't gotten time to get back to recent changes for the BEIR repository.

I do very much care about the repository and I try to maintain the BEIR repository as much as I can. I'm currently the only developer, and the repository has been in a stale state for the past 6-8 months, I would say. I would appreciate the help if you can propose a PR addressing the type errors.

I will check in your PR to fix the type issues.

nathan-chappell commented 1 year ago

@thakur-nandan Thanks for the prompt response. I'm currently using the repo for "sanity checking" some in-house IR systems. Once I've implemented this, I'll propose some fixes or changes if I think they are reasonable.

Good luck with your PhD!