embeddings-benchmark / mteb

MTEB: Massive Text Embedding Benchmark
https://arxiv.org/abs/2210.07316
Apache License 2.0

Avoid try except when raise_errors = True #1035

Open KennethEnevoldsen opened 2 months ago

KennethEnevoldsen commented 2 months ago

Currently, when debugging, you only see the error after it is re-raised in the except clause of the main try/except block in MTEB.run. This is problematic because you lose access to the context where the bug occurred. One solution is to use the try/except block only when raise_errors is False:

# What I am proposing:
if raise_errors:
  run_eval(...)
else:
  try:
    run_eval(...)
  except Exception:
    ...

# What it is now:
try:
  run_eval(...)
except Exception as e:
  if raise_errors:
    raise e  # hard to debug, as you are not at the place where the error happened
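
For illustration, the proposed control flow could be sketched as runnable Python roughly like this. The function bodies, the `run_task` wrapper, and the "broken" task name are hypothetical stand-ins, not the actual MTEB.run implementation; only the branching on raise_errors mirrors the proposal above.

```python
import logging

logger = logging.getLogger(__name__)


def run_eval(task):
    """Hypothetical stand-in for the real evaluation call; fails for one task."""
    if task == "broken":
        raise ValueError(f"could not evaluate {task}")
    return {"task": task, "score": 1.0}


def run_task(task, raise_errors=True):
    """Run one task, wrapping it in try/except only when errors are suppressed."""
    if raise_errors:
        # No try/except on this path, so the exception propagates untouched
        # and a debugger can stop in the frame where it originated.
        return run_eval(task)
    try:
        return run_eval(task)
    except Exception as e:
        # Assumed behaviour for this sketch: log the failure and move on.
        logger.error("Error while evaluating %s: %s", task, e)
        return None
```

With this shape, `run_task("broken", raise_errors=False)` logs and returns None, while `run_task("broken")` raises the ValueError directly from `run_eval`.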

@Muennighoff and @isaac-chung, I would love your opinions on this before I implement anything.

For example, this makes #1022 really hard to debug.

isaac-chung commented 2 months ago

But the traceback should also show what other exceptions are raised. I'm not sure if I follow what the benefits are.

For #1022 the relevant part of MTEB code seems to be in:

  File "/data/niklas/mteb/mteb/abstasks/AbsTaskRetrieval.py", line 231, in load_data
    corpus, queries, qrels = HFDataLoader(
  File "/data/niklas/mteb/mteb/abstasks/AbsTaskRetrieval.py", line 96, in load
    self._load_qrels(split)
  File "/data/niklas/mteb/mteb/abstasks/AbsTaskRetrieval.py", line 175, in _load_qrels
    qrels_ds = load_dataset(

I've often added a line above the faulty line to debug with pdb: import pdb; pdb.set_trace(). You can then type a variable name and hit enter to see its value. Typing n runs the next line, and c continues execution.
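
As a complement to set_trace, here is a minimal sketch of inspecting the original failure site after the exception has been caught and re-raised. The functions `load_qrels`, `run`, and `innermost_frame_name` are hypothetical stand-ins, not MTEB code. The sketch relies on the fact that a re-raised exception keeps its earlier traceback entries, so the last entry still points at the function where the error originated.

```python
import traceback


def load_qrels():
    """Hypothetical failing function, standing in for the real loader."""
    raise KeyError("missing split")


def run(raise_errors=True):
    """Mimics the catch-then-re-raise pattern discussed in this issue."""
    try:
        load_qrels()
    except Exception as e:
        if raise_errors:
            raise e


def innermost_frame_name(exc):
    """Return the name of the function where the exception originated."""
    return traceback.extract_tb(exc.__traceback__)[-1].name


try:
    run()
except KeyError as exc:
    origin = innermost_frame_name(exc)
    print(origin)
    # For interactive inspection of that frame, one could instead call:
    # import pdb; pdb.post_mortem(exc.__traceback__)
```

This only helps when you can catch the exception yourself and start a post-mortem session; a debugger that breaks on raised exceptions would still stop at the re-raise first.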

KennethEnevoldsen commented 1 week ago

Realized I never answered this. The benefit is that, when a debugger is used, it will stop at the point where the error actually occurs rather than at the re-raise in the except clause.