pisa-engine / pypisa

A Python interface to the PISA IR engine
Apache License 2.0
Search API #3

Open elshize opened 3 years ago

elshize commented 3 years ago

Still work in progress

elshize commented 3 years ago

@JMMackenzie can you try checking out this branch and running tox? Idk why I'm getting "Python.h: No such file or directory"... The same happens with python setup.py install. Can you check whether you get the same error?

elshize commented 3 years ago

Actually nvm, I think I was missing the devel python package...
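
For anyone hitting the same "Python.h: No such file or directory" error, here is a minimal sketch of a check that can be run before building. It only asks the interpreter where it expects its C headers and tests whether Python.h is there. The helper names are made up for illustration, and the package names in the message are the usual ones on Debian-like and Fedora-like systems; yours may differ.

```python
import os
import sysconfig


def python_header_path() -> str:
    """Return the path where this interpreter expects Python.h to live."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")


def has_python_headers() -> bool:
    """True if Python.h is present, i.e. C extensions have a chance to compile."""
    return os.path.isfile(python_header_path())


if __name__ == "__main__":
    path = python_header_path()
    if has_python_headers():
        print(f"found {path}")
    else:
        # Package names vary by distribution; these are common examples only.
        print(f"missing {path}; install the Python development package "
              "(e.g. python3-dev or python3-devel)")
```

Run it with the same interpreter (or virtualenv) that tox or setup.py will use, since each interpreter has its own include directory.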

elshize commented 3 years ago

Pushed some code, but I can't get it to work yet. I get bad_alloc of all things... I tried debugging, but something weird was happening that I didn't understand. @JMMackenzie I'll try again in a spare moment, but if you have any ideas about what I messed up, let me know.

JMMackenzie commented 3 years ago

Thanks for this. Could you briefly show how you're building/testing? I just want to make sure we're doing the same thing. No rush, thanks!

elshize commented 3 years ago

I just run python setup.py install --user (you can use virtualenv or whatever you like), and then try running it. Here's an example script you might run:

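import pypisa

# Arguments are positional; each comment gives the apparent role, inferred from the value.
results = pypisa.search(
    "data/robust04.block_simdbp.idx",  # compressed index file
    "block_simdbp",                    # index encoding
    ["103"],                           # queries
    "maxscore",                        # retrieval algorithm
    10,                                # number of results (k)
    "data/robust04.bmw",               # WAND data file
    False,                             # boolean flag (meaning not stated here)
    "bm25",                            # scorer
    "data/robust04.termlex",           # term lexicon
    "porter2"                          # stemmer
)
print(results)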

I used the Robust04 queries-only index from https://github.com/osirrc/ciff/, converted it with the ciff tool, then compressed it and created the lexicon with pypisa.
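
Since the example depends on several files produced by separate steps (conversion, compression, lexicon creation), one cheap sanity check before calling into the extension is to confirm that each file exists and is non-empty. This is only a sketch: `missing_inputs` is a hypothetical helper, not part of pypisa, and nothing in this thread establishes that a missing file is what causes the bad_alloc.

```python
import os


def missing_inputs(paths):
    """Return the subset of paths that are not regular files or are empty."""
    problems = []
    for path in paths:
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            problems.append(path)
    return problems


if __name__ == "__main__":
    # File names taken from the example script above.
    inputs = [
        "data/robust04.block_simdbp.idx",
        "data/robust04.bmw",
        "data/robust04.termlex",
    ]
    for path in missing_inputs(inputs):
        print(f"missing or empty: {path}")
```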

JMMackenzie commented 3 years ago

Just a quick note for debugging, as discussed earlier. The query processor seems to be losing the scorer, resulting in a segfault, so we may need to alter the lambda captures in resolve_query_processor.hpp to avoid this.
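
When reproducing a crash like this from Python, it can help to run the failing call in a child interpreter with the standard-library faulthandler enabled, so a segfault does not take down the session you are debugging from. The sketch below assumes a POSIX system; `run_isolated` is a hypothetical helper name. Note that faulthandler only dumps the Python-level traceback, so it shows which Python call crashed, not the C++ frames inside the extension.

```python
import subprocess
import sys


def run_isolated(code: str):
    """Run `code` in a child interpreter with faulthandler enabled.

    Returns (returncode, stdout, stderr). On POSIX, a negative return
    code -N means the child was killed by signal N.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


if __name__ == "__main__":
    # Replace the snippet with the failing pypisa.search(...) call.
    rc, out, err = run_isolated("print('hello from the child')")
    print("return code:", rc)
    print(out, end="")
    if rc != 0:
        # With faulthandler on, the Python traceback at the crash is in stderr.
        print(err, file=sys.stderr)
```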

JMMackenzie commented 3 years ago

Quick update. It seems that the problem could be caused by the ScorerParams getting destroyed (somehow) before it gets into the resolve_query_processor call. You can access the members of that struct inside the engine::processor(...) call correctly, and that function actually calls resolve_query_processor, so it is somehow getting lost through there.

JMMackenzie commented 3 years ago

And a final update for now: passing ScorerParams through as a reference seems to work, but then I get a bad_alloc :joy: So that's enough for today...