stanford-futuredata / ColBERT

ColBERT: state-of-the-art neural search (SIGIR'20, TACL'21, NeurIPS'21, NAACL'22, CIKM'22, ACL'23, EMNLP'23)
MIT License

How to set up indexing in a Docker container #212

Closed ghost closed 1 year ago

ghost commented 1 year ago

Thank you for the awesome repo.

I want to build a QA system and use ColBERT as the index-retrieval model. I found an example of loading the index and performing search in a Docker container. Can you suggest what to do if I want to include the indexing part in the container as well?

It seems that indexing cannot be imported as a module. It needs to be wrapped in an if __name__ == "__main__" block, as mentioned in https://github.com/stanford-futuredata/ColBERT/issues/93. To call the indexing, I currently use subprocess.run(["python", "index.py"]), but that is not efficient.

Do you have any suggestions?

Thank you.

okhat commented 1 year ago

Why can't indexing be imported in a module? It certainly can. You just need the outermost caller to be inside a "main" if statement.

For instance, you can define a function that does the indexing and import it. Then, in the importing file, you can have the if __name__ == "__main__" check, etc.
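
A minimal sketch of that layout, shown as a single file so it runs on its own. The names build_index and main are hypothetical, and the body of build_index is only a placeholder that counts passages; in a real setup it would hold the actual ColBERT indexing call, and build_index would live in its own module that the entry-point file imports.

```python
from pathlib import Path
import tempfile


def build_index(collection_path: str, index_name: str) -> dict:
    """Hypothetical wrapper around the indexing step.

    Placeholder body: it only counts the passages in the collection file so
    that the sketch runs without ColBERT installed. The real indexing call
    would go here instead.
    """
    lines = Path(collection_path).read_text(encoding="utf-8").splitlines()
    return {"index_name": index_name, "num_passages": len(lines)}


def main() -> None:
    # Outermost caller. In the container this would be the entry point that
    # imports build_index from its module and calls it directly, replacing
    # the subprocess.run(["python", "index.py"]) workaround.
    with tempfile.TemporaryDirectory() as tmp:
        collection = Path(tmp) / "collection.tsv"
        collection.write_text(
            "0\tfirst passage\n1\tsecond passage\n", encoding="utf-8"
        )
        print(build_index(str(collection), "demo.index"))


# Only the outermost caller sits behind the guard; build_index itself can be
# imported freely from other files.
if __name__ == "__main__":
    main()
```

Because nothing runs at import time except the function definitions, any worker processes that re-import the file will not start the indexing again; only the process that was launched as the script reaches main().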