castorini / ura-projects

Reranking onboarding... dust off the cobwebs on the repro #8

Open lintool opened 9 months ago

lintool commented 9 months ago

This is our standard reranking onboarding: https://github.com/capreolus-ir/capreolus/blob/feature/msmarco_psg/docs/reproduction/MS_MARCO.md

No one's tried it out for a while... need someone to dust off the cobwebs

And, would be awesome to get working on colab!

lintool commented 9 months ago

Ref: #10

yilinjz commented 9 months ago

Hi Professor @lintool, I tried reproducing this on Colab.

The installation guide I followed (https://capreolus.ai/en/latest/installation.html) didn't work on Colab. The following command:
conda env create --name MyCapreolus -f environment.yml
got stuck on "Solving environment" (I waited about an hour, but it didn't finish). From what I found on Google, this seems to happen because Anaconda was trying to figure out the best way to install the packages and hit a conflict caused by outdated package versions.
I also found this installation guide (https://github.com/capreolus-ir/capreolus/blob/feature/msmarco_psg/docs/setup/setup-cc.md) for Compute Canada. Would you like me to try reproducing this on Compute Canada next, or focus on the Colab issue?

lintool commented 9 months ago

hey @crystina-z @andrewyates - What do you think?

@yilinjz yes, please proceed to the Compute Canada guide.

To help you understand what exactly it is you're doing, consult Chapter 3 here: https://arxiv.org/abs/2010.06467

What you're reproducing is known as the "monoBERT" model.

yilinjz commented 9 months ago

Got it, I'll dive into the paper. Thanks!

Also I've submitted a "creating a new account" request following (https://github.com/castorini/onboarding/blob/master/docs/cc-guide.md). My username is yilinjz (same as github).

Edit: Should I open a CC account? Or is there a public account for URAs?

andrewyates commented 9 months ago

@yilinjz conda can take a long time (hours) to resolve complicated dependencies. Rather than waiting, I'd suggest using mamba, which is a rewrite of conda that has the same API. After you install it, just replace conda with mamba in the installation commands.

You can install it as part of mambaforge here.
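As a rough sketch of the substitution described above (not taken from the Capreolus docs): the snippet below uses mamba when it is on the PATH and falls back to conda otherwise. The helper names `pick_solver` and `create_env` are hypothetical; the environment name and file are the ones from the command earlier in this thread.

```shell
# Sketch only: pick_solver and create_env are hypothetical helper names.

# Print "mamba" if it is on PATH, otherwise fall back to "conda".
pick_solver() {
  if command -v mamba >/dev/null 2>&1; then
    echo mamba
  else
    echo conda
  fi
}

# Create an environment with whichever tool was found.
# $1: environment name, $2: path to the environment file.
create_env() {
  "$(pick_solver)" env create --name "$1" -f "$2"
}
```

With these defined, `create_env MyCapreolus environment.yml` would run the same command as before, just with mamba doing the dependency solving when it is available.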

On Colab, the situation is a bit different. You need to use Colab's python, so installing conda/mamba doesn't work. You can use pip instead (pip install -r requirements.txt).

yilinjz commented 9 months ago

Hi Professor Yates @andrewyates, thank you for the tips!

Hi Professor Lin @lintool, as we discussed earlier, I was following Professor Yates's advice on Colab but got stuck on an issue (and hence switched to Compute Canada and got another issue...). I just realized that those two are essentially the same issue.

Here is the error I got on Colab: [image]. It now seems related to a version conflict between python/scipy-stack (https://github.com/capreolus-ir/capreolus/issues/208).