HazyResearch / bootleg

Self-Supervision for Named Entity Disambiguation at the Tail
http://hazyresearch.stanford.edu/bootleg
Apache License 2.0
212 stars 27 forks

Installation error #97

Closed hitercs closed 2 years ago

hitercs commented 2 years ago

Hi,

I am trying to install the latest Bootleg via

git clone git@github.com:HazyResearch/bootleg bootleg
cd bootleg
python3 setup.py install

After that, when I run the annotation-on-the-fly-example, it fails because many packages are missing, e.g., urllib3, idna, and charset_normalizer.

Could you please check the current installation guide?

Thanks a lot.
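
For readers who hit the same symptom, a quick way to see which dependencies are absent from the active environment is to probe for them without importing them. This is only a minimal sketch: the helper name `missing_modules` is hypothetical, and the module names are the ones listed in the report above.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names taken from the report above.
print(missing_modules(["urllib3", "idna", "charset_normalizer"]))
```

An empty list means all three are importable; any name that is printed was not installed into the environment that is running the script.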

lorr1 commented 2 years ago

Hello,

Thanks for bringing this up. Did you see any errors with the setup.py command? Perhaps something wasn't installed correctly, which prevented the rest of the commands from going through? Any log dump would be great for me to start debugging.

hitercs commented 2 years ago

Hi @lorr1,

The error I got from the setup.py command is:


...
Installed /anaconda/envs/bootleg2/lib/python3.8/site-packages/threadpoolctl-3.0.0-py3.8.egg
Searching for joblib>=0.11
Reading https://pypi.org/simple/joblib/
Downloading https://files.pythonhosted.org/packages/3e/d5/0163eb0cfa0b673aa4fe1cd3ea9d8a81ea0f32e50807b0c295871e4aab2e/joblib-1.1.0-py2.py3-none-any.whl#sha256=f21f109b3c7ff9d95f8387f752d0d9c34a02aa2f7060c2135f465da0e5160ff6
Best match: joblib 1.1.0
Processing joblib-1.1.0-py2.py3-none-any.whl
Installing joblib-1.1.0-py2.py3-none-any.whl to /anaconda/envs/bootleg2/lib/python3.8/site-packages
Adding joblib 1.1.0 to easy-install.pth file

Installed /anaconda/envs/bootleg2/lib/python3.8/site-packages/joblib-1.1.0-py3.8.egg
error: typing-extensions 4.0.0 is installed but typing-extensions<4.0.0,>=3.7.4 is required by {'rich'}


Thanks!

lorr1 commented 2 years ago

Thanks! I just pushed a small change to setup.py on the master branch that fixed it for me. Let me know if you still have issues but try to pull and reinstall.

hitercs commented 2 years ago

Hi @lorr1,

Thanks! I pulled the changes and reinstalled, but it shows another error:


error: numpy 1.21.4 is installed but numpy<1.21,>=1.17 is required by {'numba'}


I am using Python 3.8.

lorr1 commented 2 years ago

Did you try a new environment first?

hitercs commented 2 years ago

Yes, I've double-checked it and installed it on another machine. Both show the error message:


error: numpy 1.22.0rc1 is installed but numpy<1.21,>=1.17 is required by {'numba'}
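
To make the conflict concrete: the requirement reported for numba is `numpy<1.21,>=1.17`, and both 1.21.4 and 1.22.0rc1 fall outside it. The sketch below checks this with the standard library only. The helpers `release_tuple` and `in_range` are hypothetical, and the comparison looks only at the leading release numbers, so it does not follow the full pre-release ordering rules; the `packaging` library's `SpecifierSet` is the more complete tool where it is available.

```python
import re


def release_tuple(version):
    """Leading numeric components of a version string, e.g. '1.22.0rc1' -> (1, 22, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def in_range(version, lower, upper):
    """True if lower <= version < upper, comparing release numbers only."""
    parts = [release_tuple(v) for v in (version, lower, upper)]
    width = max(len(p) for p in parts)
    # Pad with zeros so that '1.17' and '1.17.0' compare as equal.
    v, lo, hi = (p + (0,) * (width - len(p)) for p in parts)
    return lo <= v < hi


# Versions from the two error messages, plus one release inside the range.
for installed in ("1.21.4", "1.22.0rc1", "1.20.3"):
    print(installed, in_range(installed, "1.17", "1.21"))
```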


lorr1 commented 2 years ago

I'm sorry about this. Please try again; I just pushed a fix for the numpy requirements.

hitercs commented 2 years ago

Hi @lorr1, thanks. I installed it successfully. However, when I run the annotation-on-the-fly-example,

# Load new annotator with our config - notice how it does have to reprep some things
from bootleg.end2end.bootleg_annotator import BootlegAnnotator

# You can also pass `return_embs=True` to get the embeddings
ann = BootlegAnnotator(
    config=config_args, device=device, return_embs=False, verbose=False
)

It throws an ImportError:


bootleg/bootleg/tasks/ned_task.py in
      3 import torch.nn.functional as F
      4 from emmental.scorer import Scorer
----> 5 from emmental.task import Action, EmmentalTask
      6 from torch import nn
      7 from transformers import AutoModel

ImportError: cannot import name 'Action' from 'emmental.task' (/anaconda/envs/bootleg/lib/python3.8/site-packages/emmental-0.0.9-py3.8.egg/emmental/task.py)


lorr1 commented 2 years ago

Hey. That's an emmental version issue. We are using a prerelease version of their master branch. Our setup.py was working w.r.t. this, but now it's not. I've contacted the repo owner to see if we can get the version released soon. In the meantime, if you run this

pip install git+https://git@github.com/senwu/emmental.git@master

After setup.py install, it should give you emmental 1.0.0 dev, which is the right version.
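
One way to confirm that the installed emmental matches what Bootleg imports is to check whether the name from the traceback is actually present. This is a rough sketch, not part of Bootleg or emmental: the helper `exports_name` is hypothetical, and the module and attribute names are copied from the ImportError above.

```python
import importlib


def exports_name(module_name, attr):
    """True if the module imports cleanly and defines the given attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers both a missing package and a failing import inside it.
        return False
    return hasattr(module, attr)


# Names taken from the traceback above; False suggests emmental is missing or too old.
print(exports_name("emmental.task", "Action"))
```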

hitercs commented 2 years ago

@lorr1 Thanks. I finally installed it successfully after running

pip install git+https://git@github.com/senwu/emmental.git@master

Then when I run

# Load new annotator with our config - notice how it does have to reprep some things
from bootleg.end2end.bootleg_annotator import BootlegAnnotator

# You can also pass `return_embs=True` to get the embeddings
ann = BootlegAnnotator(
    config=config_args, device=device, return_embs=False, verbose=False
)

It raises an error message:


/bootleg/symbols/kg_symbols.py in load_from_cache(cls, load_dir, prefix, edit_mode, verbose)
    156             first_rel = next(iter(qid2relations[first_qid].keys()))
    157             if re.match("^P[0-9]+$", first_rel):
--> 158                 raise ValueError(
    159                     "Your qid2relations dict has a relation as a PID identifier. Please replace "
    160                     "with human readable strings for training. "

ValueError: Your qid2relations dict has a relation as a PID identifier. Please replace with human readable strings for training. See https://www.wikidata.org/wiki/Wikidata:Database_reports/List_of_properties/all


I checked the qid2relations.json file; it is in this format:

qid2relations['Q31'] ->

{'P1344': ['Q1088364'], 'P1151': ['Q3247091'], 'P1546': ['Q1308013'], 'P5125': ['Q7112200'], 'P38': ['Q4916', 'Q232415'], 'P1792': ['Q7021332'], 'P2852': ['Q1061257', 'Q25648793', 'Q25648794', 'Q25648798'], 'P2853': ['Q1378312', 'Q2335536'], 'P2633': ['Q1115035'], 'P1313': ['Q213107'], 'P417': ['Q128267'], 'P17': ['Q31'], .... }
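
The check that raises the ValueError can be reproduced outside Bootleg to inspect a qid2relations dict before loading it. The sketch below reuses the regular expression shown in the traceback, but it scans every relation key rather than only the first one. The helper name `qids_with_pid_relations` and the human-readable key in `new_style` are illustrative assumptions, not taken from the Bootleg data.

```python
import re

# Same pattern as in the traceback above: a bare Wikidata property ID such as 'P17'.
PID_PATTERN = re.compile(r"^P[0-9]+$")


def qids_with_pid_relations(qid2relations):
    """Return the QIDs whose relation keys include at least one raw PID."""
    return [
        qid
        for qid, relations in qid2relations.items()
        if any(PID_PATTERN.match(rel) for rel in relations)
    ]


# A small excerpt in the old format shown above.
old_style = {"Q31": {"P17": ["Q31"], "P38": ["Q4916", "Q232415"]}}
# Hypothetical example of a human-readable relation key.
new_style = {"Q31": {"country": ["Q31"]}}

print(qids_with_pid_relations(old_style))
print(qids_with_pid_relations(new_style))
```

A non-empty result for a downloaded file suggests it is in the older PID-keyed format.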

lorr1 commented 2 years ago

So that's an older version of the entity_db data. Which AWS link did you use? The latest version of the data should not have PID values in there.

hitercs commented 2 years ago

Hi @lorr1, thanks. The AWS link in the README file works. I had used the link given in the annotation-on-the-fly example, which may be outdated.