Open agolo-alan-hogue opened 1 year ago
I learned that this is a change in numpy 1.24, so I tried downgrading it (instead of editing things as above), which yields an apparently unrelated error:
root@dbf92a2ec323:/GENRE# pip install --upgrade numpy==1.23.5
Collecting numpy==1.23.5
Downloading numpy-1.23.5-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (14.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.0/14.0 MB 68.0 MB/s eta 0:00:00
Installing collected packages: numpy
Attempting uninstall: numpy
Found existing installation: numpy 1.24.4
Uninstalling numpy-1.24.4:
Successfully uninstalled numpy-1.24.4
Successfully installed numpy-1.23.5
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[notice] A new release of pip is available: 23.0.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
root@dbf92a2ec323:/GENRE# pip show numpy
Name: numpy
Version: 1.23.5
Summary: NumPy is the fundamental package for array computing with Python.
Home-page: https://www.numpy.org
Author: Travis E. Oliphant et al.
Author-email:
License: BSD
Location: /usr/local/lib/python3.8/site-packages
Requires:
Required-by: blis, fairseq, sacrebleu, spacy, thinc
root@dbf92a2ec323:/GENRE# python3 main.py --yago -i example_article.jsonl \
-o out.jsonl --split_iter --mention_trie data/mention_trie.pkl \
--mention_to_candidates_dict data/mention_to_candidates_dict.pkl
Traceback (most recent call last):
File "main.py", line 3, in <module>
from model import Model
File "/GENRE/model.py", line 6, in <module>
from genre.fairseq_model import GENRE
File "/GENRE/genre/fairseq_model.py", line 15, in <module>
from fairseq.models.bart import BARTHubInterface, BARTModel
File "/GENRE/fairseq/fairseq/models/__init__.py", line 208, in <module>
module = importlib.import_module("fairseq.models." + model_name)
File "/usr/local/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/GENRE/fairseq/fairseq/models/wav2vec/__init__.py", line 6, in <module>
from .wav2vec import * # noqa
File "/GENRE/fairseq/fairseq/models/wav2vec/wav2vec.py", line 25, in <module>
from fairseq.tasks import FairseqTask
File "/GENRE/fairseq/fairseq/tasks/__init__.py", line 15, in <module>
from .fairseq_task import FairseqTask, LegacyFairseqTask # noqa
File "/GENRE/fairseq/fairseq/tasks/fairseq_task.py", line 13, in <module>
from fairseq import metrics, search, tokenizer, utils
ImportError: cannot import name 'metrics' from 'fairseq' (unknown location)
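An `ImportError` that ends in `(unknown location)` is often a sign that Python resolved the package name to a directory without an `__init__.py` (a namespace package) rather than to the real package. As a minimal diagnostic sketch, assuming nothing about the GENRE setup beyond the package name, the helper below (the name `describe_package` is hypothetical) reports where Python would load a top-level package from, without importing it:

```python
import importlib.util


def describe_package(name: str) -> dict:
    """Report where Python would load the top-level package `name` from.

    The package itself is not imported; only its import spec is looked up.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"found": False, "origin": None, "locations": []}
    return {
        "found": True,
        # For a regular package this is the path of its __init__.py;
        # for a namespace package it is typically None.
        "origin": spec.origin,
        # Directories that are searched for submodules such as `fairseq.metrics`.
        "locations": list(spec.submodule_search_locations or []),
    }


if __name__ == "__main__":
    # Run this from the same working directory and with the same
    # PYTHONPATH as the failing command.
    print(describe_package("fairseq"))
```

If `origin` is empty and `locations` points at the repository checkout rather than the inner package directory, adjusting `PYTHONPATH` as in the next step is a plausible fix.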
...and finally tried this:
export PYTHONPATH=$PYTHONPATH:/GENRE/fairseq
It seems like it is going to work, but:
root@dbf92a2ec323:/GENRE# export PYTHONPATH=$PYTHONPATH:/GENRE/fairseq
root@dbf92a2ec323:/GENRE# python3 main.py --yago -i example_article.jsonl -o out.jsonl --split_iter --mention_trie data/mention_trie.pkl --mention_to_candidates_dict data/mention_to_candidates_dict.pkl
load model...
1042301B [00:00, 10179472.72B/s]
456318B [00:00, 7585473.82B/s]
load data/mention_trie.pkl...
Killed
I should have enough disk space and memory, so I am not sure yet what is causing this. The file looks like it is about half a GB, and I have 16 GB of RAM.
Hello,
I set everything up on an Ubuntu machine with Docker and followed all the instructions; everything went fine. At linking time I get the same error, shown below.
sudo docker run --rm -v $PWD/data:/GENRE/data -v $PWD/models:/GENRE/models -it genre bash
root@4646a2f8bd9c:/GENRE# python3 main.py --yago -i data/agolo_v2_beta.benchmark.jsonl \
-o genre-agolo-linked-articles.jsonl --split_iter --mention_trie data/mention_trie.pkl \
--mention_to_candidates_dict data/mention_to_candidates_dict.pkl
Traceback (most recent call last):
File "main.py", line 3, in <module>
from model import Model
File "/GENRE/model.py", line 6, in <module>
from genre.fairseq_model import GENRE
File "/GENRE/genre/fairseq_model.py", line 14, in <module>
from fairseq import search, utils
File "/GENRE/fairseq/fairseq/utils.py", line 20, in <module>
from fairseq.modules.multihead_attention import MultiheadAttention
File "/GENRE/fairseq/fairseq/modules/__init__.py", line 10, in <module>
from .character_token_embedder import CharacterTokenEmbedder
File "/GENRE/fairseq/fairseq/modules/character_token_embedder.py", line 11, in <module>
from fairseq.data import Dictionary
File "/GENRE/fairseq/fairseq/data/__init__.py", line 23, in <module>
from .indexed_dataset import (
File "/GENRE/fairseq/fairseq/data/indexed_dataset.py", line 112, in <module>
6: np.float,
File "/usr/local/lib/python3.8/site-packages/numpy/__init__.py", line 305, in __getattr__
raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'float'.
`np.float` was a deprecated alias for the builtin `float`. To avoid this error in existing code, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
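For anyone who prefers patching over downgrading, the replacement that the NumPy message suggests can be scripted. This is a minimal sketch, not something taken from the GENRE or fairseq repositories: the helper name `replace_deprecated_float_alias` is hypothetical, and it assumes the code refers to NumPy as `np`. It rewrites only the bare `np.float` alias and leaves names such as `np.float32` or `np.float64` untouched.

```python
import re

# Matches the bare alias `np.float` but not `np.float32`, `np.float64` or
# `np.float_`: the trailing \b needs a non-word character (or the end of
# the text) right after "float".
_NP_FLOAT = re.compile(r"\bnp\.float\b")


def replace_deprecated_float_alias(source: str) -> str:
    """Return `source` with every bare `np.float` replaced by the builtin `float`."""
    return _NP_FLOAT.sub("float", source)


if __name__ == "__main__":
    # The line reported in the traceback (fairseq/data/indexed_dataset.py).
    print(replace_deprecated_float_alias("6: np.float,"))  # prints: 6: float,
```

Applying it to files on disk (read, rewrite, write back) is left out here; it would be worth reviewing the resulting diff before running GENRE again.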
Under the same circumstances, downgrading numpy again gives the same result:
# python -m pip install numpy==1.19.5
Collecting numpy==1.19.5
Downloading numpy-1.19.5-cp38-cp38-manylinux2010_x86_64.whl (14.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 14.9/14.9 MB 13.5 MB/s eta 0:00:00
Installing collected packages: numpy
Attempting uninstall: numpy
Found existing installation: numpy 1.24.4
Uninstalling numpy-1.24.4:
Successfully uninstalled numpy-1.24.4
Successfully installed numpy-1.19.5
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
[notice] A new release of pip is available: 23.0.1 -> 23.3.1
[notice] To update, run: pip install --upgrade pip
root@4646a2f8bd9c:/GENRE# python3 main.py --yago -i data/agolo_v2_beta.benchmark.jsonl -o genre-agolo-linked-articles.jsonl --split_iter --mention_trie data/mention_trie.pkl --mention_to_candidates_dict data/mention_to_candidates_dict.pkl
Traceback (most recent call last):
File "main.py", line 3, in <module>
from model import Model
File "/GENRE/model.py", line 6, in <module>
from genre.fairseq_model import GENRE
File "/GENRE/genre/fairseq_model.py", line 15, in <module>
from fairseq.models.bart import BARTHubInterface, BARTModel
File "/GENRE/fairseq/fairseq/models/__init__.py", line 208, in <module>
module = importlib.import_module("fairseq.models." + model_name)
File "/usr/local/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "/GENRE/fairseq/fairseq/models/wav2vec/__init__.py", line 6, in <module>
from .wav2vec import * # noqa
File "/GENRE/fairseq/fairseq/models/wav2vec/wav2vec.py", line 25, in <module>
from fairseq.tasks import FairseqTask
File "/GENRE/fairseq/fairseq/tasks/__init__.py", line 15, in <module>
from .fairseq_task import FairseqTask, LegacyFairseqTask # noqa
File "/GENRE/fairseq/fairseq/tasks/fairseq_task.py", line 13, in <module>
from fairseq import metrics, search, tokenizer, utils
ImportError: cannot import name 'metrics' from 'fairseq' (unknown location)
And again the same thing, this time on an Ubuntu machine with twice as much RAM:
# python3 main.py --yago -i data/agolo_v2_beta.benchmark.jsonl -o genre-agolo-linked-articles.jsonl --split_iter --mention_trie data/mention_trie.pkl --mention_to_candidates_dict data/mention_to_candidates_dict.pkl
load model...
1042301B [00:00, 27201737.57B/s]
456318B [00:00, 13431321.23B/s]
load data/mention_trie.pkl...
Killed
Same with alternate data:
# python3 main.py --yago -i data/agolo_v2_beta.benchmark.jsonl -o genre-agolo-linked-articles.jsonl --split_iter --mention_trie data/mention_trie.dalab.pkl --mention_to_candidates_dict data/mention_to_candidates_dict.dalab.pkl
load model...
load data/mention_trie.dalab.pkl...
load data/mention_to_candidates_dict.dalab.pkl...
Killed
Hi, I had the same numpy problems (originating from fairseq) that you described. For me everything works now with Python 3.8.12 when I run the following command to install the requirements:
torch pytest requests spacy gdown fairseq
I assume there is no longer any need to install the fairseq clone from Nicola De Cao, since this PR was merged in the official repository: https://github.com/facebookresearch/fairseq/pull/3276
Regarding RAM requirements: loading the model and the tries takes around 20 GB for me.
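A bare `Killed` right after `load data/mention_trie.pkl...` is consistent with the process running out of memory. As a rough sketch for checking this up front, assuming a Linux host and using the roughly 20 GB figure above as the threshold, the hypothetical helpers below read the total physical memory through `os.sysconf`:

```python
import os


def total_ram_gb() -> float:
    """Total physical memory in GiB, read via POSIX sysconf (Linux; not Windows)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    return page_size * page_count / 1024 ** 3


def has_enough_ram(required_gb: float) -> bool:
    """True if the machine has at least `required_gb` GiB of physical memory."""
    return total_ram_gb() >= required_gb


if __name__ == "__main__":
    # 20 is the approximate requirement reported in this thread, not a measured limit.
    print(f"total RAM: {total_ram_gb():.1f} GiB, enough: {has_enough_ram(20)}")
```

This reports the memory of the machine (or VM), not memory that is currently free, and it may not reflect a memory limit set on the Docker container, so treat a passing check as necessary rather than sufficient.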
Hello,
I got GENRE set up from your repo, but when I try to run it, I get this:
I manually replaced `np.float` with just `float`, which happens only in `fairseq`. That gives another error. I'll paste that one below. So I went to the normal GENRE repo and found this note regarding `fairseq` here:
> fairseq>=0.10 (optional for training GENRE) NOTE: fairseq is going though changing without backward compatibility. Install fairseq from source and use [this](https://github.com/nicola-decao/fairseq/tree/fixing_prefix_allowed_tokens_fn) commit for reproducibilty. See https://github.com/pytorch/fairseq/pull/3276 for the current PR that should fix fairseq/master.
So it sounds to me like if you were to pull or rebase this PR maybe this would fix these problems?
Thanks for your help!
Latest error: