interactive-cookbook / tagger-parser

Tagger and parser models for our recipes corpus (data), with pre- and post-processing scripts for data conversion (data-conversions)

Error when training the parser with the default configuration #22

Closed (danielhers closed this issue 2 years ago)

danielhers commented 2 years ago

I am running the training command from the README:

allennlp train parser/parser_config.json -s models

This fails with TypeError: an integer is required (got type bytes). I could try debugging it myself, but perhaps someone already has a solution. Here is the full traceback:

Traceback (most recent call last):
  File "/home/daniel/anaconda3/envs/recipe-parser/bin/allennlp", line 5, in <module>
    from allennlp.run import run
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/run.py", line 15, in <module>
    from allennlp.commands import main  # pylint: disable=wrong-import-position
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/commands/__init__.py", line 8, in <module>
    from allennlp.commands.configure import Configure
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/commands/configure.py", line 27, in <module>
    from allennlp.service.config_explorer import make_app
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/service/config_explorer.py", line 24, in <module>
    from allennlp.common.configuration import configure, choices
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/common/configuration.py", line 17, in <module>
    from allennlp.data.dataset_readers import DatasetReader
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/data/__init__.py", line 1, in <module>
    from allennlp.data.dataset_readers.dataset_reader import DatasetReader
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/data/dataset_readers/__init__.py", line 10, in <module>
    from allennlp.data.dataset_readers.ccgbank import CcgBankDatasetReader
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/data/dataset_readers/ccgbank.py", line 9, in <module>
    from allennlp.data.dataset_readers.dataset_reader import DatasetReader
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/data/dataset_readers/dataset_reader.py", line 8, in <module>
    from allennlp.data.instance import Instance
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/data/instance.py", line 3, in <module>
    from allennlp.data.fields.field import DataArray, Field
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/data/fields/__init__.py", line 10, in <module>
    from allennlp.data.fields.knowledge_graph_field import KnowledgeGraphField
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/data/fields/knowledge_graph_field.py", line 14, in <module>
    from allennlp.data.token_indexers.token_indexer import TokenIndexer, TokenType
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/data/token_indexers/__init__.py", line 5, in <module>
    from allennlp.data.token_indexers.dep_label_indexer import DepLabelIndexer
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/data/token_indexers/dep_label_indexer.py", line 8, in <module>
    from allennlp.data.tokenizers.token import Token
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/data/tokenizers/__init__.py", line 7, in <module>
    from allennlp.data.tokenizers.word_tokenizer import WordTokenizer
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/data/tokenizers/word_tokenizer.py", line 9, in <module>
    from allennlp.data.tokenizers.word_stemmer import WordStemmer, PassThroughWordStemmer
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/allennlp/data/tokenizers/word_stemmer.py", line 1, in <module>
    from nltk.stem import PorterStemmer as NltkPorterStemmer
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/nltk/__init__.py", line 145, in <module>
    from nltk.chunk import *
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/nltk/chunk/__init__.py", line 155, in <module>
    from nltk.chunk.api import ChunkParserI
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/nltk/chunk/api.py", line 13, in <module>
    from nltk.chunk.util import ChunkScore
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/nltk/chunk/util.py", line 12, in <module>
    from nltk.tag.mapping import map_tag
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/nltk/tag/__init__.py", line 70, in <module>
    from nltk.tag.sequential import (
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/nltk/tag/sequential.py", line 26, in <module>
    from nltk.classify import NaiveBayesClassifier
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/nltk/classify/__init__.py", line 97, in <module>
    from nltk.classify.scikitlearn import SklearnClassifier
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/nltk/classify/scikitlearn.py", line 38, in <module>
    from sklearn.feature_extraction import DictVectorizer
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/sklearn/__init__.py", line 64, in <module>
    from .base import clone
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/sklearn/base.py", line 14, in <module>
    from .utils.fixes import signature
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/sklearn/utils/__init__.py", line 14, in <module>
    from . import _joblib
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/sklearn/utils/_joblib.py", line 22, in <module>
    from ..externals import joblib
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/sklearn/externals/joblib/__init__.py", line 119, in <module>
    from .parallel import Parallel
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/sklearn/externals/joblib/parallel.py", line 28, in <module>
    from ._parallel_backends import (FallbackToBackend, MultiprocessingBackend,
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/sklearn/externals/joblib/_parallel_backends.py", line 22, in <module>
    from .executor import get_memmapping_executor
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/sklearn/externals/joblib/executor.py", line 14, in <module>
    from .externals.loky.reusable_executor import get_reusable_executor
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/sklearn/externals/joblib/externals/loky/__init__.py", line 12, in <module>
    from .backend.reduction import set_loky_pickler
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/sklearn/externals/joblib/externals/loky/backend/reduction.py", line 125, in <module>
    from sklearn.externals.joblib.externals import cloudpickle  # noqa: F401
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/sklearn/externals/joblib/externals/cloudpickle/__init__.py", line 3, in <module>
    from .cloudpickle import *
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/sklearn/externals/joblib/externals/cloudpickle/cloudpickle.py", line 167, in <module>
    _cell_set_template_code = _make_cell_set_template_code()
  File "/home/daniel/anaconda3/envs/recipe-parser/lib/python3.8/site-packages/sklearn/externals/joblib/externals/cloudpickle/cloudpickle.py", line 148, in _make_cell_set_template_code
    return types.CodeType(
TypeError: an integer is required (got type bytes)
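A likely root cause, sketched here as an editorial note rather than anything confirmed by the maintainers: Python 3.8 inserted a new second positional parameter, posonlyargcount, into the types.CodeType constructor. The cloudpickle copy vendored inside old scikit-learn releases still calls types.CodeType with the Python 3.7 argument order, so every later argument lands one slot early and the bytes-valued bytecode ends up where an integer is expected. The snippet below only probes which signature the running interpreter has; the helper name is made up for illustration.

```python
import sys

def codetype_has_posonlyargcount():
    """True if this interpreter's code objects carry the
    co_posonlyargcount field introduced in Python 3.8, which mirrors
    the extra types.CodeType constructor argument that breaks
    pre-3.8 cloudpickle."""
    return hasattr((lambda: None).__code__, "co_posonlyargcount")

if __name__ == "__main__":
    print(sys.version_info[:2], codetype_has_posonlyargcount())
```

On Python 3.7 this prints False and the old cloudpickle works; on 3.8 and later it prints True and the 3.7-style constructor call raises the TypeError seen in the traceback above.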

danielhers commented 2 years ago

The same thing happens when I try to run allennlp predict models/parser.tar.gz data/English/Parser/test.conllu --use-dataset-reader --output-file out.json with a trained model.

danielhers commented 2 years ago

I fixed the problem by creating a new Conda environment with Python 3.7 instead of 3.8. Apparently scikit-learn==0.20.3 does not support Python 3.8.
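A minimal sketch of that workaround; the environment name and the requirements file name are assumptions, so adjust them to the repository's actual setup instructions:

```shell
# Recreate the Conda environment with Python 3.7, which the pinned
# scikit-learn==0.20.3 still supports (env name assumed here).
conda create -n recipe-parser python=3.7
conda activate recipe-parser
# Reinstall the project's dependencies into the new environment
# (requirements file name is an assumption).
pip install -r requirements.txt
```

After that, rerunning allennlp train parser/parser_config.json -s models should get past the import-time TypeError.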