explosion / spaCy

💫 Industrial-strength Natural Language Processing (NLP) in Python
https://spacy.io
MIT License

Bulgarian language (`--lang bg`) causes trouble in `spacy init config` #7217

Closed mrshu closed 3 years ago

mrshu commented 3 years ago

How to reproduce the behaviour

When I run the command

python -m spacy init config base_config.cfg --lang bg --pipeline ner --optimize accuracy --gpu -F

I get the following

Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/dist-packages/spacy/__main__.py", line 4, in <module>
    setup_cli()
  File "/usr/local/lib/python3.7/dist-packages/spacy/cli/_util.py", line 68, in setup_cli
    command(prog_name=COMMAND)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/dist-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/typer/main.py", line 497, in wrapper
    return callback(**use_params)  # type: ignore
  File "/usr/local/lib/python3.7/dist-packages/spacy/cli/init_config.py", line 62, in init_config_cli
    silent=is_stdout,
  File "/usr/local/lib/python3.7/dist-packages/spacy/cli/init_config.py", line 165, in init_config
    base_template = template.render(variables).strip()
  File "/usr/local/lib/python3.7/dist-packages/jinja2/environment.py", line 1090, in render
    self.environment.handle_exception()
  File "/usr/local/lib/python3.7/dist-packages/jinja2/environment.py", line 832, in handle_exception
    reraise(*rewrite_traceback_stack(source=source))
  File "/usr/local/lib/python3.7/dist-packages/jinja2/_compat.py", line 28, in reraise
    raise value.with_traceback(tb)
  File "<template>", line 32, in top-level template code
  File "/usr/local/lib/python3.7/dist-packages/jinja2/environment.py", line 452, in getitem
    return obj[argument]
jinja2.exceptions.UndefinedError: 'None' has no attribute 'accuracy'

You can reproduce this behavior in the following Google Colab:

https://colab.research.google.com/drive/1ma9Bm0n1F_deS_suAy3t2lN466FKQmqo


adrianeboyd commented 3 years ago

Thanks for the report! This is due to a formatting error in the quickstart recommendations. This should be fixed in the next release, v3.0.4, and hopefully a bit sooner on the website quickstart.
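The final line of the traceback, `'None' has no attribute 'accuracy'`, is consistent with the `bg` entry in the recommendations data parsing to an empty value. The following is only a minimal sketch in plain Python of that failure mode and of a more explicit check. The `RECOMMENDATIONS` mapping, its layout, and the `transformer_name` helper are hypothetical stand-ins; the real data lives in a YAML file rendered through a Jinja2 template, and the exact formatting error is not shown in this thread.

```python
# Hypothetical stand-in for the parsed recommendations; not spaCy's real data.
RECOMMENDATIONS = {
    "en": {
        "transformer": {
            "efficiency": {"name": "roberta-base"},
            "accuracy": {"name": "roberta-base"},
        }
    },
    # Assumption: a formatting slip leaves this value empty after parsing.
    "bg": {"transformer": None},
}


def transformer_name(lang, optimize):
    """Return the recommended transformer name or fail with a clear message."""
    entry = RECOMMENDATIONS.get(lang) or {}
    transformer = entry.get("transformer")
    if transformer is None:
        # Without this check the lookup below would fail on None,
        # much like the template lookup in the traceback.
        raise ValueError(f"No transformer recommendation parsed for {lang!r}")
    return transformer[optimize]["name"]
```

A check like this turns the opaque template error into a message that names the affected language.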

If you want to fix your local installation, you can apply the edits from #7222 to spacy/cli/templates/quickstart_training_recommendations.yml in the directory where spaCy is installed.

Or, if it's easier, you can choose another language such as en to generate the config, then edit nlp.lang and set components.transformer.model.name to the transformer model you want. The default is iarfmoose/roberta-base-bulgarian; we've tried to test that the recommended models run, but we're not familiar with all the models/resources for all languages, so there may be a better model to choose here.
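The workaround can be sketched as follows: generate the config with the same command as in the report but with `--lang en`, then rewrite the two settings. This sketch uses only the standard library's `configparser` on a small inline fragment. The fragment contents, the `retarget_config` helper, and the section name `[components.transformer.model]` are assumptions about what a generated config looks like, not output copied from spaCy. Note that `configparser` drops comments when writing, so for a real file, editing by hand or using spaCy's own config utilities is likely the better choice.

```python
import configparser
import io

# Assumed fragment of a config generated with --lang en (illustrative only).
FRAGMENT = """
[nlp]
lang = "en"
pipeline = ["transformer","ner"]

[components.transformer.model]
name = "roberta-base"
"""


def retarget_config(text, lang, model_name):
    """Return the config text with the language and transformer model swapped."""
    # interpolation=None leaves any ${...} references in the values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case as written
    parser.read_string(text)
    # Values are stored with their quotes, matching the quoted style above.
    parser["nlp"]["lang"] = f'"{lang}"'
    parser["components.transformer.model"]["name"] = f'"{model_name}"'
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


bulgarian_config = retarget_config(
    FRAGMENT, "bg", "iarfmoose/roberta-base-bulgarian"
)
```

Here `bulgarian_config` holds the rewritten text, which could then be saved as base_config.cfg.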

mrshu commented 3 years ago

Thanks for the quick response @adrianeboyd!

I actually did manage to get the model running and trained using the exact approach you suggested, so I don't think there is any great urgency here.

I also see that the "(allegedly) human readable format" was at fault once again. If I end up writing an article about YAML (along the lines of https://noyaml.com), this will almost certainly make it in there.

From my point of view this issue is resolved, so please feel free to close it if that's helpful for you.

Thanks again!

github-actions[bot] commented 3 years ago

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.