GennVa opened 8 months ago

I'm using version 1.3.4 of `spacy-transformers`, but it is incompatible with the latest version of `transformers` (4.37.2). Is an update planned? Thanks
Is it just a version incompatibility because we've pinned `transformers` to `<4.37.0`, or are you able to actually update your `transformers` install locally, and does everything still work as expected?
Which version of spaCy are you on, if I may ask? Because from 3.7 onwards we've started switching towards https://github.com/explosion/spacy-curated-transformers instead - have you tried it?
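To double-check what's actually installed locally, something like this should print the resolved versions (just a sketch using importlib.metadata; adjust the package list as needed):

```python
# Sketch: print the versions pip actually resolved for the relevant packages.
from importlib.metadata import version

for pkg in ("spacy", "spacy-transformers", "transformers"):
    print(pkg, version(pkg))
```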
@svlandeg I'm using `spacy==3.7.3`. Can I uninstall `spacy-transformers` and replace it with `spacy-curated-transformers`? Using `spacy-transformers==1.3.4`, everything seems to work; I just get the version error.
Using `spacy-curated-transformers`, I have this error when running `spacy.load(path)`:

```
ValueError: [E002] Can't find factory for 'transformer' for language English (en). This usually happens when spaCy calls `nlp.create_pipe` with a custom component name that's not registered on the current language class. If you're using a custom component, make sure you've added the decorator `@Language.component` (for function components) or `@Language.factory` (for class components).

Available factories: attribute_ruler, tok2vec, merge_noun_chunks, merge_entities, merge_subtokens, token_splitter, doc_cleaner, parser, beam_parser, lemmatizer, trainable_lemmatizer, entity_linker, entity_ruler, tagger, morphologizer, ner, beam_ner, senter, sentencizer, spancat, spancat_singlelabel, span_finder, future_entity_ruler, span_ruler, textcat, textcat_multilabel, en.lemmatizer
```
We had to yank 3.7.3 (for unrelated reasons - a bug in the multiprocessing code), so please update to 3.7.4 if you can.
> Can I uninstall `spacy-transformers` and replace it with `spacy-curated-transformers`?
Yes, but you'll then need to use `curated_transformer` as the factory instead of just `transformer`. You can see an example config here:

```
[components.transformer]
factory = "curated_transformer"
```
> `spacy.load(path)`
Which model are you loading? If this is a pretrained model using the old `spacy-transformers` `transformer` factory, then you'll still need `spacy-transformers`. If it's a pretrained model from us, you can likely update, though.
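If you're not sure which one your saved pipeline expects, you can read its config.cfg directly without instantiating the pipeline (a sketch; the path is a placeholder for your pipeline directory):

```python
# Sketch: see which factory each component of a saved pipeline expects,
# without loading the pipeline (useful if the providing package is uninstalled).
from pathlib import Path
from spacy.util import load_config

pipeline_dir = Path("path/to/your/pipeline")  # placeholder: the directory you pass to spacy.load
config = load_config(pipeline_dir / "config.cfg")
for name, comp_cfg in config["components"].items():
    print(name, "->", comp_cfg.get("factory") or comp_cfg.get("source"))
```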
@svlandeg Thanks. I want to train a spancat (with transformers) pipeline. I installed `spacy-curated-transformers` and `spacy==3.7.4`, and I got this error:

```
catalogue.RegistryError: [E892] Unknown function registry: 'span_getters'.

Available names: architectures, augmenters, batchers, callbacks, cli, datasets, displacy_colors, factories, initializers, languages, layers, lemmatizers, loggers, lookups, losses, misc, model_loaders, models, ops, optimizers, readers, schedules, scorers, tokenizers, vectors
```

I used the auto-generated partial config ("This is an auto-generated partial config.") from the spaCy website, but it's for `spacy-transformers` only, so I tried to adapt it to `spacy-curated-transformers`. This is my current cfg file, used in `!python -m spacy init labels mycfg.cfg ...`:
```
[paths]
train = null
dev = null
vectors = null
init_tok2vec = null

[system]
gpu_allocator = "pytorch"
seed = 0

[nlp]
lang = "en"
pipeline = ["transformer","spancat"]
batch_size = 512
disabled = []
before_creation = null
after_creation = null
after_pipeline_creation = null
tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
vectors = {"@vectors":"spacy.Vectors.v1"}

[components]

[components.spancat]
factory = "spancat"
max_positive = null
scorer = {"@scorers":"spacy.spancat_scorer.v1"}
spans_key = "sc"
threshold = 0.5

[components.spancat.model]
@architectures = "spacy.SpanCategorizer.v1"

[components.spancat.model.reducer]
@layers = "spacy.mean_max_reducer.v1"
hidden_size = 128

[components.spancat.model.scorer]
@layers = "spacy.LinearLogistic.v1"
nO = null
nI = null

[components.spancat.model.tok2vec]
@architectures = "spacy-curated-transformers.TransformerListener.v1"
grad_factor = 1.0
pooling = {"@layers":"reduce_mean.v1"}
upstream = "*"

[components.spancat.suggester]
@misc = "spacy.ngram_suggester.v1"
sizes = [1,2,3]

[components.transformer]
factory = "curated_transformer"
max_batch_items = 4096
set_extra_annotations = {"@annotation_setters":"spacy-curated-transformers.null_annotation_setter.v1"}

[components.transformer.model]
@architectures = "spacy-curated-transformers.RobertaTransformer.v1"
name = "roberta-base"
mixed_precision = false

[components.transformer.model.get_spans]
@span_getters = "spacy-curated-transformers.strided_spans.v1"
window = 128
stride = 96

[components.transformer.model.grad_scaler_config]

[components.transformer.model.tokenizer_config]
use_fast = true

[components.transformer.model.transformer_config]

[corpora]
...other..
```
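To reproduce this outside the CLI, a short script like the following should hit the same E892 error (a sketch; it assumes the config above is saved as mycfg.cfg):

```python
# Sketch: resolving the config outside the CLI should raise the same
# catalogue.RegistryError (E892). As far as I can tell, @span_getters in
# [components.transformer.model.get_spans] refers to a registry that only
# spacy-transformers creates, so it doesn't exist with spacy-curated-transformers alone.
from spacy.util import load_config, load_model_from_config

config = load_config("mycfg.cfg")
nlp = load_model_from_config(config, auto_fill=True)
```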