Losses decrease to 0 within 2000-3000 epochs, but the F, P, and R scores never rise above 0 (image 1).
I then tried a different dataset and got a result like the one in image 2. But when I trained on the old dataset again with the same config, the scores stayed at 0, and when I switched to yet another dataset they stayed at 0 as well. I tried different sizes for ngram_range_suggester, but I couldn't find what is wrong! (A quick check of the gold span lengths against the suggester sizes is sketched below.)
[image 1: training output where the losses reach 0 but the scores stay at 0]
[image 2: training output from the second dataset]
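One thing worth checking with ngram_range_suggester is whether the gold span lengths actually fall inside the suggested sizes: spans longer than the largest n-gram can never be proposed, so they contribute nothing to the scores. A minimal sketch of that check (the corpus path is a placeholder; the span key "sc" matches the preprocessing script below):

```python
import spacy
from spacy.tokens import DocBin

nlp = spacy.blank("en")
doc_bin = DocBin().from_disk("corpus/train.spacy")  # placeholder path

# Collect the token lengths of all gold spans in the "sc" group
lengths = [
    len(span)
    for doc in doc_bin.get_docs(nlp.vocab)
    for span in doc.spans.get("sc", [])
]
print("span lengths:", min(lengths), "to", max(lengths))
```

If the maximum printed here is larger than the biggest size the suggester proposes, those spans are unreachable for the model.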
How to reproduce the behaviour
Here are the preprocess script and config file that I used for training.
preprocess.py for creating .spacy files:
```python
from pathlib import Path

import spacy
import srsly
import typer
from spacy.tokens import DocBin


def main(
    input_path: Path = typer.Argument(..., exists=True, dir_okay=False),
    output_path: Path = typer.Argument(..., dir_okay=False),
):
    nlp = spacy.blank("en")
    doc_bin = DocBin()
    for eg in srsly.read_jsonl(input_path):
        doc = nlp.make_doc(eg["text"])
        doc.spans["sc"] = []
        for s in eg.get("spans", []):
            # char_span returns None if the offsets don't align with
            # token boundaries; such spans are silently skipped
            span = doc.char_span(s["start"], s["end"], label=s["label"])
            if span:
                doc.spans["sc"].append(span)
        # note: always true once the "sc" key is set above,
        # even when the span group itself is empty
        if doc.spans:
            doc_bin.add(doc)
    doc_bin.to_disk(output_path)
    print(f"Processed {len(doc_bin)} documents: {output_path.name}")


if __name__ == "__main__":
    typer.run(main)
```
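For reference, the script expects JSONL input with a "text" field and character-offset "spans", matching the keys read in the loop above. A hypothetical one-record input file could be produced like this (the text and labels are made up for illustration):

```python
import srsly

# One record per line; "start"/"end" are character offsets into "text"
srsly.write_jsonl(
    "train.jsonl",
    [
        {
            "text": "Acme Corp opened an office in Berlin.",
            "spans": [
                {"start": 0, "end": 9, "label": "ORG"},
                {"start": 30, "end": 36, "label": "GPE"},
            ],
        }
    ],
)
```

The conversion is then run once per split, e.g. `python preprocess.py train.jsonl train.spacy`, and the resulting .spacy paths go into the [paths] section of config.cfg.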
Here is my config.cfg:
Info about spaCy