allenai / scispacy

A full spaCy pipeline and models for scientific/biomedical documents.
https://allenai.github.io/scispacy/
Apache License 2.0
1.72k stars 229 forks

Degradation in en-core-sci-scibert model performance #465

Closed aryehgigi closed 1 year ago

aryehgigi commented 1 year ago

Hey @dakinggg, I am working on SPIKE in the AI2 Israel team, where we use scispacy's models. We have seen some degradation in the performance of the en-core-sci-scibert model from version 0.4.0 to 0.5.1 (esp. around PP attachment, conjunction attachment, and noun/adjective mix-ups).

  1. Do you know anything about this?
  2. Would it be possible to send us the data you trained on so we can try to retrain (at least so we have something until a fix is released)?
  3. Lmk if you want me to attach examples and some stats.

Btw, are you the maintainer of this repo? Lmk if you want to discuss it offline: aryeht@allenai.org

Thanks a lot

dakinggg commented 1 year ago

Responded via email